Hello:
I got this error when putting files into HDFS. It seems to be an old issue, and I
followed the solution from this link:
----------------------------------------------------------------------------------------------------------------------------
http://adityadesai.wordpress.com/2009/02/26/another-problem-with-hadoop-jobjar-could-only-be-replicated-to-0-nodes-instead-of-1io-exception/
-----------------------------------------------------------------------------------------------------------------------------
but the problem still exists, so I tried to figure it out from the source code:
-----------------------------------------------------------------------------------------------------------------------------------
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock()
-----------------------------------------------------------------------------------------------------------------------------------
// choose targets for the new block to be allocated.
DatanodeDescriptor targets[] = replicator.chooseTarget(replication,
                                                       clientNode,
                                                       null,
                                                       blockSize);
if (targets.length < this.minReplication) {
  throw new IOException("File " + src + " could only be replicated to " +
                        targets.length + " nodes, instead of " +
                        minReplication);
}
--------------------------------------------------------------------------------------------------------------------------------------
I think "DatanodeDescriptor" represents datanode,so here "targets.length"
means the number of datanode,clearly,it is 0,in other words,no datanode is
available.But in the web interface:localhost:50070,I can see 4 live nodes(I
have 4 nodes only),and "hadoop dfsadmin -report" shows 4 nodes also.that is
strange.
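For reference, the same information should also be visible from the client side.
Below is a minimal sketch (the class name and setup are my own, just for
illustration) that asks the namenode for its datanode report through
DistributedFileSystem.getDataNodeStats(), which is essentially the same report
that "hadoop dfsadmin -report" prints:
-----------------------------------------------------------------------------------------
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.hdfs.DistributedFileSystem;
import org.apache.hadoop.hdfs.protocol.DatanodeInfo;

public class DatanodeReportCheck {
  public static void main(String[] args) throws Exception {
    // Picks up core-site.xml / hdfs-site.xml from the classpath.
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    // With an hdfs:// default filesystem this is a DistributedFileSystem,
    // which can return the namenode's datanode report.
    DistributedFileSystem dfs = (DistributedFileSystem) fs;
    DatanodeInfo[] nodes = dfs.getDataNodeStats();
    System.out.println("Datanodes known to the namenode: " + nodes.length);
    for (DatanodeInfo node : nodes) {
      System.out.println(node.getName()
          + "  capacity=" + node.getCapacity()
          + "  remaining=" + node.getRemaining());
    }
    fs.close();
  }
}
-----------------------------------------------------------------------------------------
If this also prints 4 nodes, the namenode clearly knows about the datanodes, and
the problem would have to be in target selection or in the client-to-datanode path.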
I also got this error message in the secondary namenode log:
---------------------------------------------------------------------------------------------------------------------------------
2010-05-26 16:26:39,588 INFO org.apache.hadoop.hdfs.server.common.Storage: Recovering storage directory /home/alex/tmp/dfs/namesecondary from failed checkpoint.
2010-05-26 16:26:39,593 ERROR org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: Exception in doCheckpoint:
2010-05-26 16:26:39,594 ERROR org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: java.net.ConnectException: Connection refused
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333)
at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:193)
at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:182)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
..................................
---------------------------------------------------------------------------------------------------------------------------------
and this error message in the datanode log:
---------------------------------------------------------------------------------------------------------------------------------
2010-05-26 16:07:49,039 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(192.168.1.3:50010, storageID=DS-1180479012-192.168.1.3-50010-1274799233678, infoPort=50075, ipcPort=50020):DataXceiver
java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcher.read0(Native Method)
at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:21)
at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:233)
at sun.nio.ch.IOUtil.read(IOUtil.java:206)
.........................
---------------------------------------------------------------------------------------------------------------------------------
It looks as if the network ports are not open, but after scanning with nmap I
can confirm that all the relevant ports on those nodes are open. After two days
of effort I have gotten nowhere.
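To be precise, by "open" I mean a plain TCP connect succeeds, which is what
nmap's connect scan checks. A rough sketch of such a test in Java, with host and
port passed as arguments (since fs.default.name gives no explicit port, the
namenode RPC port should be the default 8020, if I understand correctly):
-----------------------------------------------------------------------------------------
import java.net.InetSocketAddress;
import java.net.Socket;

public class PortCheck {
  public static void main(String[] args) throws Exception {
    String host = args[0];                 // e.g. the namenode host
    int port = Integer.parseInt(args[1]);  // e.g. the RPC or HTTP port
    Socket s = new Socket();
    try {
      // Attempt a plain TCP connect with a 3-second timeout.
      s.connect(new InetSocketAddress(host, port), 3000);
      System.out.println(host + ":" + port + " is reachable");
    } catch (Exception e) {
      System.out.println(host + ":" + port + " is NOT reachable: " + e);
    } finally {
      s.close();
    }
  }
}
-----------------------------------------------------------------------------------------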
Can anybody help me troubleshoot? Thank you.
(The following is the relevant info: my cluster configuration, the contents of
the conf files, the output of "hadoop dfsadmin -report", and the Java error stack.)
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
My configuration is:
-----------------------------------------------------------------------------------------
Ubuntu 10.04 64-bit + JDK 1.6.0_20 + Hadoop 0.20.2
-----------------------------------------------------------------------------------------
core-site.xml
-----------------------------------------------------------------------------------------
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://AlexLuya</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/alex/tmp</value>
  </property>
</configuration>
-----------------------------------------------------------------------------------------
hdfs-site.xml
-----------------------------------------------------------------------------------------
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <property>
    <name>dfs.name.dir</name>
    <value>/home/alex/hadoop/namenode</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/home/alex/hadoop/dfs</value>
  </property>
  <property>
    <name>dfs.block.size</name>
    <value>134217728</value>
  </property>
  <property>
    <name>dfs.datanode.max.xcievers</name>
    <value>2047</value>
  </property>
</configuration>
-----------------------------------------------------------------------------------------
masters
-----------------------------------------------------------------------------------------
192.168.1.2
-----------------------------------------------------------------------------------------
slaves
-----------------------------------------------------------------------------------------
192.168.1.3
192.168.1.4
192.168.1.5
192.168.1.6
-----------------------------------------------------------------------------------------
Result of "hadoop dfsadmin -report":
-----------------------------------------------------------------------------------------
Configured Capacity: 6836518912 (6.37 GB)
Present Capacity: 1406951424 (1.31 GB)
DFS Remaining: 1406853120 (1.31 GB)
DFS Used: 98304 (96 KB)
DFS Used%: 0.01%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
-------------------------------------------------
Datanodes available: 4 (4 total, 0 dead)
Name: 192.168.1.5:50010
Decommission Status : Normal
Configured Capacity: 1709129728 (1.59 GB)
DFS Used: 24576 (24 KB)
Non DFS Used: 1345765376 (1.25 GB)
DFS Remaining: 363339776(346.51 MB)
DFS Used%: 0%
DFS Remaining%: 21.26%
Last contact: Tue May 25 20:51:09 CST 2010
Name: 192.168.1.3:50010
Decommission Status : Normal
Configured Capacity: 1709129728 (1.59 GB)
DFS Used: 24576 (24 KB)
Non DFS Used: 1373503488 (1.28 GB)
DFS Remaining: 335601664(320.05 MB)
DFS Used%: 0%
DFS Remaining%: 19.64%
Last contact: Tue May 25 20:51:10 CST 2010
Name: 192.168.1.6:50010
Decommission Status : Normal
Configured Capacity: 1709129728 (1.59 GB)
DFS Used: 24576 (24 KB)
Non DFS Used: 1346879488 (1.25 GB)
DFS Remaining: 362225664(345.45 MB)
DFS Used%: 0%
DFS Remaining%: 21.19%
Last contact: Tue May 25 20:51:08 CST 2010
Name: 192.168.1.4:50010
Decommission Status : Normal
Configured Capacity: 1709129728 (1.59 GB)
DFS Used: 24576 (24 KB)
Non DFS Used: 1363419136 (1.27 GB)
DFS Remaining: 345686016(329.67 MB)
DFS Used%: 0%
DFS Remaining%: 20.23%
Last contact: Tue May 25 20:51:08 CST 2010
-----------------------------------------------------------------------------------------
Java error stack:
-----------------------------------------------------------------------------------------
10/05/25 20:43:24 WARN hdfs.DFSClient: DataStreamer Exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /user/alex/input could only be replicated to 0 nodes, instead of 1
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1271)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:422)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)
        at org.apache.hadoop.ipc.Client.call(Client.java:740)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
        at $Proxy0.addBlock(Unknown Source)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
        at $Proxy0.addBlock(Unknown Source)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:2937)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2819)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2102)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2288)
10/05/25 20:43:24 WARN hdfs.DFSClient: Error Recovery for block null bad datanode[0] nodes == null
10/05/25 20:43:24 WARN hdfs.DFSClient: Could not get block locations. Source file "/user/alex/input" - Aborting...
put: java.io.IOException: File /user/alex/input could only be replicated to 0 nodes, instead of 1
10/05/25 20:43:24 ERROR hdfs.DFSClient: Exception closing file /user/alex/input : org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /user/alex/input could only be replicated to 0 nodes, instead of 1
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1271)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:422)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)
org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /user/alex/input could only be replicated to 0 nodes, instead of 1
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1271)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:422)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)
        at org.apache.hadoop.ipc.Client.call(Client.java:740)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
        at $Proxy0.addBlock(Unknown Source)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
        at $Proxy0.addBlock(Unknown Source)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:2937)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2819)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2102)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2288)
-----------------------------------------------------------------------------------------