Could only be replicated to 0 nodes, instead of 1 when 2 of 3 DataNodes are full
--------------------------------------------------------------------------------

Key: HADOOP-5886
URL: https://issues.apache.org/jira/browse/HADOOP-5886
Project: Hadoop Core
Issue Type: Bug
Components: dfs
Affects Versions: 0.18.3
Environment: * 3 machines, 2 of them with only 80GB of space, and 1 with 1.5GB
* Two clients are copying files all the time (one of them is the 1.5GB machine)
* The replication is set on 2

Reporter: Stas Oskin


I let the 2 smaller machines run out of space, to test the behavior.

Now, one of the clients (the one located on the 1.5GB machine) works fine, but the other, external one is unable to copy and displays the error and exception below:

10:51:03 WARN dfs.DFSClient: NotReplicatedYetException sleeping /test/test.bin retries left 1

09/05/21 10:51:06 WARN dfs.DFSClient: DataStreamer Exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /test/test.bin could only be replicated to 0 nodes, instead of 1
at org.apache.hadoop.dfs.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1123)
at org.apache.hadoop.dfs.NameNode.addBlock(NameNode.java:330)
at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:481)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:890)

at org.apache.hadoop.ipc.Client.call(Client.java:716)
at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
at org.apache.hadoop.dfs.$Proxy0.addBlock(Unknown Source)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
at org.apache.hadoop.dfs.$Proxy0.addBlock(Unknown Source)
at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:2450)
at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2333)
at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1800(DFSClient.java:1745)
at org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1922)

09/05/21 10:51:06 WARN dfs.DFSClient: Error Recovery for block null bad datanode[0]
java.io.IOException: Could not get block locations. Aborting...
at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2153)
at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1400(DFSClient.java:1745)
at org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1899)

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Search Discussions

  • Stas Oskin (JIRA) at May 21, 2009 at 8:41 pm
    [ https://issues.apache.org/jira/browse/HADOOP-5886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Stas Oskin updated HADOOP-5886:
    -------------------------------

    Summary: Error when 2 of 3 DataNodes are full: "Could only be replicated to 0 nodes, instead of 1" (was: Could only be replicated to 0 nodes, instead of 1 when 2 of 3 DataNodes are full)
  • Raghu Angadi (JIRA) at May 21, 2009 at 8:54 pm
    [ https://issues.apache.org/jira/browse/HADOOP-5886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12711796#action_12711796 ]

    Raghu Angadi commented on HADOOP-5886:
    --------------------------------------

From the core-user thread at http://www.nabble.com/Could-only-be-replicated-to-0-nodes%2C-instead-of-1-td23650042.html :

Most likely this is what is happening:

* Two out of the 3 DataNodes cannot take any more blocks.
* While picking nodes for a new block, the NameNode mostly skips the third DataNode as well, since its '# active writes' is larger than '2 * avg'.
* Even if only one other block is being written on the 3rd, that is still greater than (2 * 1/3).

To test this: if you write just one block to an idle cluster, it should succeed.
[...]

This particular problem is not that severe on a large cluster, but HDFS should do the sensible thing.

Raghu.
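The heuristic Raghu describes can be illustrated with a minimal sketch (not the actual FSNamesystem code, and the function name `eligible` is made up for illustration): a DataNode fails the load check when its count of active writes exceeds twice the cluster average, so when two of three nodes are full and all writes land on the third, the third gets excluded as well.

```python
def eligible(active_writes, node):
    """Return True if `node` passes the NameNode's load check.

    active_writes: dict mapping node name -> current number of
    active block writes on that node.
    A node is skipped when its load exceeds 2x the cluster average.
    """
    avg = sum(active_writes.values()) / len(active_writes)
    return active_writes[node] <= 2 * avg

# dn1 and dn2 are full, so the one in-flight write sits on dn3.
load = {"dn1": 0, "dn2": 0, "dn3": 1}
print(eligible(load, "dn3"))  # False: 1 > 2 * (1/3), dn3 is skipped too

# On an idle cluster the same node passes the check, which matches
# Raghu's suggestion that a single write to an idle cluster succeeds.
idle = {"dn1": 0, "dn2": 0, "dn3": 0}
print(eligible(idle, "dn3"))  # True
```

This shows why the failure is specific to tiny clusters: with many DataNodes, one busy node rarely exceeds twice the average load.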

  • Stas Oskin (JIRA) at Jun 1, 2009 at 11:50 am
    [ https://issues.apache.org/jira/browse/HADOOP-5886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12715067#action_12715067 ]

    Stas Oskin commented on HADOOP-5886:
    ------------------------------------

    Any idea if the fixes for this will go into the latest trunk?

    Will it be back-portable to 0.18.3?

    Regards.

Discussion Overview
group: common-dev @ hadoop.apache.org
categories: hadoop
posted: May 21, '09 at 8:39p
active: Jun 1, '09 at 11:50a
posts: 4
users: 1
website: hadoop.apache.org...
irc: #hadoop

1 user in discussion

Stas Oskin (JIRA): 4 posts
