FAQ
Hi

Since upgrading to 0.18.0 I've noticed that restarting the datanode corrupts
HDFS, so that the only option is to delete it and start again. I'm running
Hadoop in distributed mode on a single host. It runs as the user hadoop, and
the HDFS data is stored in the directory /home/hadoop/dfs.

When I restart Hadoop using start-all.sh, the datanode fails with the
following message:

STARTUP_MSG: args = []
STARTUP_MSG: version = 0.18.0
STARTUP_MSG: build = http://svn.apache.org/repos/asf/hadoop/core/branches/branch-0.18 -r 686010; compiled by 'hadoopqa' on Thu Aug 14 19:48:33 UTC 2008
************************************************************/
2008-09-01 12:06:55,871 ERROR org.apache.hadoop.dfs.DataNode:
java.io.IOException: Found /home/hadoop/dfs/tmp/hadoop
in /home/hadoop/dfs/tmp but it is not a file.
at org.apache.hadoop.dfs.FSDataset$FSVolume.recoverDetachedBlocks(FSDataset.java:437)
at org.apache.hadoop.dfs.FSDataset$FSVolume.<init>(FSDataset.java:310)
at org.apache.hadoop.dfs.FSDataset.<init>(FSDataset.java:671)
at org.apache.hadoop.dfs.DataNode.startDataNode(DataNode.java:277)
at org.apache.hadoop.dfs.DataNode.<init>(DataNode.java:190)
at org.apache.hadoop.dfs.DataNode.makeInstance(DataNode.java:2987)
at org.apache.hadoop.dfs.DataNode.instantiateDataNode(DataNode.java:2942)
at org.apache.hadoop.dfs.DataNode.createDataNode(DataNode.java:2950)
at org.apache.hadoop.dfs.DataNode.main(DataNode.java:3072)

2008-09-01 12:06:55,872 INFO org.apache.hadoop.dfs.DataNode: SHUTDOWN_MSG:

Running fsck on the HDFS shows that it is corrupt, and the only way to fix
it seems to be to delete it and reformat.
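
For reference, the consistency check here is HDFS's built-in fsck tool;
assuming a standard install it is run from the Hadoop directory as:

# Report on the health of the whole namespace (read-only check).
bin/hadoop fsck /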

Any suggestions?
regards
Barry

--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.

  • Konstantin Shvachko at Sep 4, 2008 at 12:24 am
    I can see three possible reasons for this:
    1. dfs.data.dir is pointing to the wrong data-node storage directory; or
    2. somebody manually moved the directory "hadoop" into /home/hadoop/dfs/tmp/,
    which is supposed to contain only block files named blk_<number>; or
    3. there is a collision of configuration variables, so that the same directory
    /home/hadoop/dfs/ is used by different servers (e.g. the data-node and the
    task tracker) on your single-node cluster; a quick check is sketched after
    this list.
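
    A minimal way to test for reason 3, assuming the standard 0.18 layout where
    site settings live in conf/hadoop-site.xml (dfs.data.dir, dfs.name.dir,
    hadoop.tmp.dir and mapred.local.dir are the properties most likely to
    collide), is a grep over the config:

    # Show each storage-related property and the <value> line that follows it;
    # every daemon should end up with its own distinct directory.
    grep -A 1 -E 'dfs\.data\.dir|dfs\.name\.dir|hadoop\.tmp\.dir|mapred\.local\.dir' conf/hadoop-site.xml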

    To save the HDFS data you can manually remove "hadoop" from /home/hadoop/dfs/tmp/
    and then restart the data-node.
    Alternatively, you can manually remove "tmp" from /home/hadoop/dfs/.
    In the latter case you risk losing some of the most recent blocks, but not the
    whole file system.
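
    As a concrete sketch of the first option (script names and paths assume a
    standard 0.18 single-node install; adjust to your layout):

    # Stop the data-node so nothing is writing during the cleanup.
    bin/hadoop-daemon.sh stop datanode
    # Remove the stray directory that the data-node refuses to treat as a block file.
    rm -r /home/hadoop/dfs/tmp/hadoop
    # Restart the data-node and verify the namespace afterwards.
    bin/hadoop-daemon.sh start datanode
    bin/hadoop fsck /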

    --Konstantin

  • Barry Haddow at Sep 4, 2008 at 3:49 pm
    Hi Konstantin

    Thanks for your suggestions - I think option 3 was the culprit. I deleted the
    old DFS, fixed the configuration, and it's fine now.

    cheers
    Barry
