Hello all,

What can cause HDFS to become corrupt? I was running some jobs which were
failing. When I checked the logs I saw that some files were corrupt, so I ran
'hadoop fsck /', which showed that a few files were corrupt:

/user/data/2009-07-01/165_2009-07-01.log: CORRUPT block
blk_1697509332927954816
/user/data/2009-07-21/060_2009-07-21.log: CORRUPT block
blk_8841160612810933777
/user/data/2009-07-26/173_2009-07-26.log: CORRUPT block
blk_-6669973789246139664

I had backups of these files, so I deleted them and reloaded them; the file
system is OK now. What I'm wondering is how these files became corrupt in the
first place. There are 6 nodes in the cluster and I have a replication factor
of 3.

I had assumed that if a replica became corrupt it would be replaced by a
non-corrupt copy.
Is this not the case?

Would there have been some way to recover the files if I didn't have any
backups?

Another concern is that I only found out HDFS was corrupt by accident. I
suppose I should have a script run every few minutes to parse the results of
'hadoop fsck /' and email if anything becomes corrupt, something along the
lines of the sketch below. How are people currently handling this?
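
For example (a rough sketch only; the recipient address and the check against
fsck's "Status: HEALTHY" summary line are assumptions about my setup, not
anything official):

    #!/bin/sh
    # Minimal cron-able sketch: run fsck and mail the full report if the
    # summary no longer reports a healthy filesystem.
    # The recipient address below is a placeholder; replace with your own.
    REPORT=$(hadoop fsck / 2>/dev/null)
    echo "$REPORT" | grep -q "Status: HEALTHY" || \
        echo "$REPORT" | mail -s "HDFS fsck reports corruption" admin@example.com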

thank you very much
M


  • Raghu Angadi at Aug 10, 2009 at 10:38 pm

    > I had assumed that if a replica became corrupt it would be replaced by
    > a non-corrupt copy.
    > Is this not the case?

    Yes, it is. Usually a random block gets corrupted for various reasons and
    is replaced by another replica of the block.

    A block might stay in a corrupt state if there are no good replicas left
    or new replicas could not be created. The actual reason might be hardware
    related (say, a lot of nodes dying) or a real software bug.

    If you use Hadoop 0.20 or later, you will notice a warning in red on the
    NameNode front page if some blocks are left with no good replicas, so you
    don't need to run fsck (which can be costly) each time.
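
    (A rough way to poll for that warning from a script; the default web UI
    port of 50070, the dfshealth.jsp page name, and the warning wording are
    all assumptions about your setup and version:)

        # Assumed NameNode host/port and page name; adjust for your cluster.
        curl -s http://namenode:50070/dfshealth.jsp | grep -Ei 'missing|corrupt' \
            && echo "NameNode is reporting blocks with no good replicas"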

    If you are interested, you could try to trace one of these block ids in
    the NameNode log to see what happened to it. We are always eager to hear
    about irrecoverable errors. Please mention the Hadoop version you are
    using.

    If the data is corrupt (rather than truncated or missing), you can still
    fetch it by using the "-ignoreCrc" option with 'fs -get'.
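
    (For example; the local destination path is just an illustration:)

        hadoop fs -get -ignoreCrc /user/data/2009-07-01/165_2009-07-01.log /tmp/165_2009-07-01.log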

    Raghu.

  • Mayuran Yogarajah at Aug 10, 2009 at 11:15 pm
    Hello,
    > If you are interested, you could try to trace one of these block ids in
    > the NameNode log to see what happened to it. We are always eager to hear
    > about irrecoverable errors. Please mention the Hadoop version you are
    > using.

    I'm using Hadoop 0.18.3. I just checked the NameNode log for one of the
    bad blocks. I see entries from Saturday saying:

        ask 1.1.1.6:50010 to replicate blk_1697509332927954816_8724 to
        datanode(s) < all other data nodes >

    I only loaded this data on Saturday, and the .6 datanode became full at
    some point. When data is first loaded into the cluster, does the NameNode
    send the data to as many nodes as it can to satisfy the replication
    factor, or does it send it to one node and ask that node to send it to
    the others?

    If it's the latter, then it's possible that the block became corrupt when
    I first loaded it onto .6 (since it was full), and since .6 was the node
    designated to send the block to the other nodes, none of them would have
    a non-corrupt copy.

    Raghu, please let me know what you think.

    thanks,

    M
  • Harish Mallipeddi at Aug 11, 2009 at 11:34 am

    On Tue, Aug 11, 2009 at 4:45 AM, Mayuran Yogarajah wrote:

    > When data is first loaded into the cluster, does the NameNode send the
    > data to as many nodes as it can to satisfy the replication factor, or
    > does it send it to one node and ask that node to send it to the others?

    A DN is instructed by the NN (this actually happens when the DN sends a
    heartbeat to the NN) to replicate its block to the first replication
    target DN. The NN chooses replication targets based on its placement
    policy (the 2nd replica is placed on a node in a different rack; the 3rd
    replica is placed on a DN in the same rack as the second).
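
    (If you want to see where the replicas of a particular file actually
    ended up, an fsck query like the one below will show the block locations;
    the path is just one of the files mentioned earlier in the thread:)

        hadoop fsck /user/data/2009-07-01/165_2009-07-01.log -files -blocks -locations -racks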

    --
    Harish Mallipeddi
    http://blog.poundbang.in
  • Raghu Angadi at Aug 11, 2009 at 4:44 pm
    Note that there are multiple log files (one for each day). Make sure you
    searched all the relevant days. You can also check the DataNode log for
    this block.
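
    (For example, something along these lines; the log directory and file
    name patterns are assumptions and will differ between installs:)

        grep "blk_1697509332927954816" /var/log/hadoop/hadoop-*-namenode-*.log* \
                                       /var/log/hadoop/hadoop-*-datanode-*.log*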

    HDFS writes to all three datanodes at the time you write the data. It is
    possible that the other two datanodes also encountered errors.

    That would have resulted in an error when you tried to copy, though, and
    such a corrupt block should not even appear in HDFS. Did you restart the
    cluster after copying? 0.18.3 has various fixes related to handling block
    replication correctly.

    Please include the complete log lines (at the end of your response); it
    makes them simpler to interpret. Alternatively, you could file a JIRA and
    attach the log files there.

    Raghu.

