Datanode should verify block sizes vs metadata on startup
---------------------------------------------------------

Key: HADOOP-4994
URL: https://issues.apache.org/jira/browse/HADOOP-4994
Project: Hadoop Core
Issue Type: Bug
Reporter: Brian Bockelman


I could have sworn this bug had already been reported by someone else, but I can't find it on JIRA after searching. Apologies if this is a duplicate.

On startup, the datanode should verify that each block's size as reported by `stat` matches the block size as reported by the block's metadata.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


  • dhruba borthakur (JIRA) at Jan 9, 2009 at 5:37 am
    [ https://issues.apache.org/jira/browse/HADOOP-4994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    dhruba borthakur updated HADOOP-4994:
    -------------------------------------

    Component/s: dfs
  • dhruba borthakur (JIRA) at Jan 9, 2009 at 5:53 am
    [ https://issues.apache.org/jira/browse/HADOOP-4994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12662251#action_12662251 ]

    dhruba borthakur commented on HADOOP-4994:
    ------------------------------------------

    Let me make sure I understand this one. Suppose the datanode is storing its blocks on a Linux ext3 filesystem. Are you saying that a stat on the ext3 block file should return the same file size as the one reported by InterDatanodeProtocol.getBlockMetaDataInfo().getNumBytes()?


  • Brian Bockelman (JIRA) at Jan 9, 2009 at 4:06 pm
    [ https://issues.apache.org/jira/browse/HADOOP-4994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12662409#action_12662409 ]

    Brian Bockelman commented on HADOOP-4994:
    -----------------------------------------

    Hey Dhruba,

    That is correct. (I should mention that, since this is a Java project rather than a Unix project, `stat` here is equivalent to File.length().)

    This is the use case:
    1) Node loses power.
    2) On reboot, Linux triggers an automatic fsck of the filesystem holding Hadoop's storage.
    3) To clean up some discovered corruption, fsck truncates one of Hadoop's blocks.
    4) Hadoop starts up, reads in the metadata, and assumes the block is OK.

    I would like to alter step (4) to be:
    4) Hadoop starts up and reads in the metadata.
    5) Hadoop checks that the block length recorded in the metadata file is the same as the block length reported by the ext3 filesystem.

    My apologies if this is already done and I am just not understanding things correctly.
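
The proposed step (5) could be sketched roughly as follows. This is only an illustration, not Hadoop code: the class `BlockLengthCheck`, the method `findMismatches`, and the `expectedLengths` map are hypothetical names, and the map stands in for whatever record of block lengths the datanode actually has available. The only real API relied on is `java.io.File.length()`.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical startup check; an illustration only, not part of Hadoop. */
class BlockLengthCheck {

    /**
     * Returns the block files whose on-disk length differs from the expected one.
     * File.length() is the Java counterpart of stat's size field. The
     * expectedLengths map is an assumption made for this sketch.
     */
    static List<File> findMismatches(Map<File, Long> expectedLengths) {
        List<File> mismatched = new ArrayList<>();
        for (Map.Entry<File, Long> entry : expectedLengths.entrySet()) {
            File blockFile = entry.getKey();
            // File.length() returns 0 for a missing file, so a missing block
            // with a nonzero expected length is also reported.
            if (blockFile.length() != entry.getValue()) {
                mismatched.add(blockFile);
            }
        }
        return mismatched;
    }

    public static void main(String[] args) throws IOException {
        File intact = File.createTempFile("blk_intact", ".tmp");
        File truncated = File.createTempFile("blk_truncated", ".tmp");
        intact.deleteOnExit();
        truncated.deleteOnExit();
        Files.write(intact.toPath(), new byte[1024]);
        Files.write(truncated.toPath(), new byte[512]); // simulates a block truncated by fsck

        Map<File, Long> expected = new LinkedHashMap<>();
        expected.put(intact, 1024L);
        expected.put(truncated, 1024L);

        for (File f : findMismatches(expected)) {
            System.out.println("Length mismatch: " + f.getName());
        }
    }
}
```

In this sketch only the truncated file is reported. What a real datanode should do with a mismatched block (for example, treat it as corrupt) is left open here.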
  • Raghu Angadi (JIRA) at Jan 9, 2009 at 9:46 pm
    [ https://issues.apache.org/jira/browse/HADOOP-4994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12662528#action_12662528 ]

    Raghu Angadi commented on HADOOP-4994:
    --------------------------------------


    Currently the block metadata does not store the size of the block, and I don't think it should. But the DN can still detect the discrepancy, because the lengths of the metadata file and the block file won't tally (the metadata file length should be: header + ((block size + 511)/512)*4).
    This is the use case: [...]
    In this case, the NN should have detected that the block is smaller than expected. I think it does.
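
The length relation quoted above, header + ((block size + 511)/512)*4, can be worked through in a small sketch. The class and method names are hypothetical, the 512-byte chunk size and 4-byte checksum are taken from the formula, and the header size is passed in as a parameter; the value 7 used in `main` is an assumption for illustration only.

```java
/** Hypothetical helper illustrating the length relation; not Hadoop code. */
class MetaFileLength {

    /**
     * Expected metadata file length for a block of blockLength bytes:
     * headerSize + ceil(blockLength / bytesPerChecksum) * checksumSize.
     */
    static long expectedMetaLength(long blockLength, long headerSize,
                                   long bytesPerChecksum, long checksumSize) {
        // Integer division after adding (bytesPerChecksum - 1) rounds up.
        long chunks = (blockLength + bytesPerChecksum - 1) / bytesPerChecksum;
        return headerSize + chunks * checksumSize;
    }

    /** True if the two file lengths are consistent for 512-byte chunks and 4-byte checksums. */
    static boolean lengthsTally(long blockLength, long metaLength, long headerSize) {
        return metaLength == expectedMetaLength(blockLength, headerSize, 512, 4);
    }

    public static void main(String[] args) {
        long header = 7; // assumed header size, for illustration only
        System.out.println(expectedMetaLength(1000, header, 512, 4)); // 2 chunks: 7 + 2*4
        System.out.println(lengthsTally(1000, 15, header)); // intact block
        System.out.println(lengthsTally(400, 15, header));  // block truncated to 400 bytes
    }
}
```

Note that this comparison is only accurate to checksum-chunk granularity: a truncation from 1000 to 600 bytes still spans two chunks, so the lengths would tally even though data was lost.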

Discussion Overview
group: common-dev @
category: hadoop
posted: Jan 8, '09 at 6:47p
active: Jan 9, '09 at 9:46p
posts: 5
users: 1
website: hadoop.apache.org...
irc: #hadoop
