Add BlockTool to query file and its block info
----------------------------------------------

Key: HADOOP-4945
URL: https://issues.apache.org/jira/browse/HADOOP-4945
Project: Hadoop Core
Issue Type: New Feature
Affects Versions: 0.21.0
Reporter: zhangwei


fsck can report a file's block details, but it is of little help when you want to find out which file or datanode a given block belongs to.
BlockTool is useful during development, for example when you run into a message like this:

2008-12-25 12:12:10,049 WARN dfs.DataNode (DataNode.java:readBlock(901)) - Got exception while serving blk_28622148 to /10.73.4.101:
java.io.IOException: Block blk_28622148 is not valid.
at org.apache.hadoop.dfs.FSDataset.getBlockFile(FSDataset.java:541)
at org.apache.hadoop.dfs.DataNode$BlockSender.<init>(DataNode.java:1090)
at org.apache.hadoop.dfs.DataNode$DataXceiver.readBlock(DataNode.java:882)
at org.apache.hadoop.dfs.DataNode$DataXceiver.run(DataNode.java:840)
at java.lang.Thread.run(Thread.java:595)


BlockTool can help you locate the block: it reports the name of the file the block belongs to and which datanodes hold it.
It can also report the block details of a file or directory.




--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


  • zhangwei (JIRA) at Dec 25, 2008 at 4:33 am
    [ https://issues.apache.org/jira/browse/HADOOP-4945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    zhangwei updated HADOOP-4945:
    -----------------------------

    Attachment: HADOOP-4945.patch

    I added the tool under the package "org.apache.hadoop.tools". It can be used by typing:
    $HADOOP_HOME/bin/hadoop blocktool -f <file or directory>
    to fetch the block information of the file or directory. If the target is a directory, it is queried recursively to fetch the info of the files under it.

    You can also get a block's info by typing:
    $HADOOP_HOME/bin/hadoop blocktool -b <block>

    ++++++++++++
    The output looks like this:

    mode1: (get the file's block info)
    $ ./hadoop blocktool -f /test/wordcount2/input/english_story.txt
    /test/wordcount2/input/english_story.txt:
    rw-r--r-- rep=3 root root 5230(blksize:268435456) 2008-12-24 14:15:24 english_story.txt
    block detail: (#BlockName #OffSet #Len #Locations)
    blk_155_1158 0 5230 jx-hadoop-data04.jx.baidu.com
    --------

    The target can also be a directory:
    $ ./hadoop blocktool -f /test/
    /test/wordcount/input/english_story.txt:
    rw-r--r-- rep=3 root root 5230(blksize:268435456) 2008-12-24 14:03:39 english_story.txt
    block detail: (#BlockName #OffSet #Len #Locations)
    blk_151_1153 0 5230 jx-hadoop-data04.jx.baidu.com
    --------

    /test/wordcount2/input/english_story.txt:
    rw-r--r-- rep=3 root root 5230(blksize:268435456) 2008-12-24 14:15:24 english_story.txt
    block detail: (#BlockName #OffSet #Len #Locations)
    blk_155_1158 0 5230 jx-hadoop-data04.jx.baidu.com
    --------



    mode2: (get the block's info)
    $ ./hadoop blocktool -b blk_155_1158
    "//test/wordcount2/input/english_story.txt":root:root:rw-r--r--
    Loaction of block:
    jx-hadoop-data04.jx.baidu.com
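A "block detail" line in the sample output above follows the column order #BlockName #OffSet #Len #Locations. As a minimal sketch, assuming the fields are separated by whitespace and that any additional locations would appear as further whitespace-separated fields (the sample only shows a single location, so this is a guess), such a line could be parsed like this. The names `BlockDetail` and `parse_block_detail` are hypothetical and are not part of the patch.

```python
from typing import List, NamedTuple


class BlockDetail(NamedTuple):
    """One row of the 'block detail' listing (hypothetical helper type)."""
    name: str             # e.g. "blk_155_1158"
    offset: int           # byte offset of the block within the file
    length: int           # block length in bytes
    locations: List[str]  # datanode host names holding the block


def parse_block_detail(line: str) -> BlockDetail:
    """Parse a '#BlockName #OffSet #Len #Locations' line.

    Assumption: fields are whitespace-separated, and everything after
    the third field is a location.
    """
    fields = line.split()
    if len(fields) < 3:
        raise ValueError("expected at least 3 fields, got: %r" % line)
    return BlockDetail(fields[0], int(fields[1]), int(fields[2]), fields[3:])


detail = parse_block_detail("blk_155_1158 0 5230 jx-hadoop-data04.jx.baidu.com")
print(detail.name, detail.offset, detail.length, detail.locations)
```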

  • zhangwei (JIRA) at Dec 25, 2008 at 5:43 am
    [ https://issues.apache.org/jira/browse/HADOOP-4945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    zhangwei updated HADOOP-4945:
    -----------------------------

    Fix Version/s: site
    Affects Version/s: (was: 0.21.0)
    site
    Release Note: Patch is based on Hadoop Core trunk, revision 728879
    Status: Patch Available (was: Open)
  • zhangwei (JIRA) at Dec 30, 2008 at 2:44 am
    [ https://issues.apache.org/jira/browse/HADOOP-4945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12659734#action_12659734 ]

    zhangwei commented on HADOOP-4945:
    ----------------------------------

    Can anyone review it?
  • dhruba borthakur (JIRA) at Dec 30, 2008 at 11:05 pm
    [ https://issues.apache.org/jira/browse/HADOOP-4945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12659977#action_12659977 ]

    dhruba borthakur commented on HADOOP-4945:
    ------------------------------------------
    From my understanding, the functionality of "mode1" as described above is the same functionality as bin/hadoop fsck -files -blocks -locations <hdfs pathname>.
    The functionality described in "mode2" is similar to running "bin/hadoop fsck -files -blocks -locations / | grep <blockid>"

    Do you agree?
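The "fsck ... | grep <blockid>" approach mentioned above can be sketched as a small filter that also remembers which file the matching block line belongs to. This is only an illustration: the sample text is hypothetical and merely assumes that each file's path line (starting with "/") is followed by its block lines, and that paths contain no spaces. Real fsck output may differ between versions. The function name `find_block_owner` and the addresses in the sample are made up.

```python
import re

# Hypothetical listing in the assumed shape: a path line followed by
# that file's block lines. Not captured from a real cluster.
SAMPLE = """\
/test/wordcount/input/english_story.txt 5230 bytes, 1 block(s):  OK
0. blk_151_1153 len=5230 repl=3 [10.0.0.1:50010, 10.0.0.2:50010]

/test/wordcount2/input/english_story.txt 5230 bytes, 1 block(s):  OK
0. blk_155_1158 len=5230 repl=3 [10.0.0.1:50010, 10.0.0.3:50010]
"""


def find_block_owner(lines, block_id):
    """Return (path, block_line) for the first line mentioning block_id.

    Scans the whole listing, so the cost grows with the number of files
    and blocks in the namespace being checked.
    """
    pattern = re.compile(r"\b" + re.escape(block_id) + r"\b")
    current_path = None
    for raw in lines:
        line = raw.strip()
        if line.startswith("/"):
            # Assumption: the path is the first space-delimited token.
            current_path = line.split(" ", 1)[0]
        elif pattern.search(line):
            return current_path, line
    return None


print(find_block_owner(SAMPLE.splitlines(), "blk_155_1158"))
```

Because the function has to walk every line of the listing, it shows why this approach scales with the size of the namespace rather than with the single block being looked up.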


  • zhangwei (JIRA) at Dec 31, 2008 at 2:19 am
    [ https://issues.apache.org/jira/browse/HADOOP-4945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12660005#action_12660005 ]

    zhangwei commented on HADOOP-4945:
    ----------------------------------

    Hi dhruba borthakur,
    Indeed, mode 1 is equivalent, and fsck can do the things above. But consider the case where you need to find the file a block belongs to in a production cluster with a huge data set:
    running "bin/hadoop fsck -files -blocks -locations / | grep <blockid>" is not an efficient way to do it.
    Or maybe fsck can be improved a little instead; I will consider it.

    Thanks for your comment.
  • dhruba borthakur (JIRA) at Dec 31, 2008 at 5:46 pm
    [ https://issues.apache.org/jira/browse/HADOOP-4945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12660106#action_12660106 ]

    dhruba borthakur commented on HADOOP-4945:
    ------------------------------------------

    Hello zhangwei, thanks for looking into whether the "mode2" functionality can be incorporated into fsck itself (without making the fsck code much more complicated).
  • zhangwei (JIRA) at Jan 13, 2009 at 9:15 am
    [ https://issues.apache.org/jira/browse/HADOOP-4945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    zhangwei updated HADOOP-4945:
    -----------------------------

    Status: Open (was: Patch Available)

    I have decided to create a new issue that adds this ability to the fsck facility, so I am abandoning this patch.
  • dhruba borthakur (JIRA) at Jan 13, 2009 at 6:46 pm
    [ https://issues.apache.org/jira/browse/HADOOP-4945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    dhruba borthakur resolved HADOOP-4945.
    --------------------------------------

    Resolution: Won't Fix

    I am closing this issue because HADOOP-5019 can address this issue. Please re-open if you feel otherwise. Thanks.

Discussion Overview
group: common-dev @
categories: hadoop
posted: Dec 25, '08 at 4:21a
active: Jan 13, '09 at 6:46p
posts: 9
users: 1
website: hadoop.apache.org...
irc: #hadoop