Note that there are multiple log files (one for each day). Make sure you
searched all the relevant days. You can also check the datanode logs for this block.
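As an illustration, searching every day's rolled log file for a block id can be scripted; this is a sketch in plain Python (the function name and the log paths in the comment are assumptions, not part of any Hadoop tool):

```python
import glob

def find_block_mentions(log_glob, block_id):
    """Scan every rolled daily log file matching log_glob and collect
    the (path, line) pairs that mention the given block id."""
    hits = []
    for path in sorted(glob.glob(log_glob)):
        with open(path) as f:
            for line in f:
                if block_id in line:
                    hits.append((path, line.rstrip("\n")))
    return hits

# Hypothetical usage -- log names and locations vary by installation:
# find_block_mentions("/var/log/hadoop/hadoop-*-namenode-*.log*",
#                     "blk_1697509332927954816")
```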

HDFS writes to all three datanodes at the time you write the data, so it is
possible that the other two datanodes also encountered errors.

This would result in an error when you tried to copy, and such a corrupt
block should not even appear in HDFS. Did you restart the cluster after
copying? 0.18.3 has various fixes related to handling block replication.

Please include the complete log lines (at the end of your response); it
makes them simpler to interpret. Alternatively, you could file a JIRA and
attach the log files there.


Mayuran Yogarajah wrote:
If you are interested, you could try to trace one of these block ids in
the NameNode log to see what happened to it. We are always eager to hear about
irrecoverable errors. Please mention the Hadoop version you are using.
I'm using Hadoop 0.18.3. I just checked the namenode log for one of the bad
blocks. I see entries from Saturday saying:
ask to replicate blk_1697509332927954816_8724 to
datanode(s) < all other data nodes >

I only loaded this data Saturday, and the .6 data node became full at
some point.
When data is first loaded into the cluster, does the name node send the
data to as many nodes as it can to satisfy the replication factor, or does
it send it to one node and ask that node to send it to the others?

If it's the latter, then it's possible that the block became corrupt when I
first loaded it to .6 (since it was full), and since .6 was designated to
send the block to the other nodes, none of the nodes would have a
non-corrupt copy.

Raghu, please let me know what you think.
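As an aside, the failure mode Mayuran describes can be sketched with a toy model: HDFS does pipeline replication (the client writes to the first datanode, which forwards to the next, and so on), so a truncated or corrupted write at the first hop would propagate downstream. This is plain illustrative Python, not Hadoop code, and in real HDFS checksums are verified along the pipeline, which is why such corruption should normally surface as an error at write time:

```python
class DataNode:
    """Toy stand-in for a datanode; `corrupts` models a full disk
    that truncates the block it stores."""
    def __init__(self, name, corrupts=False):
        self.name = name
        self.corrupts = corrupts

    def store(self, data):
        # A "full" node keeps only part of the block.
        return data[: len(data) // 2] if self.corrupts else data

def pipeline_write(block, datanodes):
    """Each datanode stores what it received and forwards its stored
    copy downstream, so corruption at the first hop reaches every replica."""
    replicas = {}
    data = block
    for dn in datanodes:
        data = dn.store(data)
        replicas[dn.name] = data
    return replicas
```

Running this with a corrupting first node shows all three replicas ending up with the same truncated bytes, matching the hypothesis that no node would hold a good copy.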



Discussion Overview
group: common-user @
posted: Aug 10, '09 at 10:08p
active: Aug 11, '09 at 4:44p


