FAQ
Hello all,

What can cause HDFS to become corrupt? I was running some jobs which were failing. When I checked the logs I saw that some files were corrupt, so I ran 'hadoop fsck /', which showed that a few files were corrupt:

/user/data/2009-07-01/165_2009-07-01.log: CORRUPT block blk_1697509332927954816
/user/data/2009-07-21/060_2009-07-21.log: CORRUPT block blk_8841160612810933777
/user/data/2009-07-26/173_2009-07-26.log: CORRUPT block blk_-6669973789246139664
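
For what it's worth, fsck can also report which datanodes hold each
block of a suspect file (the -files/-blocks/-locations options are
listed in the fsck usage message), e.g.:

  hadoop fsck /user/data/2009-07-01/165_2009-07-01.log -files -blocks -locations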

I had backups of these files, so I deleted the corrupt copies and reloaded them, and the file system is OK now. What I'm wondering is how these files became corrupt in the first place. There are 6 nodes in the cluster and I have a replication factor of 3.
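
Roughly what I did, per file (the backup path here is just
illustrative, not my real layout):

  hadoop fs -rm /user/data/2009-07-01/165_2009-07-01.log
  hadoop fs -put /backups/165_2009-07-01.log /user/data/2009-07-01/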

I had assumed that if a replica became corrupt it would be replaced by a non-corrupt copy. Is this not the case?

Would there have been any way to recover the files if I didn't have backups?
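
I did notice that fsck has -move and -delete options. I assume
something like the following would at least shift the damaged files
into /lost+found for inspection, though if every replica of a block
is bad the data itself is presumably gone:

  hadoop fsck / -move     # move corrupt files to /lost+found
  hadoop fsck / -delete   # or just delete them outright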

Another concern is that I only found out HDFS was corrupt by accident. I suppose I should have a script run every few minutes to parse the output of 'hadoop fsck /' and send an email if anything becomes corrupt (a sketch of what I have in mind follows below). How are people currently handling this?
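
A minimal cron-able sketch, assuming 'hadoop' and 'mail' are on PATH
and configured, with the admin address as a placeholder:

  #!/bin/sh
  # run from cron, e.g.: */10 * * * * /path/to/check_hdfs.sh
  ADMIN=admin@example.com
  REPORT=$(hadoop fsck / 2>/dev/null)
  # fsck prints CORRUPT both in per-file lines and in the summary status
  if echo "$REPORT" | grep -q CORRUPT; then
      echo "$REPORT" | mail -s "HDFS corruption detected" "$ADMIN"
  fi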

Thank you very much,
M
