DataNode fail-stops due to a bad disk (or storage directory)
------------------------------------------------------------

Key: HDFS-1223
URL: https://issues.apache.org/jira/browse/HDFS-1223
Project: Hadoop HDFS
Issue Type: Bug
Components: data-node
Affects Versions: 0.20.1
Reporter: Thanh Do


A datanode can store block files in multiple volumes.
If a datanode sees a bad volume during startup (i.e., faces an
exception when accessing that volume), it simply fail-stops, making
all block files stored in the other, healthy volumes inaccessible as
well. Consequently, these lost replicas must be regenerated later on
other datanodes.
If a datanode could instead mark the bad disk and continue working
with the healthy volumes, availability would increase and unnecessary
regeneration would be avoided. As an extreme example, consider a
datanode with two volumes, V1 and V2, each containing about 10,000
64 MB block files. If the datanode gets an exception when accessing
V1 during startup, it fail-stops, and all 20,000 block files must be
regenerated elsewhere. If it instead marked V1 as bad and continued
working with V2, the number of replicas needing regeneration would be
cut in half.

This bug was found by our Failure Testing Service framework:
http://www.eecs.berkeley.edu/Pubs/TechRpts/2010/EECS-2010-98.html
For questions, please email us: Thanh Do (thanhdo@cs.wisc.edu) and
Haryadi Gunawi (haryadi@eecs.berkeley.edu)

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
