Datanode 'alive' but with its disk failed, Namenode thinks it's alive

Key: HDFS-1234
URL: https://issues.apache.org/jira/browse/HDFS-1234
Project: Hadoop HDFS
Issue Type: Bug
Components: name-node
Affects Versions: 0.20.1
Reporter: Thanh Do

- Summary: Datanode 'alive' but with its disk failed, Namenode still thinks it's alive

- Setups:
+ Replication = 1
+ # available datanodes = 2
+ # disks / datanode = 1
+ # failures = 1
+ Failure type = bad disk
+ When/where failure happens = first phase of the pipeline

- Details:
In this experiment we have two datanodes, each with one disk.
If one datanode's disk fails (but the node itself stays alive), the datanode
does not keep track of the failure. From the perspective of the namenode,
that datanode is still alive, so the namenode hands the same datanode
back to the client. The client retries 3 times, each time asking the namenode
for a new set of datanodes, and gets the same datanode every time.
Every write attempt to that node then fails with an exception.
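The retry loop described above can be sketched as a minimal simulation (not actual HDFS code; the class and method names here are hypothetical stand-ins for the client/namenode interaction):

```java
import java.util.Collections;
import java.util.List;

// Minimal simulation of the reported failure mode: the namenode has no
// knowledge of the disk failure, so every retry returns the same datanode.
public class RetrySketch {
    static final int MAX_RETRIES = 3; // the client retries 3 times, per the report

    // Hypothetical stand-in for asking the namenode for a new set of
    // datanodes: it always hands back the node with the bad disk,
    // because from its perspective that node is still alive.
    static List<String> getNewDatanodes() {
        return Collections.singletonList("datanode-1");
    }

    // Hypothetical stand-in for a block write; it fails every time
    // because the target node's only disk is bad.
    static boolean writeBlock(String datanode) {
        return false;
    }

    public static void main(String[] args) {
        int attempts = 0;
        boolean succeeded = false;
        while (attempts < MAX_RETRIES && !succeeded) {
            String dn = getNewDatanodes().get(0); // same node every time
            attempts++;
            succeeded = writeBlock(dn);
        }
        if (!succeeded) {
            System.out.println("write failed after " + attempts + " attempts");
        }
    }
}
```

With replication = 1 and only one healthy datanode left, the write can never make progress until the namenode learns the disk is bad.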

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.

Discussion Overview
group: hdfs-dev
posted: Jun 17, '10 at 6:37a
active: Jun 17, '10 at 5:38p

1 user in discussion

Todd Lipcon (JIRA): 2 posts
