All replicas of a block end up on only 1 rack
---------------------------------------------

Key: HADOOP-4477
URL: https://issues.apache.org/jira/browse/HADOOP-4477
Project: Hadoop Core
Issue Type: Bug
Components: dfs
Reporter: Hairong Kuang
Assignee: Hairong Kuang
Priority: Critical
Fix For: 0.20.0


HDFS's replica placement strategy guarantees that the replicas of a block exist on at least two racks when its replication factor is greater than one. But fsck still reports that all replicas of some blocks end up on a single rack.

The cause of the problem is that decommission and corruption handling check only the block's replication factor, not the rack requirement. When an over-replicated block loses a replica due to decommissioning, corruption, or a lost heartbeat, the namenode does not take any action to guarantee that the remaining replicas are on different racks.
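
For context, fsck can print the rack of each replica, which is how such single-rack blocks show up (the -racks flag is a real fsck option; the path below is a placeholder):

    # Print files, their blocks, and the rack of each replica.
    $ bin/hadoop fsck /user/example/file -files -blocks -racks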


--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

  • Hairong Kuang (JIRA) at Oct 23, 2008 at 9:01 pm
    [ https://issues.apache.org/jira/browse/HADOOP-4477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12642279#action_12642279 ]

    Hairong Kuang commented on HADOOP-4477:
    ---------------------------------------

    My proposal is to include both under-replicated blocks and blocks that do not satisfy the rack requirement in the neededReplication queue. The neededReplication queue supports four priorities:
    Priority 0: Blocks that have only one replica;
    Priority 1: Blocks whose replicas are all on one rack;
    Priority 2: Blocks whose number of replicas is no greater than 1/3 of their replication factor;
    Priority 3: All other under-replicated blocks.

    In general we would also need a priority 4 for blocks that do not belong to priorities 0-3 and yet fail the HDFS rack requirement. Since HDFS currently provides only a two-rack guarantee, priority 1 covers all cases where the rack requirement is broken.
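
    As a minimal sketch of this priority scheme (class and method names here are hypothetical, not the actual FSNamesystem code; callers would enqueue a block only when it is under-replicated or on a single rack):

        // Hypothetical sketch of the proposed neededReplication priorities.
        class ReplicationPriority {
          /** Returns the queue priority for a block given its current state. */
          static int getPriority(int curReplicas, int numRacks, int expectedReplicas) {
            if (curReplicas == 1) {
              return 0; // only one replica left: most urgent
            } else if (numRacks == 1 && expectedReplicas > 1) {
              return 1; // rack requirement broken: all replicas on one rack
            } else if (curReplicas * 3 <= expectedReplicas) {
              return 2; // at most 1/3 of the target replication factor
            } else {
              return 3; // all other under-replicated blocks
            }
          }
        }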

    In the FSNamesystem methods addStoredBlock, removeStoredBlock, startDecommission, and markBlockAsCorrupt, put both under-replicated blocks and single-rack blocks into the neededReplication queue. The replicator will additionally create one more replica for blocks that are on only one rack but are not under-replicated.
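
    For illustration, a hypothetical helper showing the check those methods would share (the names below are illustrative; the real change would live inside FSNamesystem):

        // Hypothetical helper: a block needs work if it is under-replicated
        // or its replicas are confined to a single rack.
        class NeededReplicationCheck {
          interface BlockState {
            int liveReplicas();      // replicas currently live
            int racks();             // distinct racks holding replicas
            int expectedReplicas();  // configured replication factor
          }
          static boolean needsReplication(BlockState b) {
            return b.liveReplicas() < b.expectedReplicas()
                || (b.expectedReplicas() > 1 && b.racks() < 2);
          }
        }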
