Problem: Some blocks remain under replicated
Hello !

I am trying to increase the replication factor of a directory in our Hadoop DFS from 1 to 2.
I observe that some of the blocks (12 out of 400) always remain under-replicated, producing the following message when I run 'fsck':

Under replicated blk_9084408236031628003. Target Replicas is 2 but found 1 replica(s).

I thought it could be a problem with a specific data node in the cluster; however, the under-replicated blocks belong to different data nodes.

Please give me your thoughts.

Thanks.
Ilay

  • Hairong Kuang at Mar 30, 2009 at 10:02 pm
    Which version of HADOOP are you running? Your cluster might have hit
    HADOOP-5465.

    Hairong



  • Ilayaraja at Mar 31, 2009 at 2:33 pm
    We are using hadoop-0.15.
    Let me explain the scenario:
    We have around 6 TB of data in our cluster spread over a couple of data
    directories (/mnt, /mnt2) with a replication factor of 1. When we increased
    the replication to 2 for the entire data set, we observed that /mnt was 100%
    used while /mnt2 was under-utilized. So we wanted to balance the space
    utilization across both data directories by changing the Hadoop code for the
    getNextVolume(..) API. The new algorithm checks which data directory has
    more space available and returns that as the volume for the block to be
    written. This updated version of Hadoop was then used to set the
    replication to 2 for the entire DFS. However, when the replication
    finished, it reported that more than 100 GB of data blocks were missing and
    some blocks were under-replicated. We also observed many blocks of size
    zero in the cluster; we do not know how these blocks were created.
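
    [Editor's note: the "pick the data directory with the most free space"
    policy described above can be sketched roughly as below. This is an
    illustrative model only; the Volume class, field names, and numbers are
    hypothetical stand-ins, not Hadoop 0.15's actual FSDataset code.]

    ```python
    # Sketch of a "most available space" volume chooser, approximating the
    # modified getNextVolume(..) described in the post above. All names here
    # are illustrative, not the real Hadoop internals.

    class Volume:
        def __init__(self, path, capacity, used):
            self.path = path
            self.capacity = capacity
            self.used = used

        def available(self):
            return self.capacity - self.used

    def get_next_volume(volumes, block_size):
        """Return the volume with the most free space that can hold the block."""
        candidates = [v for v in volumes if v.available() >= block_size]
        if not candidates:
            raise IOError("No volume has enough space for a block of size %d"
                          % block_size)
        return max(candidates, key=lambda v: v.available())

    volumes = [
        Volume("/mnt", capacity=1000, used=990),   # nearly full
        Volume("/mnt2", capacity=1000, used=100),  # mostly empty
    ]
    print(get_next_volume(volumes, block_size=64).path)  # -> /mnt2
    ```

    One caveat with a greedy most-free-space policy: concurrent block writers
    will all select the same volume until its free-space figure is refreshed,
    so selection and the actual write can race; a bug in such a patch is one
    plausible place to look for the missing and zero-length blocks reported.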


    ----- Original Message -----
    From: "Hairong Kuang" <hairong@yahoo-inc.com>
    To: "hadoop-dev" <core-dev@hadoop.apache.org>; "ilayaraja"
    <ilayaraja@rediff.co.in>; "hadoop-user" <core-user@hadoop.apache.org>
    Sent: Tuesday, March 31, 2009 3:30 AM
    Subject: Re: Problem: Some blocks remain under replicated



Discussion Overview

group: common-dev @ hadoop.apache.org...
category: hadoop
posted: Mar 30, 2009 at 7:53 PM
active: Mar 31, 2009 at 2:33 PM
posts: 3
irc: #hadoop
users in discussion: Ilayaraja (2 posts), Hairong Kuang (1 post)
