Hi guys,
I have more than one specific question, so I am going to lay out the steps I have taken. Please comment on what I could do better.

I was trying to add 5 nodes to my existing 10-node cluster and also increase the replication factor from 2 to 3.
I thought I would not have to run the balancer, because raising the replication factor would most likely put the new replicas on the new nodes.

There are about 500k blocks.
I wanted to get it all stabilized (replication and balancing) within 24 hours. It has now been more than 24 hours and fsck still reports 30% under-replication. Is there a way to force HDFS to balance/replicate more aggressively?

It would be great if someone explained what happens to blocks, and when, in the context of:

1) Rebalancing

2) -setrep

3) Restarting cluster with a higher/lower replication factor.

A few questions and a few issues here.

1) When you restart the cluster with a higher replication value than before, does it also apply to existing blocks or only to new blocks being created?

2) Does the balancer take under-replication of blocks into account, or does it blindly start moving existing blocks to reach the threshold?


One very specific problem: -setrep hangs on one particular block for hours. Is this because the block is corrupt? fsck says it is healthy.


Thanks
Arun

  • Alex Loddengaard at Jul 8, 2010 at 6:40 pm
    Hi Arun,

    Consider setting dfs.balance.bandwidthPerSec to something as high
    as 20971520 (bytes per second, i.e. 20 MB/s) for the balancer and the
    setrep. You can do this by supplying -D at the command line.
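
As a rough sketch of what "supplying -D at the command line" could look like, the Python snippet below only assembles argument lists and does not run anything. The `hadoop balancer` and `hadoop fs -setrep` commands and the property name come from this thread; the helper names, the `/` path, and the target replication of 3 are illustrative assumptions. Whether a client-side -D actually changes the throttle on the datanodes may depend on the Hadoop version, so treat this as a starting point rather than a recipe.

```python
# Hypothetical helpers: they only assemble argument lists, nothing is executed.
BANDWIDTH_PROPERTY = "dfs.balance.bandwidthPerSec"  # property named in the reply above
BANDWIDTH_BYTES_PER_SEC = 20 * 1024 * 1024          # 20971520 bytes/s, i.e. 20 MiB/s


def generic_option(name, value):
    """Return the generic '-D name=value' option as two argv entries."""
    return ["-D", f"{name}={value}"]


def balancer_command(bandwidth=BANDWIDTH_BYTES_PER_SEC):
    """Argument list for running the balancer with the bandwidth override."""
    return ["hadoop", "balancer"] + generic_option(BANDWIDTH_PROPERTY, bandwidth)


def setrep_command(replication, path, bandwidth=BANDWIDTH_BYTES_PER_SEC):
    """Argument list for a recursive setrep; generic options precede '-setrep'."""
    return (["hadoop", "fs"]
            + generic_option(BANDWIDTH_PROPERTY, bandwidth)
            + ["-setrep", "-R", str(replication), path])


if __name__ == "__main__":
    # Print the command lines so they can be reviewed before running them by hand.
    print(" ".join(balancer_command()))
    print(" ".join(setrep_command(3, "/")))
```

Printing the commands instead of executing them keeps the sketch safe to try on a machine without Hadoop installed.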

    Your strategy for getting data onto the 5 nodes is correct: balance and
    setrep. Just understand these things take time.
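
To put "these things take time" into rough numbers, here is a back-of-the-envelope sketch in Python. The 500k blocks, the 30% figure, and the 20971520 bytes/s rate come from this thread; the 64 MiB block size, the idea that every block is full, and the number of parallel streams are assumptions made only for illustration. It also assumes copying really proceeds at the configured rate, which the thread does not confirm for setrep-driven replication.

```python
# Back-of-the-envelope estimate; every input marked ASSUMED is not from the thread.
TOTAL_BLOCKS = 500_000                   # from the original question
UNDER_REPLICATED_FRACTION = 0.30         # from the fsck figure in the question
BLOCK_SIZE_BYTES = 64 * 1024 * 1024      # ASSUMED: every block is a full 64 MiB block
RATE_BYTES_PER_SEC = 20 * 1024 * 1024    # 20971520, the value suggested above
PARALLEL_STREAMS = 15                    # ASSUMED: one full-rate stream per node (10 + 5)


def hours_to_copy(blocks, block_size, rate, streams):
    """Hours needed to copy `blocks` blocks at `rate` bytes/s over `streams` parallel streams."""
    total_bytes = blocks * block_size
    seconds = total_bytes / (rate * streams)
    return seconds / 3600


if __name__ == "__main__":
    pending = round(TOTAL_BLOCKS * UNDER_REPLICATED_FRACTION)
    single = hours_to_copy(pending, BLOCK_SIZE_BYTES, RATE_BYTES_PER_SEC, 1)
    parallel = hours_to_copy(pending, BLOCK_SIZE_BYTES, RATE_BYTES_PER_SEC, PARALLEL_STREAMS)
    print(f"blocks still to copy: {pending}")
    print(f"single stream: {single:.1f} h, {PARALLEL_STREAMS} streams: {parallel:.1f} h")
```

Since real blocks are often smaller than the configured block size, the result is closer to an upper bound than a prediction.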

    Hope this helps.

    Alex
  • Arun Ramakrishnan at Jul 9, 2010 at 12:04 am
    Thanks Alex.

    From: Alex Loddengaard
    Sent: Thursday, July 08, 2010 11:39 AM
    To: hdfs-user@hadoop.apache.org
    Subject: Re: rebalancing replciation help

  • Varene Olivier at Jul 9, 2010 at 2:44 pm
    To answer your questions (adding to what Alex already told you):

    > 1) When you restart the cluster with a higher than previous
    > replication value. Does it also apply to existing blocks or only to new
    > blocks being created ?

    It applies only to NEWLY created blocks.

    > 2) Does the balancer take into account under replication of blocks
    > or does it blindly start moving existing blocks to reach threshold ?

    I have no idea whether the replication is performed before moving or
    not ... it should be the case, to be safe.
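
The first answer can be pictured with a toy model in Python. This is not HDFS code: the class and method names are hypothetical, and the only behaviour it encodes is that the replication factor is recorded per file when the file is created, so changing the default and restarting does not touch existing files, while an explicit setrep does.

```python
# Toy in-memory model, not HDFS code: names and structure are hypothetical.
class ToyNamespace:
    def __init__(self, default_replication):
        self.default_replication = default_replication
        self.file_replication = {}  # path -> replication factor recorded for that file

    def create(self, path):
        """A new file records whatever the default is at creation time."""
        self.file_replication[path] = self.default_replication

    def restart_with_default(self, new_default):
        """Changing the default leaves already recorded files untouched."""
        self.default_replication = new_default

    def setrep(self, path, replication):
        """An explicit setrep is what changes an existing file's target."""
        self.file_replication[path] = replication


if __name__ == "__main__":
    ns = ToyNamespace(default_replication=2)
    ns.create("/old-file")
    ns.restart_with_default(3)
    ns.create("/new-file")
    print(ns.file_replication)   # the old file keeps 2, the new file gets 3
    ns.setrep("/old-file", 3)
    print(ns.file_replication)   # both files now target 3
```

In this picture, raising the replication of existing data is the job of -setrep, which is why the original plan combined it with the balancer.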

    Cheers



Discussion Overview
group: hdfs-user @
category: hadoop
posted: Jul 7, '10 at 11:16p
active: Jul 9, '10 at 2:44p
posts: 4
users: 3
website: hadoop.apache.org...
irc: #hadoop
