Hi guys.
I have more of a general question than a specific one. I am going to lay out the steps I have taken. Please comment on what I could do better.

I was trying to add 5 nodes to my existing 10-node cluster and also increase the replication factor from 2 to 3.
I thought I wouldn't have to run the balancer, because HDFS would most likely place the new replicas on the new nodes.

There are about 500k blocks.
I wanted to get it all stabilized (replication and balancing) within 24 hours. It has now been more than 24 hours and fsck reports 30% under-replication. Is there a way to force HDFS to balance/replicate more aggressively?
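
To put rough numbers on this, here is a back-of-the-envelope sketch in Python. It is only an estimate under assumptions that are mine and not measured: every one of the ~500k blocks started with exactly 2 replicas, each under-replicated block is missing exactly one replica, the elapsed time is taken as 24 hours (a lower bound, since it has been longer), and the re-replication rate stays constant. The function name `estimate_remaining_hours` is just for illustration.

```python
def estimate_remaining_hours(total_blocks, old_rf, new_rf, under_fraction, elapsed_hours):
    """Estimate re-replication progress, assuming a constant rate.

    Assumption: every block needs (new_rf - old_rf) extra replicas, and each
    block still reported as under-replicated is missing all of them.
    """
    missing_per_block = new_rf - old_rf
    replicas_needed = total_blocks * missing_per_block
    replicas_remaining = total_blocks * under_fraction * missing_per_block
    replicas_done = replicas_needed - replicas_remaining
    rate_per_hour = replicas_done / elapsed_hours
    hours_left = replicas_remaining / rate_per_hour
    return replicas_done, replicas_remaining, rate_per_hour, hours_left


if __name__ == "__main__":
    # Figures from this post: ~500k blocks, factor 2 -> 3, 30% still under-replicated.
    done, remaining, rate, hours_left = estimate_remaining_hours(
        total_blocks=500_000,
        old_rf=2,
        new_rf=3,
        under_fraction=0.30,
        elapsed_hours=24.0,  # assumed lower bound; the real figure is "more than 24"
    )
    print(f"replicas created so far: {done:.0f}")
    print(f"replicas still missing:  {remaining:.0f}")
    print(f"observed rate:           {rate:.0f} replicas/hour")
    print(f"estimated time left:     {hours_left:.1f} hours")
```

If those assumptions hold, roughly 350k replicas have been created and about 150k remain, so at the same pace the cluster would need something like another 10 hours. Because the real elapsed time is longer than 24 hours, the actual rate is somewhat lower and the remaining time somewhat higher.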

It would be great if someone could explain what happens to blocks, and when, in the context of:

1) Rebalancing

2) -setrep

3) Restarting the cluster with a higher/lower replication factor.

A few questions and a few issues here.

1) When you restart the cluster with a higher replication value than before, does it also apply to existing blocks, or only to new blocks being created?

2) Does the balancer take under-replication of blocks into account, or does it blindly start moving existing blocks to reach the threshold?

One very specific problem: -setrep hangs on one particular block for hours. Is this because the block is corrupt? fsck reports it as healthy, though.

group: hdfs-user @
posted: Jul 7, '10 at 11:16p
active: Jul 9, '10 at 2:44p