at Aug 7, 2009 at 5:59 pm
Make sure you rebalance soon after adding the new nodes. Otherwise, you will
have an age bias in file distribution: the new nodes will hold only files
written after they joined the cluster. This can, in some applications, lead
to some strange effects. For example, if you have log files that you delete
when they get too old, disk space will be freed non-uniformly. This
shouldn't affect performance much, but it can lead to a need to rebalance
again (and again) later. Normal file churn combined with occasional
rebalancing should eventually fix this, but it is nicer to avoid the
problem in the first place.
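
For what it's worth, the stock tool for this is the HDFS balancer (for
example "hadoop balancer -threshold 10"). Roughly speaking, it treats the
cluster as balanced when every data node's utilization is within the
threshold, in percentage points, of the cluster-wide utilization. Below is
a minimal Python sketch of that check, not the balancer's actual code. The
capacities loosely follow the thread; the node names and the "used" figures
are invented for illustration.

```python
def utilization(used_gb, capacity_gb):
    """Percentage of a node's (or the cluster's) capacity in use."""
    return 100.0 * used_gb / capacity_gb


def unbalanced_nodes(nodes, threshold_pct=10.0):
    """Return the names of nodes whose utilization differs from the
    cluster-wide utilization by more than threshold_pct points.

    nodes maps a node name to a (used_gb, capacity_gb) tuple.
    """
    total_used = sum(used for used, _ in nodes.values())
    total_cap = sum(cap for _, cap in nodes.values())
    cluster_pct = utilization(total_used, total_cap)
    return sorted(
        name
        for name, (used, cap) in nodes.items()
        if abs(utilization(used, cap) - cluster_pct) > threshold_pct
    )


# Hypothetical snapshot right after adding two empty nodes.
# Names and "used" values are assumptions, not data from the thread.
nodes = {
    "old-small-1": (200, 220),
    "old-small-2": (200, 220),
    "old-big-1": (250, 450),
    "new-1": (0, 450),
    "new-2": (0, 450),
}

if __name__ == "__main__":
    for name in unbalanced_nodes(nodes):
        used, cap = nodes[name]
        print(f"{name}: {utilization(used, cap):.1f}% used")
```

In a snapshot like this, the nearly full old nodes and the empty new nodes
both fall outside the threshold, which is the situation the balancer is
meant to correct by moving blocks between them.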
On Fri, Aug 7, 2009 at 10:48 AM, Ravi Phulari wrote:
On 8/7/09 10:38 AM, "prashant ullegaddi" wrote:
We had a cluster of 9 machines with one name node, and 8 data nodes (2 had
220GB hard disk space, rest had 450GB).
Most of the space on the first two machines, with 250GB disk space, was consumed.
Now we added two new machines each with 450GB hard disk space as data nodes.
Is there any way to redistribute files on HDFS so that there will be
considerable free space left on the first two machines without
downloading the files to one local machine and then uploading it back on
Ted Dunning, CTO