Phantom
at Jul 17, 2007 at 5:57 pm
Here is the scenario I was concerned about. Consider three nodes in the
system, A, B and C, which are placed in different racks, say. Let us say that
the disk on A fries today. Now the blocks that were stored on A are not
going to be re-replicated (this is my understanding, but I could be wrong in
this assumption) to some other node or to the new disk with which you would
bring A back. Then a month later the disk on B could fry, and another
month later the disk on C could fry. This way you could slowly start losing data
in the absence of a replica synchronization algorithm like the one in S3. This
would never happen in S3 because there is always a replica synchronization
algorithm running to guarantee that there will always be 3
replicas in the system. So if a disk fries, the data is re-replicated.
Of course there is no way to protect oneself from the 3 machines that store
the replicas all losing their disks at the same time.
So I was wondering whether there is a replica synchronization algorithm in place,
or whether it is a feature that is planned for the future.
A
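To make the scenario above concrete, here is a toy sketch in Python. It is not HDFS or S3 code; the function name `surviving_replicas` and its parameters are made up for illustration, and it assumes each disk failure destroys exactly one replica of a given block.

```python
def surviving_replicas(failures, replicas=3, re_replicate=False):
    """Toy model (hypothetical, not an HDFS API): each entry in `failures`
    destroys one live replica of a single block. Returns the replica
    count recorded after each failure."""
    live = replicas
    history = []
    for _ in failures:
        if live > 0:
            live -= 1            # one disk holding a replica dies
        if re_replicate and live > 0:
            live = replicas      # copy from a survivor back up to the target
        history.append(live)
    return history


months = ["A", "B", "C"]         # disks failing one month apart
print(surviving_replicas(months))                     # [2, 1, 0]
print(surviving_replicas(months, re_replicate=True))  # [3, 3, 3]
```

Without the repair step the count drifts down to zero over the three months; with it, each failure is repaired before the next one arrives, which is the guarantee being asked about.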
On 7/17/07, Ted Dunning wrote:
Assuming that you have many more disks than 3, the chance that 3
simultaneous disk failures hit just the right 3 disks is much lower than the
chance of losing any 3 disks. This is enhanced by the ability of Hadoop to
allocate files in different racks, since one of the few mechanisms that
coordinates failures is losing an entire rack.
For example, if you have 20 disks, then the chance of losing a particular
set of three disks, given that you are losing 3 disks, is about one in a
thousand (assuming independent error location), and it should be impossible if
the failures are rack-aligned.
Remember, you can always increase the number of replicas if you like.
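The "one in a thousand" figure above can be checked with a quick calculation. This is only a sketch, assuming the 3 failed disks are equally likely to be any 3 of the 20; the helper name `chance_specific_set` is made up for illustration.

```python
from math import comb


def chance_specific_set(total_disks: int, replicas: int) -> float:
    """Probability that `replicas` failed disks, drawn uniformly at random
    from `total_disks`, are exactly the ones holding a given block."""
    return 1 / comb(total_disks, replicas)


print(comb(20, 3))                 # 1140 ways to pick 3 disks out of 20
print(chance_specific_set(20, 3))  # roughly 0.00088, about 1 in a thousand
print(comb(20, 4))                 # 4845, so a 4th replica makes it rarer still
```

Under the same assumption, raising the replication factor from 3 to 4 takes the odds from 1 in 1140 to 1 in 4845.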
On 7/17/07 12:55 AM, "Phantom" wrote:
Is replica management built into HDFS? What I mean is, if I set the replication
factor to 3 and I lose 3 disks, is that data lost forever? All 3
disks dying at the same time is a far-fetched scenario, I know, but if they
die over a certain period of time, does HDFS re-replicate the data to ensure
that there are always 3 copies in the system?
Thanks
A