I was thinking the same.
In an ideal world, I guess, the namenodes would be quorum based.
Clients would be aware of all the namenodes and fire updates in parallel to all of them, and an update would not return until N namenodes had confirmed it.
To make it easier, one could initially require that all namenodes receive the update and that only one namenode serve lookups.
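
A rough sketch of the client side of that idea, just to make it concrete. Everything here is hypothetical: FakeNamenode is an in-memory stand-in, and quorum_update and required_acks are names made up for illustration, not anything that exists in Hadoop.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


class FakeNamenode:
    """Hypothetical in-memory stand-in for a namenode; not a Hadoop API."""

    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy
        self.namespace = {}

    def apply_update(self, path, value):
        # An unhealthy namenode behaves like an unreachable host.
        if not self.healthy:
            raise ConnectionError(f"{self.name} is unreachable")
        self.namespace[path] = value
        return True


def quorum_update(namenodes, path, value, required_acks):
    """Send the update to every namenode in parallel.

    Returns True as soon as `required_acks` namenodes have confirmed,
    False if every namenode has answered and the quorum was not reached.
    """
    pool = ThreadPoolExecutor(max_workers=len(namenodes))
    try:
        futures = [pool.submit(nn.apply_update, path, value) for nn in namenodes]
        acks = 0
        for future in as_completed(futures):
            try:
                if future.result():
                    acks += 1
            except ConnectionError:
                continue  # a failed namenode simply does not count
            if acks >= required_acks:
                return True
        return False
    finally:
        # Do not block on stragglers once the outcome is known.
        pool.shutdown(wait=False)


nodes = [FakeNamenode("nn1"), FakeNamenode("nn2"), FakeNamenode("nn3", healthy=False)]
print(quorum_update(nodes, "/user/lars", "mkdir", required_acks=2))
print(quorum_update(nodes, "/user/ryan", "mkdir", required_acks=len(nodes)))
```

Setting required_acks to len(namenodes) gives the simpler "all namenodes must receive the update" variant. Note that an update that fails to reach its quorum may still have been applied on some namenodes; the sketch does nothing about that.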
When namenodes come up they somehow need to synchronize with each other. (waving hands here)
When a namenode fails there would still be some manual intervention to remove it from the quorum, or to switch to another one if this was the main namenode.
This would put twice the traffic on the network (w.r.t. namenode traffic), slow the clients to the speed of the slowest namenode, etc. etc.
And nothing is ever simple in distributed computing; there will be 1000 corner cases to consider.
The NFS approach will at least work, and it is easier to reason about the state the system is in.
-- Lars
________________________________
From: Ryan Rawson <ryanobjc@gmail.com>
To: user@hbase.apache.org
Cc: hadoop-user@lucene.apache.org
Sent: Thursday, August 18, 2011 9:57 PM
Subject: Re: Avatar namenode?
There are a few problems for Avatar node which would prevent me from
ever using it:
- assumption of highly available NFS, this would typically mean
specialized hardware
- failover time is potentially lengthy (article says 60 seconds), and
HBase regionservers might fail
It's an interesting hack, I hear it was only 800 LOC, but I'm not sure
you can run it if you are not Dhruba.
On Thu, Aug 18, 2011 at 9:49 PM, Jack Levin wrote:
I don't think anyone except Facebook actually uses it. Their
case is special, as they have millions and millions of files in HDFS.
-Jack
On Thu, Aug 11, 2011 at 11:19 AM, shanmuganathan.r
wrote:
Hi All,
I am running HBase in distributed mode on a seven-node cluster with a backup master. HBase runs properly in the backup master environment. I want to run this HBase on top of a highly available Hadoop. I read about the Avatar node at the following link:
http://hadoopblog.blogspot.com/2010/02/hadoop-namenode-high-availability.html
I need more help regarding this Avatar namenode configuration.
1. Which IP should be given for the Datanode's fs.default.name?
2. Is there any good method other than Avatar available for a backup namenode?
Regards,
Shanmuganathan