You can improve on the current situation with a little
development effort, as we have (privately) at my employer.
Comments inline below.
From: Cosmin Lehene
My answers inline...
On 1/7/09 12:05 PM, "Genady" wrote:
Could somebody explain the expected behavior of HBase
0.18.1 nodes in case of Hadoop cluster failover, for the
following scenarios:
- HBase master region server fails
You need to manually configure a different machine as the
master, redistribute the configuration files to all of the
other region servers, and restart the cluster.
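For reference, in the pre-ZooKeeper releases the master
location comes from the hbase.master property (host:port) in
hbase-site.xml, which is why the configuration has to be
redistributed everywhere. A minimal sketch of pointing a
configuration at the promoted machine; "newmaster" is just a
placeholder host name:

    import org.apache.hadoop.hbase.HBaseConfiguration;

    // Sketch only: in practice this property is set in
    // hbase-site.xml on every region server; "newmaster" is a
    // placeholder host name.
    public class PointAtNewMaster {
      public static void main(String[] args) {
        HBaseConfiguration conf = new HBaseConfiguration();
        // Pre-ZooKeeper HBase locates the master through this
        // host:port setting.
        conf.set("hbase.master", "newmaster:60000");
        System.out.println("hbase.master = " + conf.get("hbase.master"));
      }
    }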
Maybe someone on the development team could explain whether
this will change with the ZooKeeper integration.
First, my understanding is that the ZK integration will handle
master role reassignment without requiring a cluster restart.
J-D could say more (or deny).
What we (my employer) currently do is run the Hadoop and
HBase daemons as child processes of custom monitoring
daemons. These monitors write heartbeats into the cloud
using a private DHT that supports TTLs on cells, and the
same mechanism also supports service discovery. In
particular, all HBase processes can be automatically
restarted should the location of the master shift (as it may
if a node has a hard failure). We write Hadoop and HBase
configuration files on the fly as necessary.
This all took me only a few days to implement and a few more
to debug.
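To make that concrete, here is a very rough sketch of the
monitoring loop. Our DHT is private, so the HeartbeatStore
interface and every name below is made up for illustration;
the point is just a TTL'd heartbeat plus restarting the
child when the discovered master location moves.

    import java.util.concurrent.TimeUnit;

    // Illustration only: HeartbeatStore stands in for a private DHT.
    // Cells written with a TTL expire on their own if the writer
    // stops heartbeating.
    interface HeartbeatStore {
      void put(String key, String value, long ttlMillis);
      String get(String key); // null if the cell is absent or expired
    }

    class RegionServerMonitor {
      private final HeartbeatStore store;
      private final String hostname;
      private Process child;     // the region server child process
      private String lastMaster; // master location the child started with

      RegionServerMonitor(HeartbeatStore store, String hostname) {
        this.store = store;
        this.hostname = hostname;
      }

      void run() throws Exception {
        lastMaster = store.get("hbase/master");
        child = startChild(lastMaster);
        while (true) {
          // Heartbeat with a TTL: it disappears by itself if this
          // daemon dies.
          store.put("hbase/regionserver/" + hostname, "alive", 30000);
          // Service discovery: if the master has moved, rewrite the
          // config files and bounce the child process.
          String master = store.get("hbase/master");
          if (master != null && !master.equals(lastMaster)) {
            child.destroy();
            child = startChild(master);
            lastMaster = master;
          }
          TimeUnit.SECONDS.sleep(10);
        }
      }

      private Process startChild(String master) throws Exception {
        // Placeholder command; this is also where hbase-site.xml
        // would be rewritten on the fly to point at the current master.
        return new ProcessBuilder("/opt/hbase/bin/hbase",
            "regionserver", "start").start();
      }
    }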
Relocation of the Hadoop name node is trickier. I believe it
is possible to have it write the fs image out to an NFS
share such that a service relocation to another host with
the same NFS mount will pick up the latest edits seamlessly.
However, I do not trust NFS under a number of failure
conditions, so I will not try this myself. There may be
other, better strategies for replicating the fs image.
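If someone does want to try it, the knob is dfs.name.dir,
which takes a comma-separated list of directories; the name
node keeps a copy of its metadata in each one, so listing a
local directory plus an NFS mount gives an off-host copy. A
sketch with placeholder paths (normally this setting lives
in hadoop-site.xml):

    import org.apache.hadoop.conf.Configuration;

    // Sketch only: the setting normally lives in hadoop-site.xml;
    // the paths below are placeholders.
    public class NameDirExample {
      public static void main(String[] args) {
        Configuration conf = new Configuration();
        // dfs.name.dir takes a comma-separated list; the name node
        // keeps a copy of its metadata in every directory listed.
        conf.set("dfs.name.dir", "/local/hadoop/name,/mnt/nfs/hadoop/name");
        System.out.println("dfs.name.dir = " + conf.get("dfs.name.dir"));
      }
    }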
- HBase slave region server fails
This is handled transparently. Regions served by the failed
region server are reassigned to the rest of the region
servers.
- Hadoop master server fails
I suppose you mean the HDFS namenode. Currently, the
namenode is a single point of failure in HDFS and manual
intervention is needed to configure a new namenode. A
secondary namenode can be configured; however, it only keeps
a replica of the metadata and does not act as a failover
node.
http://wiki.apache.org/hadoop/NameNode
See my comments above.
- Hadoop slave server fails
If an HDFS datanode fails, its data is already replicated
on two other datanodes. Eventually the namenode will repair
the replication by creating a third replica on one of the
remaining datanodes.
You want to make sure your client is requesting the default
replication. The stock Hadoop configuration does allow a DFS
client to request a replication factor of only 1. However,
the HBase DFS client always requests the default, so this is
not an issue for HBase.
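If you do have non-HBase clients writing into the same DFS,
it may be worth checking what replication they request. A
small sketch using the standard FileSystem API; the path and
the target factor below are just examples:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    // Sketch only: the path and the target factor are examples.
    public class ReplicationCheck {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        Path p = new Path("/user/example/somefile");
        // What replication was requested when the file was written...
        short current = fs.getFileStatus(p).getReplication();
        System.out.println(p + ": replication = " + current);
        // ...and ask the namenode to bring it up to 3 if it was lower.
        fs.setReplication(p, (short) 3);
      }
    }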
- Hadoop master and HBase master both fail (in case
they're installed on the same computer and, for
instance, the disk has failed)
These servers run independently, so see above for what
happens to each.
Don't run them on the same node regardless. The Hadoop name
node can become very busy given a lot of DFS file system
activity. Let it have its own dedicated node to avoid
problems, e.g. replication stalls.
- HBase slave region server has failed, but the HBase
data could be recovered and copied to another node,
and the new node added in place of the failed one.
HBase region servers don't actually hold the data; data
is stored in HDFS. Region servers just serve regions,
and when a region server fails its regions are
reassigned (see above).
Cosmin
- Andy