Grokbase Groups: HBase user, July 2009
A few days ago, I played with the latest trunk to see how fault tolerance
works in 0.20. While running PerformanceEvaluation to generate load,
killing an HRS or the HMaster is not a big deal: the client recovers
within tens of seconds to a few minutes. This is good.

For multiple masters, it seems that I have to start the backup master manually with

bin/hbase-daemon.sh start master

This is OK, though it would be better if we could specify this as part of
hbase-site.xml or in a new conf/masters file.
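
For illustration, a conf-driven start could look something like the sketch below. The conf/backup-masters file name and the ssh loop are purely hypothetical; HBase does not provide them today:

# hypothetical conf/backup-masters: one backup-master hostname per line
while read host; do
  [ -z "$host" ] && continue                                      # skip blank lines
  # -n keeps ssh from swallowing the rest of the host list;
  # assumes HBASE_HOME is set here and the same path exists on each remote host
  ssh -n "$host" "$HBASE_HOME/bin/hbase-daemon.sh start master"
done < "$HBASE_HOME/conf/backup-masters"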

But stopping the backup master is messy... if I just do

bin/hbase-daemon.sh stop master

It will bring the whole cluster down. That's bad.

Not sure if we can do something like this:

1. If there is an active master elsewhere, "stop master" only makes the local
HMaster exit, without shutting down the whole cluster (a rough sketch follows below).
2. Otherwise, shut down the whole cluster as before.
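
A rough sketch of how item 1 could be approximated today with a wrapper script. The is_active_master check is hypothetical and stubbed out; a real version would compare this host against the master address recorded in ZooKeeper:

#!/usr/bin/env bash
# stop-master.sh (sketch): only shut the cluster down if this host runs the active master

# Hypothetical helper, stubbed to "not active"; a real check would look at
# the master znode in ZooKeeper and compare it with this host.
is_active_master() {
  return 1
}

if is_active_master; then
  # active master: cluster-wide shutdown, same behavior as today
  "$HBASE_HOME"/bin/hbase-daemon.sh stop master
else
  # backup master: just stop the local HMaster process via its pid file
  # (pid dir and file name assume the hbase-daemon.sh defaults)
  kill "$(cat "${HBASE_PID_DIR:-/tmp}/hbase-${USER}-master.pid")"
fi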

Any ideas?

Thanks,
Rong-En Fan


  • Jean-Daniel Cryans at Jul 16, 2009 at 4:46 pm
    Rong-En Fan,

    I agree that multi-master setups require manual steps, and the current lack
    of documentation does not help (it's on my list, though).

    I also agree that stopping a backup master shouldn't stop the cluster.
    Can you file a Jira? (kill -9 works well, by the way.)

    Regarding multi-master configuration, I personally ruled it out of 0.20.0,
    but do you think we should still include it for usability? Is it currently
    too rough?

    Thx,

    J-D

  • Stack at Jul 16, 2009 at 5:01 pm
    In this doc, http://wiki.apache.org/hadoop/Hbase/RollingRestart, I say kill -9 the master for now.
    St.Ack
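
    For reference, a minimal way to do that kill -9 on the backup master's host. The pid file location assumes the hbase-daemon.sh defaults (/tmp and the current user), so adjust if HBASE_PID_DIR or HBASE_IDENT_STRING is set differently:

      # run on the backup master's host; path assumes hbase-daemon.sh defaults
      kill -9 "$(cat "${HBASE_PID_DIR:-/tmp}/hbase-${USER}-master.pid")"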
  • Jonathan Gray at Jul 16, 2009 at 5:11 pm
    I think we should add a conf file for "backupmasters", or just reuse
    "masters" with the first host in the list always becoming the active
    master first (introducing a delay should ensure it grabs the ephemeral
    node first?).

    It shouldn't be too bad? If it's hard we could wait, but it seems like it
    would be fairly simple.

    JG
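
    A rough sketch of the staggered start JG describes, reusing the hypothetical host-list idea from earlier. The 10-second delay is arbitrary, and whether a fixed sleep reliably decides who grabs the ephemeral master node first is an open question:

      first=1
      while read host; do
        [ -z "$host" ] && continue
        [ "$first" -eq 0 ] && sleep 10     # give the first listed host a head start on the ephemeral node
        ssh -n "$host" "$HBASE_HOME/bin/hbase-daemon.sh start master"
        first=0
      done < "$HBASE_HOME/conf/masters"    # hypothetical file: first entry is the preferred active master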

