Greetings all. I have been observing some interesting problems that
sometimes make an hbase start/restart very hard to achieve. Here is a
situation:

Power goes out on a rack, killing some datanodes and some regionservers.

We power things back on, HDFS reports all datanodes back to normal,
and we cold-restart hbase.
Obviously we have some log files in the /hbase/.logs directory on
HDFS. So, when the master starts, it scans that directory and attempts to
replay the logs and insert all the data into the region files; so far
so good...
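
Before starting the master, it can help to take stock of what is actually
sitting in that directory. A quick look with the stock fs commands (only a
sketch, assuming the default /hbase root used above):

# list the per-regionserver WAL directories left behind by the outage
hadoop fs -ls /hbase/.logs
# recursively list the individual log files and their sizes
hadoop fs -lsr /hbase/.logs
# rough total size of the logs the master will have to split and replay
hadoop fs -dus /hbase/.logs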

Now, in some instances, we get this message:

hbase-root-master-rdag1.prod.imageshack.com.log.2011-01-02:2011-01-02
20:47:37,343 WARN org.apache.hadoop.hbase.util.FSUtils: Waited
121173ms for lease recovery on
hdfs://namenode:9000/hbase/.logs/mtae6.prod.imageshack.com,60020,1293990443946/10.103.5.6%3A60020.1294005303076:org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException:
failed to create file
/hbase/.logs/mtae6.prod.imageshack.com,60020,1293990443946/10.103.5.6%3A60020.1294005303076
for DFSClient_hb_m_10.101.7.1:60000_1294029805305 on client
10.101.7.1, because this file is already being created by NN_Recovery
on 10.101.7.1
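
While the master is stuck like this, one way to confirm that HDFS still
considers the file under construction is to ask the namenode directly
(a sketch; assumes the standard fsck options are present on this Hadoop
version):

# show files under the WAL directory that are still open for write
hadoop fsck /hbase/.logs -openforwrite -files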

Those messages (in master.log) will spew continuously and hbase will
not start. My understanding is that the namenode, or maybe some datanode,
is holding a lease on the file, and the master is unable to process it.
Left by itself, the problem will not go away. The only way to resolve
it is to shut down the master and do:

hadoop fs -cp /hbase/.logs/* /tmp/.logs
hadoop fs -rm /hbase/.logs/*
hadoop fs -mv /tmp/.logs/* /hbase/.logs/

Start the master, and things are back to normal (all logs replay, the master starts).
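
For reference, the same workaround wrapped into a small script (purely a
sketch of the manual steps above, not anything HBase ships; /tmp/.logs is
just the scratch location used here, and -rmr is used in case the entries
under /hbase/.logs are per-regionserver directories rather than plain files):

#!/bin/sh
# Copy the WALs aside and move them back so the master re-opens fresh
# files without the stale lease. Run with the master shut down.
hadoop fs -mkdir /tmp/.logs
hadoop fs -cp '/hbase/.logs/*' /tmp/.logs
hadoop fs -rmr '/hbase/.logs/*'
hadoop fs -mv '/tmp/.logs/*' /hbase/.logs/
# Start the master afterwards and let it replay the logs.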
So, a question -- is there some sort of HDFS setting (or are we hitting a
bug) to instruct the lease to be removed automatically? A timer,
maybe? Could the master be granted the authority to copy the file to a
new name and then replay it? It seems silly that the master shouldn't be
able to do that; after all, it's an hbase log file anyway.
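
For what it's worth, in this generation of HDFS the lease soft and hard
limits are compiled-in constants (roughly one minute and one hour), not
configuration settings, and the piece that usually matters for WAL lease
recovery is whether the cluster is running an append-capable HDFS with
dfs.support.append enabled. A quick way to check (hypothetical paths;
adjust HADOOP_HOME/HBASE_HOME for your install):

# is the append/sync support HBase relies on switched on?
grep -A 1 'dfs.support.append' $HADOOP_HOME/conf/hdfs-site.xml
grep -A 1 'dfs.support.append' $HBASE_HOME/conf/hbase-site.xml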

Next, there is this situation:

2011-01-02 20:56:58,219 WARN org.apache.hadoop.hdfs.DFSClient: Error
Recovery for block blk_-1736208949609845257_8228359 failed because
recovery from primary datanode 10.101.1.6:50010 failed 6 times.
Pipeline was 10.101.6.1:50010, 10.103.5.8:50010, 10.103.5.6:50010,
10.101.1.6:50010. Marking primary datanode as bad.

Here /hbase/.logs/log_name exists, but the data is missing completely.
It seems this empty file persists after an hbase/hdfs crash. The only
solution is to perform the above (cp, rm, mv), or simply to delete those
files by hand. Now, is it possible for the master to do that?
The master should be able to detect invalid files in the .logs/ dir and get
rid of them without operator interaction; is there some sort of
design element that I am simply missing?
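
A quick way to spot these zero-length leftovers before starting the master
(a sketch; assumes the 0.20-style -lsr output where the size is the fifth
column, and note that directories also report a size of 0):

# print paths of zero-byte entries under the WAL directory
hadoop fs -lsr /hbase/.logs | awk '$5 == 0 {print $8}'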

Thanks.

-Jack


  • Todd Lipcon at Jan 9, 2011 at 2:51 am
    Hi Jack,

    Do you have a rack topology script set up for HDFS?

    -Todd
    --
    Todd Lipcon
    Software Engineer, Cloudera
  • Jack Levin at Jan 9, 2011 at 2:57 am
Sure, this does not resolve the lease issue. To reproduce, just restart the namenode, have the hbase HDFS clients fail, then try a cold restart of the cluster.
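
In case it helps anyone reproducing this, the sequence is roughly the
following (a sketch using the stock start/stop scripts; paths assume a
standard tarball layout):

# 1. restart the namenode out from under the running cluster
$HADOOP_HOME/bin/hadoop-daemon.sh stop namenode
$HADOOP_HOME/bin/hadoop-daemon.sh start namenode

# 2. once the regionserver HDFS clients have failed their WAL writes,
#    cold-restart HBase
$HBASE_HOME/bin/stop-hbase.sh
$HBASE_HOME/bin/start-hbase.sh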

    -Jack

