Grokbase Groups HBase user June 2010
Hi Jean,

It happened again today during a server restart. This involved a Hadoop
start followed by an HBase start.
There was also an exception when the HBase master came up, while reading a
file from Hadoop. Not sure if that is the problem.
I have pasted those logs too.

Current state of the system: master, zookeeper, region servers are all up.
But region servers are not connected to master.

Here are the logs ....

1. logs on hbase master and hadoop namenode.
hbase-master.out :

2. syslog on hbase master.

3. syslog on hbase regionservers. Posted one; the other is the same.

I did a netstat -tna to confirm that master is listening on port
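The same listening check can also be run from a region server host, which tests the network path to the master and not only the master's local socket table. A minimal Python sketch follows; the host and port are taken from the command line and are placeholders, not values from this thread.

```python
import socket
import sys


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable host, DNS failure and timeout.
        return False


if __name__ == "__main__":
    # Host and port are supplied by the caller (hypothetical values).
    if len(sys.argv) == 3:
        host, port = sys.argv[1], int(sys.argv[2])
        state = "open" if port_is_open(host, port) else "closed or unreachable"
        print(f"{host}:{port} is {state}")
    else:
        print("usage: check_port.py HOST PORT")
```

A "Connection refused" from this check while netstat on the master shows the port listening would suggest the region servers are resolving the master to a different address than the one it is bound to, though that is only one possible explanation.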

I restarted only the regionservers and they are able to connect fine.


On Fri, Jun 11, 2010 at 12:56 PM, Jean-Daniel Cryans wrote:

You can check the general health by using the web UI; it runs on the
master node at port 60010.
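If you want to script that check, something like the Python sketch below may be enough as a first probe. It only confirms that the master web UI answers over HTTP on port 60010; the root path "/" is an assumption, and a reachable UI does not by itself prove that region servers have checked in.

```python
import http.client
import urllib.request


def master_ui_reachable(host: str, port: int = 60010, timeout: float = 5.0) -> bool:
    """Return True if an HTTP GET on the master web UI returns status 200."""
    url = f"http://{host}:{port}/"  # root path is an assumption
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (OSError, http.client.HTTPException):
        # URLError and HTTPError are OSError subclasses, as are socket timeouts.
        return False
```

Calling `master_ui_reachable("master-host")` with your own master host name from cron or a monitoring agent gives a yes/no signal; the page itself still needs to be read to see how many region servers the master knows about.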

For the errors, the context you gave is so limited that giving any
meaningful answer is impossible. Please post full logs on a web server
or on (or your preferred code pasting site) if it fits.

On Fri, Jun 11, 2010 at 12:48 PM, ishwar ramani wrote:

I have an HBase/Hadoop cluster set up. Six days ago we did a cold restart
of our system.
I recently noticed that an HBase query was timing out with

org.apache.hadoop.hbase.client.NoServerForRegionException: Timed out trying
to locate root region

I looked at the master logs and none of the region servers had connected

2010-06-04 00:00:21,510 INFO org.apache.hadoop.hbase.master.ServerManager: 0 region servers, 0 dead, average load NaN
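Since this report is logged at INFO level, process monitoring and error-level log alerts will not catch it. One way to turn it into an alert is to parse the live-server count out of the line. The Python sketch below is modeled only on the single line quoted above; the function names are hypothetical, and other HBase versions may word the report differently.

```python
import re
from typing import Iterable, Optional, Tuple

# Pattern based on the one ServerManager line quoted in this thread
# (assumption: the master writes each report on a single log line).
_REPORT = re.compile(r"ServerManager:\s+(\d+)\s+region servers,\s+(\d+)\s+dead")


def parse_load_report(line: str) -> Optional[Tuple[int, int]]:
    """Return (live, dead) region server counts, or None if the line is not a report."""
    m = _REPORT.search(line)
    if m is None:
        return None
    return int(m.group(1)), int(m.group(2))


def latest_report_has_no_servers(lines: Iterable[str]) -> bool:
    """Return True if the most recent report in lines shows zero live region servers."""
    latest = None
    for line in lines:
        report = parse_load_report(line)
        if report is not None:
            latest = report  # keep only the newest report seen so far
    return latest is not None and latest[0] == 0
```

Feeding the tail of the master log through `latest_report_has_no_servers` and alerting on True would have flagged this state even though every process was up.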

The master had a stderr output when it started
org.apache.hadoop.ipc.RemoteException: Could not
complete write to file /hbase/devLogsTable/1225469767/oldlogfile.log by

The regionservers have been trying to connect with the master ever since
with the error

2010-06-03 14:33:28,960 WARN org.apache.hadoop.hbase.regionserver.HRegionServer: Unable to connect to master. Retrying. Error was: Connection refused

All the region server and master processes are running now, but none of
the region servers are connected.

My first question is how to monitor this problem. None of the logs report an
error. I monitor processes so they are all fine. The logs don't report any
How do I check the general health of the cluster?

My second question is why did this happen?


Discussion Overview
group: user @
categories: hbase, hadoop
posted: Jun 11, '10 at 7:48p
active: Jun 16, '10 at 7:55p


