Grokbase Groups HBase user July 2010
Thanks all for your help with this, everything seems much more stable
for the moment. I have a backlog loading job to run over a great
deal of data, so I might separate out my region servers from my task
trackers for the time being.

Thanks again,


On 8 July 2010 17:46, Jean-Daniel Cryans wrote:
OS cache is good, glad you figured out your memory problem.

On Thu, Jul 8, 2010 at 2:03 AM, Jamie Cockrill wrote:
Morning all. Day 2 begins...

I discussed this with someone else earlier and they pointed out that
we also have task trackers running on all of those nodes, which will
affect the amount of memory being used when jobs are being run. Each
tasktracker had a maximum of 8 maps and 8 reduces configured per node,
with a JVM Xmx of 512mb each.  Clearly this implies a fully utilised
node will use 8*512mb + 8*512mb = 8GB of memory on tasks alone. That's
before the datanode does anything, or HBase for that matter.
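The arithmetic above can be sketched as a quick shell calculation, using the slot counts and Xmx quoted in this message:

```shell
# Worst-case task memory per node: (map slots + reduce slots) * child JVM Xmx
maps=8
reduces=8
xmx_mb=512
total_mb=$(( (maps + reduces) * xmx_mb ))
echo "${total_mb} MB"   # prints: 8192 MB, i.e. 8 GB before the datanode or HBase use anything
```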

As such, I've dropped it to 4 maps, 4 reduces per node and reduced the
Xmx to 256mb, giving a potential maximum task overhead of 2GB per
node. Running 'vmstat 20' now, under load from mapreduce jobs,
suggests that the actual free memory is about the same, but the memory
cache is much much bigger, which presumably is healthier as, in
theory, that ought to relinquish memory to processes that request it.
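For anyone wanting the same change, it would go in mapred-site.xml on each tasktracker node; a sketch using the values from this message (the property names are the 0.20-era ones, so check them against your install):

```xml
<!-- mapred-site.xml: cap task slots and per-task heap (values from this thread) -->
<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>4</value>
</property>
<property>
  <name>mapred.tasktracker.reduce.tasks.maximum</name>
  <value>4</value>
</property>
<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx256m</value>
</property>
```

The tasktrackers need a restart to pick these up.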

Let's see if that does the trick!



On 7 July 2010 19:30, Jean-Daniel Cryans wrote:
YouAreDead means that the region server's session was expired, GC
seems like your major problem. (file problems can happen after a GC
sleep because they were moved around while the process was sleeping,
you also get the same kind of messages with xcievers issue... sorry
for the confusion)

By over committing the memory I meant trying to fit too much stuff in
the amount of RAM that you have. I guess it's the map and reduce tasks
that eat all the free space? Why not lower their number?


On Wed, Jul 7, 2010 at 11:22 AM, Jamie Cockrill wrote:
PS, I've now reset my MAX_FILESIZE back to the default.  (from the 1GB
i raised it to). It caused me to run into a delightful
'YouAreDeadException' which looks very related to the Garbage
collection issues on the Troubleshooting page, as my Zookeeper session



On 7 July 2010 19:19, Jamie Cockrill wrote:
By overcommit, do you mean make my overcommit_ratio higher on each box
(it's at the default 50)? What I'm noticing at the moment
is that hadoop is taking up the vast majority of the memory on the

I found this article:
which Todd, it looks like you replied to. Does this sound like a
similar problem? No worries if you can't remember, it was back in
January! This article suggests reducing the amount of memory allocated
to Hadoop at startup, how would I go about doing this?

Thank you everyone for your patience so far. Sorry if this is taking
up a lot of your time.


On 7 July 2010 19:03, Jean-Daniel Cryans wrote:
swappinness at 0 is good, but also don't overcommit your memory!


On Wed, Jul 7, 2010 at 10:53 AM, Jamie Cockrill wrote:
I think you're right.

Unfortunately the machines are on a separate network to this laptop,
so I'm having to type everything across, apologies if it doesn't
translate well...

free -m gave:

             total       used       free
Mem:          7992       7939         53
-/+ buffers/cache:       7877        114
Swap:        23415        895      22519

I did this on another node that isn't being smashed at the moment and
the numbers came out similar, but the buffers/cache free was higher

'vmstat 20' is giving non-zero si and so's ranging between 3 and just
short of 5000.
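A quick way to watch just those two columns is to filter the vmstat output; a sketch, assuming the usual procps vmstat layout where si and so are fields 7 and 8:

```shell
# Print a line whenever vmstat reports swap-in/swap-out activity.
# Fields 7 and 8 are si/so in standard vmstat output; NR > 2 skips the headers.
vmstat 20 | awk 'NR > 2 && ($7 > 0 || $8 > 0) { print "swapping: si=" $7 " so=" $8 }'
```

This runs until interrupted, since vmstat keeps sampling every 20 seconds.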

That seems to be it I guess. Hadoop troubleshooting suggests setting
swappiness to 0, is that just a case of changing the value in
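For reference, on Linux boxes of this era the knob lives in /proc and /etc/sysctl.conf; a sketch of the usual procedure:

```shell
# Check the current value
cat /proc/sys/vm/swappiness

# Set it immediately (takes effect without a reboot)
sudo sysctl -w vm.swappiness=0

# Make it persistent across reboots
echo 'vm.swappiness = 0' | sudo tee -a /etc/sysctl.conf
```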



On 7 July 2010 18:40, Todd Lipcon wrote:
On Wed, Jul 7, 2010 at 10:32 AM, Jamie Cockrill wrote:

On the subject of GC and heap, I've left those as defaults. I could
look at those if that's the next logical step? Would there be anything
in any of the logs that I should look at?

One thing I have noticed is that it does take an absolute age to log
in to the DN/RS to restart the RS once it's fallen over, in one
instance it took about 10 minutes. These are 8GB, 4 core amd64 boxes

That indicates swapping. Can you run "free -m" on the node?

Also let "vmstat 20" run while running your job and observe the "si" and
"so" columns. If those are nonzero, it indicates you're swapping, and you've
oversubscribed your RAM (very easy on 8G machines)




On 7 July 2010 18:30, Jamie Cockrill wrote:
Bad news, it looks like my xcievers is set as it should be, it's in
the hdfs-site.xml and looking at the job.xml of one of my jobs in the
job-tracker, it's showing that property as set to 2047. I've cat |
grepped one of the datanode logs and although there were a few in
there, they were from a few months ago. I've upped my MAX_FILESIZE on
my table to 1GB to see if that helps (not sure if it will!).
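For anyone finding this thread later, the property in question goes in hdfs-site.xml on each datanode (the misspelling "xcievers" is the actual property name in this era; 2047 is the value from this message, and the datanodes need a restart to pick it up):

```xml
<!-- hdfs-site.xml: raise the datanode's concurrent xceiver thread limit -->
<property>
  <name>dfs.datanode.max.xcievers</name>
  <value>2047</value>
</property>
```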


On 7 July 2010 18:12, Jean-Daniel Cryans wrote:
xcievers exceptions will be in the datanodes' logs, and your problem
totally looks like it. 0.20.5 will have the same issue (since it's on
the HDFS side)


On Wed, Jul 7, 2010 at 10:08 AM, Jamie Cockrill wrote:
Hi Todd & JD,

All (hadoop and HBase) installed as of karmic-cdh3, which means:
Hadoop 0.20.2+228
HBase 0.89.20100621+17
Zookeeper 3.3.1+7

Unfortunately my whole cluster of regionservers has now crashed, so I
can't really say if it was swapping too much. There is a DEBUG
statement just before it crashes saying:

org.apache.hadoop.hbase.regionserver.wal.HLog: closing hlog writer in
hdfs://<somewhere on my HDFS, in /hbase>

What follows is:

WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer Exception:
org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease
on <file location as above> File does not exist. Holder
DFSClient_-11113603 does not have any open files

It then seems to try and do some error recovery (Error Recovery for
block null bad datanode[0] nodes == null), fails (Could not get block
locations. Source file "<hbase file as before>" - Aborting). There is
then an ERROR org.apache...HRegionServer: Close and delete failed.
There is then a similar LeaseExpiredException as above.

There are then a couple of messages from HRegionServer saying that
it's notifying master of its shutdown and stopping itself. The
shutdown hook then fires and the RemoteException and
LeaseExpiredExceptions are printed again.

ulimit is set to 65000 (it's in the regionserver log, printed as I
restarted the regionserver), however I haven't got the xceivers set
anywhere. I'll give that a go. It does seem very odd as I did have a
few of them fall over one at a time with a few early loads, but that
seemed to be because the regions weren't splitting properly, so all
the traffic was going to one node and it was being overwhelmed. Once I
throttled it, after one load a region split seemed to get
triggered, which flung regions all over and made subsequent loads
much more distributed. However, perhaps the time-bomb was ticking...
I'll have a go at specifying the xcievers property. I'm pretty
certain I've got everything else covered, except the patches as
referenced in the JIRA.

I just grepped some of the log files and didn't get an explicit
exception with 'xciever' in it.
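When the limit is actually being hit, the datanode logs an IOException mentioning the xceiver count; a sketch of the check, where the message wording is an assumption based on 0.20-era datanode logs:

```shell
# Hypothetical example of the message a datanode logs when the limit is hit,
# piped through the same grep you would run against the real logs:
echo "java.io.IOException: xceiverCount 2048 exceeds the limit of concurrent xcievers 2047" \
  | grep -c "xcievers"
```

Against a real install the equivalent would be something like `grep -i xcievers /var/log/hadoop/*datanode*.log` (the log path varies by distribution).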

I am considering downgrading(?) to 0.20.5, however because everything
is installed as per karmic-cdh3, I'm a bit reluctant to do so as
presumably Cloudera has tested each of these versions against each
other? And I don't really want to introduce further versioning issues.



On 7 July 2010 17:30, Jean-Daniel Cryans wrote:

Does your configuration meet the requirements?
ulimit and xcievers, if not set, are usually time bombs that blow up
when the cluster is under load.
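For the ulimit half of that, the usual fix on Linux is to raise the open-file limit for the user running the Hadoop/HBase daemons; a sketch where the user name "hadoop" and the value are illustrative, not from this thread:

```shell
# Check the current open-file limit for the current user
ulimit -n

# Raise it persistently in /etc/security/limits.conf (illustrative user/value)
echo 'hadoop  -  nofile  32768' | sudo tee -a /etc/security/limits.conf
```

A fresh login session is needed for the limits.conf change to take effect.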


On Wed, Jul 7, 2010 at 9:11 AM, Jamie Cockrill wrote:
Dear all,

My current HBase/Hadoop architecture has HBase region servers on the
same physical boxes as the HDFS data-nodes. I'm getting an awful lot
of region server crashes. The last thing that happens appears to be a
DroppedSnapshot Exception, caused by an IOException: could not
complete write to file <file on HDFS>. I am running it under heavy
load, though I'm not sure how "heavy" is quantified; I'm guessing
this is a load issue.

Is it common practice to put region servers on data-nodes? Is it
common to see region server crashes when either the HDFS or region
server (or both) is under heavy load? I'm guessing that is the case,
as I've seen a few similar posts. I've not got a great deal of capacity
to be separating region servers from HDFS data nodes, but it might be
an argument I could make.



Todd Lipcon
Software Engineer, Cloudera

Group: user@hbase
Categories: hbase, hadoop
Posted: Jul 7, '10 at 4:13p
Active: Jul 8, '10 at 11:02p


