Grokbase Groups HBase user May 2011

[HBase-user] mslab enabled jvm crash

Jack Levin
May 26, 2011 at 6:03 pm
It might sound crazy, but if you have plenty of CPU, consider lowering
your NewSize to something like 30MB. Your ParNew collections will be
more frequent, but hitting a CMS failure will be less likely. This is
what we have seen.
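
For anyone wanting to try this, here is a minimal sketch of how the young generation could be capped in conf/hbase-env.sh. The 30m value is the one suggested above. Setting -XX:MaxNewSize to the same value is an assumption added here so the young generation cannot grow back; it is not something stated in this thread.

```shell
# Sketch for conf/hbase-env.sh: cap the young generation (ParNew) at 30 MB.
# -XX:NewSize and -XX:MaxNewSize are standard HotSpot flags; the 30m value
# is the suggestion from this thread, not a tuned recommendation.
export HBASE_OPTS="$HBASE_OPTS -XX:NewSize=30m -XX:MaxNewSize=30m"
```

Expect many more, much shorter ParNew pauses after a change like this, so it is probably worth watching the GC log for a while before rolling it out to every region server.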

-Jack
On Thu, May 26, 2011 at 10:51 AM, Jack Levin wrote:
Wayne, we get CMS failures also, I am pretty sure they are
fragmentation related:

2011-05-26T09:20:00.304-0700: 206371.599: [GC 206371.599: [ParNew
(promotion failed): 76633K->76023K(76672K), 0.0924180 secs]206371.692:
[CMS: 11452308K->7142504K(12202816K), 13.5870310 secs]
11525447K->7142504K(12279488K), [CMS Perm : 18254K->18254K(30436K)]
icms_dc=0 , 13.6796820 secs] [Times: user=13.17 sys=0.64, real=13.68 secs]

The RS does not go away when this happens. If your disks are not
overloaded, you should consider flushing sooner and deeper, i.e. flush
larger chunks of memory and offload the load to the disks. This way
each flush frees up more memstore memory, and promotion from the YG to
tenured has a better chance of succeeding without a CMS failure.
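
To judge how close to the edge the old generation is when a promotion failure hits, the log line quoted in this message can be picked apart with sed and awk. This is only a rough sketch: the variable names are made up for illustration, and the patterns assume the exact ParNew/CMS log layout shown in this message.

```shell
# The promotion-failed entry from this message, joined onto one line.
line='2011-05-26T09:20:00.304-0700: 206371.599: [GC 206371.599: [ParNew (promotion failed): 76633K->76023K(76672K), 0.0924180 secs]206371.692: [CMS: 11452308K->7142504K(12202816K), 13.5870310 secs] 11525447K->7142504K(12279488K), [CMS Perm : 18254K->18254K(30436K)] icms_dc=0 , 13.6796820 secs] [Times: user=13.17 sys=0.64, real=13.68 secs]'

# Wall-clock pause in seconds: the "real=" field of the [Times: ...] block.
pause=$(printf '%s\n' "$line" | sed -n 's/.*real=\([0-9.]*\) secs.*/\1/p')

# Tenured occupancy before the collection and tenured capacity, both in KB,
# taken from "[CMS: <before>K-><after>K(<capacity>K)".
old_gen=$(printf '%s\n' "$line" |
  sed -n 's/.*\[CMS: \([0-9]*\)K->\([0-9]*\)K(\([0-9]*\)K).*/\1 \3/p')

# Occupancy as a whole percentage, truncated rather than rounded.
old_pct=$(printf '%s\n' "$old_gen" | awk '{ printf "%d", $1 * 100 / $2 }')

echo "pause=${pause}s tenured_occupancy=${old_pct}%"
```

To scan a whole log, the same extraction could be applied to every line that matches `grep 'promotion failed'`, as long as each entry sits on a single line in the file.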

-Jack.

PS. Wayne, are you on IM? I am "jacklevin74" on both AIM and Skype, let's chat.
On Thu, May 26, 2011 at 10:42 AM, Wayne wrote:
I left parnew alone (did not add any settings). I also did not increase the
heap. 8g with 50% for memstore. Below are the JVM settings.

The errors I pasted occurred after running for only maybe 12 hours. The
cluster as a whole has been running for 24 hours without dropping a node,
but short-lived CMFs are occurring.

Any recommendations?

export HBASE_OPTS="-XX:+UseCMSInitiatingOccupancyOnly
-XX:CMSInitiatingOccupancyFraction=65 -XX:+CMSParallelRemarkEnabled
-XX:+HeapDumpOnOutOfMemoryError -XX:+UseConcMarkSweepGC"

Thanks.

On Thu, May 26, 2011 at 1:30 PM, Stack wrote:
On Thu, May 26, 2011 at 9:00 AM, Wayne wrote:
Looking more closely I can see that we are still
getting Concurrent Mode Failures on some of the nodes but they are only
lasting for 10s so the nodes don't go away. Is this considered "normal"?
With CMSInitiatingOccupancyFraction=65 I would suspect this is not normal??

What configs are you running with now?  It looks like you either left
parnew as unbounded or else you set it to 256M max?   So you did not
change the parnew size?  Did you up your heap size?  I see you are
getting 'promotion failed'/'concurrent mode failure'.  How long has it
been running now?

St.Ack