Grokbase Groups HBase user July 2011
We aren't profiling right now. Here's what is in hbase-env.sh:

export TZ="US/Mountain"
export HBASE_OPTS="$HBASE_OPTS -XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xloggc:/home/hadoop/gc-hbase.log "
export HBASE_MANAGES_ZK=false
export HBASE_PID_DIR=/home/hadoop
export HBASE_HEAPSIZE=10240

Java is
java version "1.6.0_17"
Java(TM) SE Runtime Environment (build 1.6.0_17-b04)
Java HotSpot(TM) 64-Bit Server VM (build 14.3-b01, mixed mode)

We were planning an upgrade to 1.6.0_25 before we ran into this issue.


On Thu, Jul 14, 2011 at 3:59 PM, Stack wrote:

What Lohit says but also, what jvm are you running and what options
are you feeding it? The stack trace is a little crazy (especially the
mix in of resource bundle loading). We saw something similar over in
HBASE-3830 when someone was running profiler. Is that what is going
on here?

Thanks,
St.Ack

On Thu, Jul 14, 2011 at 11:36 AM, Matt Davies wrote:

Hey everyone,

We periodically see a situation where the regionserver process exists in the
process list, zookeeper thread sends the keepalive so the master won't
remove it from the active list, yet the regionserver will not serve data.

Hadoop(cdh3u0), HBase 0.90.3 (Apache version), under load from an internal
testing tool.


I've taken a jstack of the process and found this:

Found one Java-level deadlock:
=============================
"IPC Server handler 99 on 60020":
  waiting to lock monitor 0x0000000047f97000 (object 0x00002aaab8ef07e8, a org.apache.hadoop.hbase.regionserver.MemStoreFlusher),
  which is held by "IPC Server handler 64 on 60020"
"IPC Server handler 64 on 60020":
  waiting for ownable synchronizer 0x00002aaab8eea130, (a java.util.concurrent.locks.ReentrantLock$NonfairSync),
  which is held by "regionserver60020.cacheFlusher"
"regionserver60020.cacheFlusher":
  waiting to lock monitor 0x0000000047f97000 (object 0x00002aaab8ef07e8, a org.apache.hadoop.hbase.regionserver.MemStoreFlusher),
  which is held by "IPC Server handler 64 on 60020"

Java stack information for the threads listed above:
===================================================
"IPC Server handler 99 on 60020":
    at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.reclaimMemStoreMemory(MemStoreFlusher.java:434)
    - waiting to lock <0x00002aaab8ef07e8> (a org.apache.hadoop.hbase.regionserver.MemStoreFlusher)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:2529)
    at sun.reflect.GeneratedMethodAccessor27.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:570)
    at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1039)
"IPC Server handler 64 on 60020":
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for <0x00002aaab8eea130> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:747)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:778)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1114)
    at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:186)
    at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:262)
    at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.reclaimMemStoreMemory(MemStoreFlusher.java:435)
    - locked <0x00002aaab8ef07e8> (a org.apache.hadoop.hbase.regionserver.MemStoreFlusher)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:2529)
    at sun.reflect.GeneratedMethodAccessor27.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:570)
    at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1039)
"regionserver60020.cacheFlusher":
    at java.util.ResourceBundle.endLoading(ResourceBundle.java:1506)
    - waiting to lock <0x00002aaab8ef07e8> (a org.apache.hadoop.hbase.regionserver.MemStoreFlusher)
    at java.util.ResourceBundle.findBundle(ResourceBundle.java:1379)
    at java.util.ResourceBundle.findBundle(ResourceBundle.java:1292)
    at java.util.ResourceBundle.getBundleImpl(ResourceBundle.java:1234)
    at java.util.ResourceBundle.getBundle(ResourceBundle.java:832)
    at sun.util.resources.LocaleData$1.run(LocaleData.java:127)
    at java.security.AccessController.doPrivileged(Native Method)
    at sun.util.resources.LocaleData.getBundle(LocaleData.java:125)
    at sun.util.resources.LocaleData.getTimeZoneNames(LocaleData.java:97)
    at sun.util.TimeZoneNameUtility.getBundle(TimeZoneNameUtility.java:115)
    at sun.util.TimeZoneNameUtility.retrieveDisplayNames(TimeZoneNameUtility.java:80)
    at java.util.TimeZone.getDisplayNames(TimeZone.java:399)
    at java.util.TimeZone.getDisplayName(TimeZone.java:350)
    at java.util.Date.toString(Date.java:1025)
    at java.lang.String.valueOf(String.java:2826)
    at java.lang.StringBuilder.append(StringBuilder.java:115)
    at org.apache.hadoop.hbase.regionserver.PriorityCompactionQueue$CompactionRequest.toString(PriorityCompactionQueue.java:114)
    at java.lang.String.valueOf(String.java:2826)
    at java.lang.StringBuilder.append(StringBuilder.java:115)
    at org.apache.hadoop.hbase.regionserver.PriorityCompactionQueue.addToRegionsInQueue(PriorityCompactionQueue.java:145)
    - locked <0x00002aaab8f2dc58> (a java.util.HashMap)
    at org.apache.hadoop.hbase.regionserver.PriorityCompactionQueue.add(PriorityCompactionQueue.java:188)
    at org.apache.hadoop.hbase.regionserver.CompactSplitThread.requestCompaction(CompactSplitThread.java:140)
    - locked <0x00002aaab8894048> (a org.apache.hadoop.hbase.regionserver.CompactSplitThread)
    at org.apache.hadoop.hbase.regionserver.CompactSplitThread.requestCompaction(CompactSplitThread.java:118)
    - locked <0x00002aaab8894048> (a org.apache.hadoop.hbase.regionserver.CompactSplitThread)
    at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:393)
    at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:366)
    at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.run(MemStoreFlusher.java:240)
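
As jstack reports it, the cycle is between an object monitor and a ReentrantLock: handler 64 owns the MemStoreFlusher monitor and is parked on the ReentrantLock, while cacheFlusher owns the ReentrantLock and is waiting for the monitor. The following is only a minimal sketch of that general pattern, not HBase code: the class, lock objects, and thread names are all made up for illustration. It uses ThreadMXBean.findDeadlockedThreads() from java.lang.management, which reports cycles involving both monitors and ownable synchronizers.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantLock;

class LockOrderInversionDemo {
    // Hypothetical stand-ins for the MemStoreFlusher monitor and the ReentrantLock.
    static final Object MONITOR = new Object();
    static final ReentrantLock LOCK = new ReentrantLock();

    /** Starts two daemon threads that acquire the two locks in opposite order. */
    static void startDeadlock() {
        CountDownLatch firstLocksHeld = new CountDownLatch(2);

        Thread handler = new Thread(() -> {
            synchronized (MONITOR) {            // plays the role of handler 64: owns the monitor
                firstLocksHeld.countDown();
                awaitQuietly(firstLocksHeld);
                LOCK.lock();                    // parks forever: the flusher thread owns LOCK
                LOCK.unlock();
            }
        }, "demo-handler");

        Thread flusher = new Thread(() -> {
            LOCK.lock();                        // plays the role of cacheFlusher: owns the lock
            try {
                firstLocksHeld.countDown();
                awaitQuietly(firstLocksHeld);
                synchronized (MONITOR) {        // blocks forever: the handler owns MONITOR
                    // never reached
                }
            } finally {
                LOCK.unlock();
            }
        }, "demo-flusher");

        handler.setDaemon(true);                // daemon threads, so the JVM can still exit
        flusher.setDaemon(true);
        handler.start();
        flusher.start();
    }

    /** Names of the threads the JVM currently reports as deadlocked; empty if none. */
    static List<String> findDeadlockedThreadNames() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        long[] ids = mx.findDeadlockedThreads(); // null when no deadlock is found
        List<String> names = new ArrayList<>();
        if (ids == null) {
            return names;
        }
        for (ThreadInfo info : mx.getThreadInfo(ids)) {
            if (info != null) {
                names.add(info.getThreadName());
            }
        }
        return names;
    }

    /** Polls until a deadlock is reported or the timeout expires. */
    static List<String> waitForDeadlock(long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        List<String> names = findDeadlockedThreadNames();
        while (names.isEmpty() && System.currentTimeMillis() < deadline) {
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
            names = findDeadlockedThreadNames();
        }
        return names;
    }

    private static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        startDeadlock();
        System.out.println("Deadlocked threads: " + waitForDeadlock(5000));
    }
}
```

The sketch only shows how such a cycle forms and how a JVM can report it from the inside. It does not explain why the cacheFlusher frame in the dump above shows the monitor wait inside ResourceBundle.endLoading.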

Any ideas on how I could prevent this or let the master know about it? I've
written an app that will check all regionservers periodically for such a
lockup, but I can't run it constantly.
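
For illustration only, a check of this kind could scan jstack output for the deadlock report shown above. This is a hypothetical sketch, not the app mentioned here: the class and method names are invented, and the two marker strings are assumed from the dump in this message rather than taken from any format specification.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class JstackDeadlockScanner {
    // Marker strings assumed from the jstack output quoted in this thread.
    private static final String DEADLOCK_MARKER = "Java-level deadlock";
    private static final String STACKS_MARKER = "Java stack information";
    // A thread header line in the summary looks like: "IPC Server handler 99 on 60020":
    private static final Pattern THREAD_HEADER = Pattern.compile("^\"(.+)\":$");

    /**
     * Returns the thread names listed in the deadlock summary of a thread dump,
     * in the order they appear; returns an empty list if no summary is present.
     */
    static List<String> deadlockedThreads(String dump) {
        List<String> names = new ArrayList<>();
        boolean inSummary = false;
        for (String rawLine : dump.split("\\r?\\n")) {
            String line = rawLine.trim();
            if (line.contains(DEADLOCK_MARKER)) {
                inSummary = true;               // summary starts after this line
                continue;
            }
            if (line.startsWith(STACKS_MARKER)) {
                inSummary = false;              // per-thread stacks follow; stop collecting
                continue;
            }
            if (inSummary) {
                Matcher m = THREAD_HEADER.matcher(line);
                if (m.matches()) {
                    names.add(m.group(1));
                }
            }
        }
        return names;
    }
}
```

Given the text produced by running `jstack <pid>` against the regionserver process, a non-empty result would indicate that the JVM itself has reported a deadlock, and the list names the threads involved.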

I can provide more of the jstack if that is helpful.

-Matt

Discussion Overview
group: user @
categories: hbase, hadoop
posted: Jul 14, '11 at 6:37p
active: Jul 15, '11 at 6:03a
posts: 9
users: 4
website: hbase.apache.org