[ https://issues.apache.org/jira/browse/HBASE-16032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yu Li updated HBASE-16032:
--------------------------
     Attachment: HBASE-16032_v4.patch

Updated the patch to resolve the issue Enis pointed out: rolled back to the try-catch approach in the StoreScanner constructor, to avoid missing the StoreFile update in resetKVHeap when a flush or bulk load happens.

Possible memory leak in StoreScanner
------------------------------------

Key: HBASE-16032
URL: https://issues.apache.org/jira/browse/HBASE-16032
Project: HBase
Issue Type: Bug
Affects Versions: 1.2.1, 1.1.5, 0.98.20
Reporter: Yu Li
Assignee: Yu Li
Fix For: 2.0.0, 1.1.6, 1.3.1, 0.98.21, 1.2.3

Attachments: HBASE-16032.patch, HBASE-16032_v2.patch, HBASE-16032_v3.patch, HBASE-16032_v4.patch


We observed frequent full GCs on RegionServers in our production environment. After analyzing the heap dump, we found that HStore#changedReaderObservers was occupying a large amount of memory: the map surprisingly contained 7500w objects (about 75 million, reading "w" as 10,000).
After some debugging, I located a possible memory leak in the StoreScanner constructor:
{code}
public StoreScanner(Store store, ScanInfo scanInfo, Scan scan, final NavigableSet<byte[]> columns,
    long readPt)
    throws IOException {
  this(store, scan, scanInfo, columns, readPt, scan.getCacheBlocks());
  if (columns != null && scan.isRaw()) {
    throw new DoNotRetryIOException("Cannot specify any column for a raw scan");
  }
  matcher = new ScanQueryMatcher(scan, scanInfo, columns,
      ScanType.USER_SCAN, Long.MAX_VALUE, HConstants.LATEST_TIMESTAMP,
      oldestUnexpiredTS, now, store.getCoprocessorHost());
  this.store.addChangedReaderObserver(this);
  // Pass columns to try to filter out unnecessary StoreFiles.
  List<KeyValueScanner> scanners = getScannersNoCompaction();
  ...
  seekScanners(scanners, matcher.getStartKey(), explicitColumnQuery
      && lazySeekEnabledGlobally, parallelSeekEnabled);
  ...
  resetKVHeap(scanners, store.getComparator());
}
{code}
If any exception is thrown after {{this.store.addChangedReaderObserver(this)}}, the constructor never returns, so the caller's scanner reference stays null and there is no chance to remove the scanner from changedReaderObservers, as in {{HRegion#get}}:
{code}
RegionScanner scanner = null;
try {
  scanner = getScanner(scan);
  scanner.next(results);
} finally {
  if (scanner != null)
    scanner.close();
}
{code}
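The leak described above can be reproduced in isolation. The following is only a minimal sketch, not HBase code: {{DemoStore}}, {{LeakyScanner}} and {{LeakDemo}} are hypothetical stand-ins, and the failure is simulated with a flag. It shows that when a constructor registers {{this}} and then throws, the caller's {{finally}} block sees a null reference and the registration is never undone.

```java
import java.io.IOException;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the store: it only tracks registered observers.
class DemoStore {
    final Set<Object> observers = ConcurrentHashMap.newKeySet();

    void addChangedReaderObserver(Object o) { observers.add(o); }

    void deleteChangedReaderObserver(Object o) { observers.remove(o); }
}

// Hypothetical stand-in for the scanner: registers itself, then may fail.
class LeakyScanner implements AutoCloseable {
    private final DemoStore store;

    LeakyScanner(DemoStore store, boolean failDuringInit) throws IOException {
        this.store = store;
        store.addChangedReaderObserver(this); // `this` escapes before construction completes
        if (failDuringInit) {
            // Simulates an exception from the setup steps that follow registration.
            throw new IOException("simulated failure during scanner setup");
        }
    }

    @Override
    public void close() { store.deleteChangedReaderObserver(this); }
}

class LeakDemo {
    // Same shape as the caller above; returns how many observers remain registered.
    static int leakedObservers(boolean failDuringInit) {
        DemoStore store = new DemoStore();
        LeakyScanner scanner = null;
        try {
            scanner = new LeakyScanner(store, failDuringInit);
        } catch (IOException e) {
            // Swallowed here only so the demo can inspect the store afterwards.
        } finally {
            if (scanner != null) {
                scanner.close(); // skipped when the constructor threw
            }
        }
        return store.observers.size();
    }

    public static void main(String[] args) {
        System.out.println("observers left, normal path: " + leakedObservers(false));
        System.out.println("observers left, failed constructor: " + leakedObservers(true));
    }
}
```

On the normal path the observer set ends up empty. On the failing path one entry stays behind per failed construction, which is how the set can grow without bound under repeated failures.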
What's more, any exception thrown in the {{HRegion#getScanner}} path leaves scanner == null and therefore leaks memory in the same way, so we also need to handle this part.
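One way to handle this, in line with the try-catch approach mentioned in the patch comment, is to undo the registration inside the constructor when the remaining setup fails. The sketch below is an assumption about the general pattern, not the content of the attached patch: {{ObserverRegistry}}, {{GuardedScanner}}, {{GuardedDemo}} and the {{initialize}} helper are hypothetical names.

```java
import java.io.IOException;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the store's observer bookkeeping.
class ObserverRegistry {
    private final Set<Object> observers = ConcurrentHashMap.newKeySet();

    void add(Object o) { observers.add(o); }

    void remove(Object o) { observers.remove(o); }

    int size() { return observers.size(); }
}

// Hypothetical scanner that deregisters itself if construction fails.
class GuardedScanner implements AutoCloseable {
    private final ObserverRegistry registry;

    GuardedScanner(ObserverRegistry registry, boolean failDuringInit) throws IOException {
        this.registry = registry;
        registry.add(this);
        try {
            initialize(failDuringInit);
        } catch (IOException | RuntimeException e) {
            // No caller will ever hold a reference to this object,
            // so the constructor itself has to undo the registration.
            registry.remove(this);
            throw e;
        }
    }

    // Placeholder for the setup work that runs after registration.
    private void initialize(boolean fail) throws IOException {
        if (fail) {
            throw new IOException("simulated failure during scanner setup");
        }
    }

    @Override
    public void close() { registry.remove(this); }
}

class GuardedDemo {
    // Returns the number of observers still registered after a failed construction.
    static int observersAfterFailedConstruction() {
        ObserverRegistry registry = new ObserverRegistry();
        try (GuardedScanner scanner = new GuardedScanner(registry, true)) {
            // Not reached: the constructor throws.
        } catch (IOException expected) {
            // The failure is expected in this demo.
        }
        return registry.size();
    }

    public static void main(String[] args) {
        System.out.println("observers left after failure: " + observersAfterFailedConstruction());
    }
}
```

Cleaning up in the constructor covers every caller at once. The same guard would be needed at each level of the {{HRegion#getScanner}} path that registers state before it can fail.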


--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Thread posted: Jun 15, '16 at 2:51p
Thread last active: Jun 18, '16 at 11:54a