Yu Li created HBASE-16032:
-----------------------------

              Summary: Possible memory leak in StoreScanner
                  Key: HBASE-16032
                  URL: https://issues.apache.org/jira/browse/HBASE-16032
              Project: HBase
           Issue Type: Bug
             Reporter: Yu Li
             Assignee: Yu Li


We observed frequent full GCs on a RegionServer (RS) in our production environment. After analyzing the heap dump, we found that HStore#changedReaderObservers was occupying a large amount of memory: surprisingly, the map contained 7500w (about 75 million) objects.

After some debugging, I located a possible memory leak in the StoreScanner constructor:
{code}
   public StoreScanner(Store store, ScanInfo scanInfo, Scan scan, final NavigableSet<byte[]> columns,
       long readPt)
   throws IOException {
     this(store, scan, scanInfo, columns, readPt, scan.getCacheBlocks());
     if (columns != null && scan.isRaw()) {
       throw new DoNotRetryIOException("Cannot specify any column for a raw scan");
     }
     matcher = new ScanQueryMatcher(scan, scanInfo, columns,
         ScanType.USER_SCAN, Long.MAX_VALUE, HConstants.LATEST_TIMESTAMP,
         oldestUnexpiredTS, now, store.getCoprocessorHost());

     this.store.addChangedReaderObserver(this);

     // Pass columns to try to filter out unnecessary StoreFiles.
     List<KeyValueScanner> scanners = getScannersNoCompaction();
     ...
     seekScanners(scanners, matcher.getStartKey(), explicitColumnQuery
         && lazySeekEnabledGlobally, parallelSeekEnabled);
     ...
     resetKVHeap(scanners, store.getComparator());
   }
{code}
If any exception is thrown after {{this.store.addChangedReaderObserver(this)}}, the constructor never returns a scanner, so the caller's reference stays null and nothing ever removes the scanner from {{changedReaderObservers}}. {{HRegion#get}} is one such caller:
{code}
     RegionScanner scanner = null;
     try {
       scanner = getScanner(scan);
       scanner.next(results);
     } finally {
       if (scanner != null)
         scanner.close();
     }
{code}
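
The failure mode described above can be reproduced in isolation. The following is a minimal sketch, not HBase code: {{Store}} and {{DemoScanner}} are hypothetical stand-ins for {{HStore}} and {{StoreScanner}}, and the "setup failure" is simulated with a flag. It only illustrates why the {{finally}} block cannot clean up when the constructor throws after registering.

```java
import java.util.ArrayList;
import java.util.List;

public class ObserverLeakDemo {
    /** Hypothetical stand-in for HStore: it only tracks registered observers. */
    static class Store {
        final List<Object> observers = new ArrayList<>();
        void addObserver(Object o) { observers.add(o); }
        void removeObserver(Object o) { observers.remove(o); }
    }

    /** Hypothetical stand-in for StoreScanner. */
    static class DemoScanner implements AutoCloseable {
        private final Store store;

        DemoScanner(Store store, boolean failDuringSetup) {
            this.store = store;
            store.addObserver(this);   // registration happens first ...
            if (failDuringSetup) {     // ... then later setup work fails
                throw new IllegalStateException("simulated seek failure");
            }
        }

        @Override
        public void close() { store.removeObserver(this); }
    }

    /** Mirrors the caller pattern quoted above; returns the observers left behind. */
    static int leakedAfterFailedOpen() {
        Store store = new Store();
        DemoScanner scanner = null;
        try {
            scanner = new DemoScanner(store, true);
        } catch (IllegalStateException e) {
            // The request fails; 'scanner' was never assigned.
        } finally {
            if (scanner != null) {
                scanner.close();       // never reached for the failed constructor
            }
        }
        return store.observers.size();
    }

    public static void main(String[] args) {
        System.out.println("leaked observers: " + leakedAfterFailedOpen());
    }
}
```

In this sketch the store still holds one observer after the failed open, because the only reference to the half-built scanner lives in the store's own list.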



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


  • Yu Li (JIRA) at Jun 15, 2016 at 2:56 pm
    [ https://issues.apache.org/jira/browse/HBASE-16032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Yu Li updated HBASE-16032:
    --------------------------
         Attachment: HBASE-16032.patch

    To solve the problem, we should catch any exception thrown after {{addChangedReaderObserver}} and, when one occurs, remove the StoreScanner from {{changedReaderObservers}}.

    Uploading a straightforward patch that follows this design.
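
The design described in this comment can be sketched as follows. This is an illustration under assumptions, not the attached patch: {{Store}}, {{GuardedScanner}} and {{openUnderlyingScanners}} are hypothetical names, and the failure is simulated with a flag.

```java
import java.io.IOException;
import java.util.HashSet;
import java.util.Set;

public class CleanupOnFailureDemo {
    /** Hypothetical stand-in for HStore. */
    static class Store {
        final Set<Object> observers = new HashSet<>();
        void addObserver(Object o) { observers.add(o); }
        void removeObserver(Object o) { observers.remove(o); }
    }

    /** Hypothetical scanner that deregisters itself if construction fails. */
    static class GuardedScanner {
        GuardedScanner(Store store, boolean failDuringSetup) throws IOException {
            store.addObserver(this);
            try {
                openUnderlyingScanners(failDuringSetup);
            } catch (IOException | RuntimeException e) {
                // Undo the registration before propagating the failure.
                store.removeObserver(this);
                throw e;
            }
        }

        /** Hypothetical placeholder for the seek/heap setup that may fail. */
        private static void openUnderlyingScanners(boolean fail) throws IOException {
            if (fail) {
                throw new IOException("simulated HDFS failure");
            }
        }
    }

    /** Returns how many observers remain registered after a failed construction. */
    static int observersAfterFailedOpen() {
        Store store = new Store();
        try {
            new GuardedScanner(store, true);
        } catch (IOException e) {
            // Expected in this demo.
        }
        return store.observers.size();
    }

    public static void main(String[] args) {
        System.out.println("observers after failure: " + observersAfterFailedOpen());
    }
}
```

The catch block rethrows the original exception unchanged, so callers see the same failure as before; only the registration is rolled back.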
  • Yu Li (JIRA) at Jun 15, 2016 at 2:59 pm
    [ https://issues.apache.org/jira/browse/HBASE-16032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Yu Li updated HBASE-16032:
    --------------------------
             Fix Version/s: 0.98.21
                            1.1.6
                            1.2.2
                            1.3.0
                            2.0.0
         Affects Version/s: 1.2.1
                            1.1.5
                            0.98.20
                    Status: Patch Available (was: Open)

    Submit patch for HadoopQA
  • Ted Yu (JIRA) at Jun 15, 2016 at 4:17 pm
    [ https://issues.apache.org/jira/browse/HBASE-16032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15331991#comment-15331991 ]

    Ted Yu commented on HBASE-16032:
    --------------------------------

    Have you considered moving the {{this.store.addChangedReaderObserver(this)}} call to the end of the constructor?
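
A minimal sketch of this suggestion, using hypothetical stand-in classes rather than the real HBase types: all work that can fail runs first, and registration is the last statement of the constructor, so a failed construction never leaves an entry behind.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class RegisterLastDemo {
    /** Hypothetical stand-in for HStore. */
    static class Store {
        final List<Object> observers = new ArrayList<>();
        void addObserver(Object o) { observers.add(o); }
    }

    /** Hypothetical scanner that registers only after setup has succeeded. */
    static class LateRegisteringScanner {
        LateRegisteringScanner(Store store, boolean failDuringSetup) throws IOException {
            if (failDuringSetup) {
                // Stands in for the seek/heap setup that may throw.
                throw new IOException("simulated setup failure");
            }
            // Nothing after this line can throw, so no cleanup path is needed.
            store.addObserver(this);
        }
    }

    /** Returns the observer count after one failed and one successful construction. */
    static int observersAfterOneFailureOneSuccess() throws IOException {
        Store store = new Store();
        try {
            new LateRegisteringScanner(store, true);
        } catch (IOException e) {
            // Expected in this demo; nothing was registered.
        }
        new LateRegisteringScanner(store, false);
        return store.observers.size();
    }

    public static void main(String[] args) throws IOException {
        System.out.println("observers: " + observersAfterOneFailureOneSuccess());
    }
}
```

Compared with a try-catch, this needs no rollback code. Whether the real constructor can tolerate registering late (for example, a reader change that arrives during setup would not be observed) is a separate question that this sketch does not address.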
  • Hadoop QA (JIRA) at Jun 15, 2016 at 7:06 pm
    [ https://issues.apache.org/jira/browse/HBASE-16032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15332335#comment-15332335 ]

    Hadoop QA commented on HBASE-16032:
    -----------------------------------
    (x) -1 overall

    || Vote || Subsystem || Runtime || Comment ||
    | +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
    | +1 | @author | 0m 0s | The patch does not contain any @author tags. |
    | -1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
    | +1 | mvninstall | 3m 9s | master passed |
    | +1 | compile | 0m 40s | master passed with JDK v1.8.0 |
    | +1 | compile | 0m 33s | master passed with JDK v1.7.0_79 |
    | +1 | checkstyle | 0m 54s | master passed |
    | +1 | mvneclipse | 0m 15s | master passed |
    | +1 | findbugs | 1m 59s | master passed |
    | +1 | javadoc | 0m 28s | master passed with JDK v1.8.0 |
    | +1 | javadoc | 0m 35s | master passed with JDK v1.7.0_79 |
    | +1 | mvninstall | 0m 48s | the patch passed |
    | +1 | compile | 0m 47s | the patch passed with JDK v1.8.0 |
    | +1 | javac | 0m 47s | the patch passed |
    | +1 | compile | 0m 32s | the patch passed with JDK v1.7.0_79 |
    | +1 | javac | 0m 32s | the patch passed |
    | +1 | checkstyle | 0m 52s | the patch passed |
    | +1 | mvneclipse | 0m 15s | the patch passed |
    | +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
    | +1 | hadoopcheck | 25m 53s | Patch does not cause any errors with Hadoop 2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. |
    | +1 | findbugs | 2m 17s | the patch passed |
    | +1 | javadoc | 0m 30s | the patch passed with JDK v1.8.0 |
    | +1 | javadoc | 0m 34s | the patch passed with JDK v1.7.0_79 |
    | -1 | unit | 76m 6s | hbase-server in the patch failed. |
    | +1 | asflicense | 0m 17s | Patch does not generate ASF License warnings. |
    | | | 117m 57s | |

    || Reason || Tests ||
    | Failed junit tests | hadoop.hbase.replication.TestReplicationSyncUpTool |
    | | hadoop.hbase.client.TestFastFail |
    | | hadoop.hbase.replication.TestReplicationSmallTests |
    | | hadoop.hbase.replication.multiwal.TestReplicationKillMasterRSCompressedWithMultipleWAL |
    | Timed out junit tests | org.apache.hadoop.hbase.client.TestReplicasClient |
    | | org.apache.hadoop.hbase.TestHBaseTestingUtility |
    | | org.apache.hadoop.hbase.client.TestTableSnapshotScanner |
    | | org.apache.hadoop.hbase.client.TestMobSnapshotCloneIndependence |

    || Subsystem || Report/Notes ||
    | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12810863/HBASE-16032.patch |
    | JIRA Issue | HBASE-16032 |
    | Optional Tests | asflicense javac javadoc unit findbugs hadoopcheck hbaseanti checkstyle compile |
    | uname | Linux asf900.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
    | Build tool | maven |
    | Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build@2/component/dev-support/hbase-personality.sh |
    | git revision | master / ae5fe1e |
    | Default Java | 1.7.0_79 |
    | Multi-JDK versions | /home/jenkins/tools/java/jdk1.8.0:1.8.0 /usr/local/jenkins/java/jdk1.7.0_79:1.7.0_79 |
    | findbugs | v3.0.0 |
    | unit | https://builds.apache.org/job/PreCommit-HBASE-Build/2228/artifact/patchprocess/patch-unit-hbase-server.txt |
    | unit test logs | https://builds.apache.org/job/PreCommit-HBASE-Build/2228/artifact/patchprocess/patch-unit-hbase-server.txt |
    | Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/2228/testReport/ |
    | modules | C: hbase-server U: hbase-server |
    | Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/2228/console |
    | Powered by | Apache Yetus 0.2.1 http://yetus.apache.org |

    This message was automatically generated.


  • Enis Soztutar (JIRA) at Jun 15, 2016 at 9:36 pm
    [ https://issues.apache.org/jira/browse/HBASE-16032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15332633#comment-15332633 ]

    Enis Soztutar commented on HBASE-16032:
    ---------------------------------------

    Do you know why we were getting so many exceptions? Is {{resetKVHeap()}} throwing them?
  • Yu Li (JIRA) at Jun 16, 2016 at 3:31 am
    [ https://issues.apache.org/jira/browse/HBASE-16032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15332998#comment-15332998 ]

    Yu Li commented on HBASE-16032:
    -------------------------------

    Yes, that was my first choice. I switched to try-catch hoping to cover all exception handling on the {{HRegion#getScanner}} path, but on a second look it clearly does not achieve that goal. Thanks for the reminder; please check the new patch for details.
  • Yu Li (JIRA) at Jun 16, 2016 at 3:38 am
    [ https://issues.apache.org/jira/browse/HBASE-16032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15333000#comment-15333000 ]

    Yu Li commented on HBASE-16032:
    -------------------------------

    Unfortunately, the logging level of RpcServer on the problematic machine was INFO rather than DEBUG, so we missed the exception details logged in {{CallRunner#run}}, and the customer on the client side didn't report any exceptions either. However, there was an HDFS problem during the period of RS full GCs, so my guess (sorry, it is only a guess) is that the cause was HDFS-related exceptions.
  • Yu Li (JIRA) at Jun 16, 2016 at 3:40 am
    [ https://issues.apache.org/jira/browse/HBASE-16032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Yu Li updated HBASE-16032:
    --------------------------
         Attachment: HBASE-16032_v2.patch

    Updated the patch to cover the exception handling that was missed on the {{HRegion#getScanner}} path.
  • Yu Li (JIRA) at Jun 16, 2016 at 3:48 am
    [ https://issues.apache.org/jira/browse/HBASE-16032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Yu Li updated HBASE-16032:
    --------------------------
         Description:
    We observed frequent full GCs on a RegionServer (RS) in our production environment. After analyzing the heap dump, we found that HStore#changedReaderObservers was occupying a large amount of memory: surprisingly, the map contained 7500w (about 75 million) objects.

    After some debugging, I located a possible memory leak in the StoreScanner constructor:
    {code}
       public StoreScanner(Store store, ScanInfo scanInfo, Scan scan, final NavigableSet<byte[]> columns,
           long readPt)
       throws IOException {
         this(store, scan, scanInfo, columns, readPt, scan.getCacheBlocks());
         if (columns != null && scan.isRaw()) {
           throw new DoNotRetryIOException("Cannot specify any column for a raw scan");
         }
         matcher = new ScanQueryMatcher(scan, scanInfo, columns,
             ScanType.USER_SCAN, Long.MAX_VALUE, HConstants.LATEST_TIMESTAMP,
             oldestUnexpiredTS, now, store.getCoprocessorHost());

         this.store.addChangedReaderObserver(this);

         // Pass columns to try to filter out unnecessary StoreFiles.
         List<KeyValueScanner> scanners = getScannersNoCompaction();
         ...
         seekScanners(scanners, matcher.getStartKey(), explicitColumnQuery
             && lazySeekEnabledGlobally, parallelSeekEnabled);
         ...
         resetKVHeap(scanners, store.getComparator());
       }
    {code}
    If any exception is thrown after {{this.store.addChangedReaderObserver(this)}}, the constructor never returns a scanner, so the caller's reference stays null and nothing ever removes the scanner from {{changedReaderObservers}}. {{HRegion#get}} is one such caller:
    {code}
         RegionScanner scanner = null;
         try {
           scanner = getScanner(scan);
           scanner.next(results);
         } finally {
           if (scanner != null)
             scanner.close();
         }
    {code}
    What's more, any exception thrown on the {{HRegion#getScanner}} path leaves {{scanner == null}} and therefore leaks memory in the same way, so we also need to handle that path.
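
One way to picture the {{HRegion#getScanner}} case is a region-level scanner that opens one store-level scanner per store. The sketch below rests on that assumption and uses hypothetical classes ({{Store}}, {{StoreLevelScanner}}, {{RegionLevelScanner}}); it is not the content of the v2 patch. If opening a later store fails, the scanners already opened (and already registered) are closed before the exception propagates.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PartialOpenCleanupDemo {
    /** Hypothetical stand-in for HStore; 'failOnOpen' simulates a failing store. */
    static class Store {
        final List<Object> observers = new ArrayList<>();
        final boolean failOnOpen;
        Store(boolean failOnOpen) { this.failOnOpen = failOnOpen; }
    }

    /** Hypothetical store-level scanner that registers itself once opened. */
    static class StoreLevelScanner implements AutoCloseable {
        private final Store store;

        StoreLevelScanner(Store store) throws IOException {
            if (store.failOnOpen) {
                throw new IOException("simulated failure opening store");
            }
            this.store = store;
            store.observers.add(this);
        }

        @Override
        public void close() { store.observers.remove(this); }
    }

    /** Hypothetical region-level scanner that cleans up after a partial open. */
    static final class RegionLevelScanner implements AutoCloseable {
        private final List<StoreLevelScanner> opened = new ArrayList<>();

        RegionLevelScanner(List<Store> stores) throws IOException {
            try {
                for (Store s : stores) {
                    opened.add(new StoreLevelScanner(s));
                }
            } catch (IOException | RuntimeException e) {
                close();   // release the scanners that were opened before the failure
                throw e;
            }
        }

        @Override
        public void close() {
            for (StoreLevelScanner s : opened) {
                s.close();
            }
            opened.clear();
        }
    }

    /** Opens a healthy store followed by a failing one; returns leftover observers. */
    static int totalObserversAfterFailedOpen() {
        Store ok = new Store(false);
        Store bad = new Store(true);
        try {
            new RegionLevelScanner(Arrays.asList(ok, bad));
        } catch (IOException e) {
            // Expected in this demo.
        }
        return ok.observers.size() + bad.observers.size();
    }

    public static void main(String[] args) {
        System.out.println("observers left: " + totalObserversAfterFailedOpen());
    }
}
```

Without the catch block in {{RegionLevelScanner}}, the scanner for the healthy store would remain registered even though the caller never received a region-level scanner to close.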

    Possible memory leak in StoreScanner
    ------------------------------------

    Key: HBASE-16032
    URL: https://issues.apache.org/jira/browse/HBASE-16032
    Project: HBase
    Issue Type: Bug
    Affects Versions: 1.2.1, 1.1.5, 0.98.20
    Reporter: Yu Li
    Assignee: Yu Li
    Fix For: 2.0.0, 1.3.0, 1.2.2, 1.1.6, 0.98.21

    Attachments: HBASE-16032.patch, HBASE-16032_v2.patch


    We observed frequent full GCs on the RegionServers (RS) in our production environment. After analyzing the heap dump, we found that HStore#changedReaderObservers occupied a large amount of memory; the map surprisingly contained 7500w (75 million) objects...
    After some debugging, I located a possible memory leak in the StoreScanner constructor:
    {code}
    public StoreScanner(Store store, ScanInfo scanInfo, Scan scan, final NavigableSet<byte[]> columns,
        long readPt)
    throws IOException {
      this(store, scan, scanInfo, columns, readPt, scan.getCacheBlocks());
      if (columns != null && scan.isRaw()) {
        throw new DoNotRetryIOException("Cannot specify any column for a raw scan");
      }
      matcher = new ScanQueryMatcher(scan, scanInfo, columns,
          ScanType.USER_SCAN, Long.MAX_VALUE, HConstants.LATEST_TIMESTAMP,
          oldestUnexpiredTS, now, store.getCoprocessorHost());

      this.store.addChangedReaderObserver(this);

      // Pass columns to try to filter out unnecessary StoreFiles.
      List<KeyValueScanner> scanners = getScannersNoCompaction();
      ...
      seekScanners(scanners, matcher.getStartKey(), explicitColumnQuery
          && lazySeekEnabledGlobally, parallelSeekEnabled);
      ...
      resetKVHeap(scanners, store.getComparator());
    }
    {code}
    If any exception is thrown after {{this.store.addChangedReaderObserver(this)}}, the constructor never returns, so the caller's scanner reference stays null and there is no chance to remove the scanner from changedReaderObservers, as in {{HRegion#get}}:
    {code}
    RegionScanner scanner = null;
    try {
      scanner = getScanner(scan);
      scanner.next(results);
    } finally {
      if (scanner != null)
        scanner.close();
    }
    {code}
    What's more, any exception thrown in the {{HRegion#getScanner}} path leaves scanner == null and therefore leaks memory, so we also need to handle this part.
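
    The leak described above can be illustrated with a minimal, self-contained sketch. This is not the attached patch: SketchStore and SketchScanner are hypothetical stand-ins for HStore and StoreScanner, and the failDuringSeek flag only simulates a failure in the work that follows observer registration. One possible shape of a fix is to unregister inside the constructor when that work throws, since the caller never gets a reference it could close.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for HStore: it only tracks registered observers.
class SketchStore {
    final List<Object> changedReaderObservers = new ArrayList<>();

    void addChangedReaderObserver(Object o) { changedReaderObservers.add(o); }

    void deleteChangedReaderObserver(Object o) { changedReaderObservers.remove(o); }
}

// Hypothetical stand-in for StoreScanner.
class SketchScanner {
    private final SketchStore store;

    SketchScanner(SketchStore store, boolean failDuringSeek) throws IOException {
        this.store = store;
        this.store.addChangedReaderObserver(this);
        try {
            // Stands in for the scanner lookup, seek, and heap reset steps.
            if (failDuringSeek) {
                throw new IOException("simulated seek failure");
            }
        } catch (IOException | RuntimeException e) {
            // The caller will never see this instance, so clean up here.
            this.store.deleteChangedReaderObserver(this);
            throw e;
        }
    }

    void close() { store.deleteChangedReaderObserver(this); }
}

class LeakDemo {
    public static void main(String[] args) {
        SketchStore store = new SketchStore();
        try {
            new SketchScanner(store, true);
        } catch (IOException e) {
            // Expected: construction failed, nothing to close on the caller side.
        }
        System.out.println("observers after failed construction: "
            + store.changedReaderObservers.size());
    }
}
```

    Without the catch block in the constructor, the failed instance would stay in changedReaderObservers for the lifetime of the store, which matches the growth seen in the heap dump.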


    --
    This message was sent by Atlassian JIRA
    (v6.3.4#6332)
  • Mikhail Antonov (JIRA) at Jun 16, 2016 at 4:26 am
    [ https://issues.apache.org/jira/browse/HBASE-16032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Mikhail Antonov updated HBASE-16032:
    ------------------------------------
         Fix Version/s: (was: 1.3.0)
                        1.3.1
  • Guanghao Zhang (JIRA) at Jun 16, 2016 at 4:44 am
    [ https://issues.apache.org/jira/browse/HBASE-16032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15333071#comment-15333071 ]

    Guanghao Zhang commented on HBASE-16032:
    ----------------------------------------

    HBASE-16012 also adds handling for exceptions thrown when initializing the region scanner.
  • Yu Li (JIRA) at Jun 16, 2016 at 5:12 am
    [ https://issues.apache.org/jira/browse/HBASE-16032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15333110#comment-15333110 ]

    Yu Li commented on HBASE-16032:
    -------------------------------

    Thanks for the reference [~zghaobac]. I will rebase if HBASE-16012 goes in first, and vice versa I guess. Feel free to let me know if you have any other concerns here.

    One thing to mention: {{this.filter.isFamilyEssential}} can also throw an IOE, in which case neither {{scanners}} nor {{joinedScanners}} includes the already-initialized scanner, and {{HRegion#handleException}}, introduced in HBASE-16012, does not seem to take this into account.
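
    To make the {{isFamilyEssential}} case concrete, here is a hedged sketch of one way to avoid it: record every scanner the moment it is created, before any call that can throw, and close everything recorded on failure. All names below (TrackedScanner, EssentialFilter, ScannerInit, the instantiated list) are hypothetical stand-ins, not the HBase classes or the actual patch.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a per-store scanner.
class TrackedScanner {
    final String family;
    boolean closed;

    TrackedScanner(String family) { this.family = family; }

    void close() { closed = true; }
}

// Hypothetical stand-in for the filter check, which may throw an IOException.
interface EssentialFilter {
    boolean isFamilyEssential(String family) throws IOException;
}

class ScannerInit {
    // Sorts new scanners into scanners/joinedScanners, closing all of them on failure.
    static void initialize(List<String> families, EssentialFilter filter,
            List<TrackedScanner> scanners, List<TrackedScanner> joinedScanners,
            List<TrackedScanner> instantiated) throws IOException {
        try {
            for (String family : families) {
                TrackedScanner scanner = new TrackedScanner(family);
                // Record first: the filter call below may throw before the
                // scanner reaches either scanners or joinedScanners.
                instantiated.add(scanner);
                if (filter.isFamilyEssential(family)) {
                    scanners.add(scanner);
                } else {
                    joinedScanners.add(scanner);
                }
            }
        } catch (IOException | RuntimeException e) {
            for (TrackedScanner scanner : instantiated) {
                scanner.close();
            }
            throw e;
        }
    }
}

class ScannerInitDemo {
    public static void main(String[] args) {
        List<TrackedScanner> instantiated = new ArrayList<>();
        try {
            ScannerInit.initialize(Arrays.asList("cf1", "cf2"),
                family -> { throw new IOException("simulated filter failure"); },
                new ArrayList<>(), new ArrayList<>(), instantiated);
        } catch (IOException e) {
            System.out.println("closed on failure: " + instantiated.get(0).closed);
        }
    }
}
```

    Closing only the members of scanners and joinedScanners would miss the scanner whose filter check threw, which is the gap pointed out above.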
  • Guanghao Zhang (JIRA) at Jun 16, 2016 at 5:22 am
    [ https://issues.apache.org/jira/browse/HBASE-16032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15333121#comment-15333121 ]

    Guanghao Zhang commented on HBASE-16032:
    ----------------------------------------

    Nice catch. What do you think about the additionalScanners? I didn't see any use of them, and I'm not sure whether we should close these additionalScanners.
  • Yu Li (JIRA) at Jun 16, 2016 at 5:39 am
    [ https://issues.apache.org/jira/browse/HBASE-16032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15333145#comment-15333145 ]

    Yu Li commented on HBASE-16032:
    -------------------------------

    I think we should handle the {{additionalScanners}}, so we won't suffer if a newly added invocation passes in non-null values. Thanks for the reminder; I'll add this to the next patch.
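
    As a rough illustration of that point, the cleanup path could treat a possibly-null list of additional scanners the same way as the scanners created internally. The sketch below assumes that ownership of the additional scanners passes to the region scanner, which is an assumption rather than something confirmed in this thread; ExtraScanner and ScannerCleanup are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a scanner handed in by the caller.
class ExtraScanner {
    boolean closed;

    void close() { closed = true; }
}

class ScannerCleanup {
    // Closes internally created scanners plus any additional ones.
    // The additional list may be null, as it is for callers that pass nothing.
    static int closeAll(List<ExtraScanner> instantiated, List<ExtraScanner> additional) {
        List<ExtraScanner> all = new ArrayList<>(instantiated);
        if (additional != null) {
            all.addAll(additional);
        }
        int closedCount = 0;
        for (ExtraScanner scanner : all) {
            if (scanner != null && !scanner.closed) {
                scanner.close();
                closedCount++;
            }
        }
        return closedCount;
    }
}

class ScannerCleanupDemo {
    public static void main(String[] args) {
        List<ExtraScanner> instantiated = Arrays.asList(new ExtraScanner());
        // Passing null mirrors callers that supply no additional scanners.
        System.out.println("closed: " + ScannerCleanup.closeAll(instantiated, null));
    }
}
```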
  • Yu Li (JIRA) at Jun 16, 2016 at 5:43 am
    [ https://issues.apache.org/jira/browse/HBASE-16032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Yu Li updated HBASE-16032:
    --------------------------
         Attachment: HBASE-16032_v3.patch
  • Hadoop QA (JIRA) at Jun 16, 2016 at 5:49 am
    [ https://issues.apache.org/jira/browse/HBASE-16032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15333156#comment-15333156 ]

    Hadoop QA commented on HBASE-16032:
    -----------------------------------
    (x) *-1 overall*

    || Vote || Subsystem || Runtime || Comment ||
    | -1 | pre-patch | 0m 0s | JAVA_HOME is not defined. |

    || Subsystem || Report/Notes ||
    | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12810997/HBASE-16032_v3.patch |
    | JIRA Issue | HBASE-16032 |
    | Optional Tests | asflicense javac javadoc unit findbugs hadoopcheck hbaseanti checkstyle compile |
    | uname | Linux pietas.apache.org 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
    | Build tool | maven |
    | Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
    | git revision | master / f19f1d9 |
    | Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/2245/console |
    | Powered by | Apache Yetus 0.2.1 http://yetus.apache.org |

    This message was automatically generated.


  • Yu Li (JIRA) at Jun 16, 2016 at 5:55 am
    [ https://issues.apache.org/jira/browse/HBASE-16032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15333164#comment-15333164 ]

    Yu Li commented on HBASE-16032:
    -------------------------------

    It seems Jenkins has some problem now? Mind taking a look, sir? [~busbey]
  • Hadoop QA (JIRA) at Jun 16, 2016 at 5:55 am
    [ https://issues.apache.org/jira/browse/HBASE-16032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15333163#comment-15333163 ]

    Hadoop QA commented on HBASE-16032:
    -----------------------------------
    (x) *-1 overall*

    || Vote || Subsystem || Runtime || Comment ||
    | +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
    | +1 | @author | 0m 0s | The patch does not contain any @author tags. |
    | -1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
    | +1 | mvninstall | 2m 58s | master passed |
    | +1 | compile | 0m 47s | master passed with JDK v1.8.0 |
    | +1 | compile | 0m 33s | master passed with JDK v1.7.0_79 |
    | +1 | checkstyle | 0m 54s | master passed |
    | +1 | mvneclipse | 0m 15s | master passed |
    | +1 | findbugs | 2m 2s | master passed |
    | +1 | javadoc | 0m 28s | master passed with JDK v1.8.0 |
    | +1 | javadoc | 0m 35s | master passed with JDK v1.7.0_79 |
    | +1 | mvninstall | 0m 48s | the patch passed |
    | +1 | compile | 0m 42s | the patch passed with JDK v1.8.0 |
    | +1 | javac | 0m 42s | the patch passed |
    | +1 | compile | 0m 38s | the patch passed with JDK v1.7.0_79 |
    | +1 | javac | 0m 38s | the patch passed |
    | +1 | checkstyle | 1m 1s | the patch passed |
    | +1 | mvneclipse | 0m 18s | the patch passed |
    | +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
    | +1 | hadoopcheck | 27m 50s | Patch does not cause any errors with Hadoop 2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. |
    | +1 | findbugs | 2m 11s | the patch passed |
    | +1 | javadoc | 0m 26s | the patch passed with JDK v1.8.0 |
    | +1 | javadoc | 0m 33s | the patch passed with JDK v1.7.0_79 |
    | +1 | unit | 80m 50s | hbase-server in the patch passed. |
    | +1 | asflicense | 0m 14s | Patch does not generate ASF License warnings. |
    | | | 124m 36s | |

    || Subsystem || Report/Notes ||
    | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12810979/HBASE-16032_v2.patch |
    | JIRA Issue | HBASE-16032 |
    | Optional Tests | asflicense javac javadoc unit findbugs hadoopcheck hbaseanti checkstyle compile |
    | uname | Linux asf900.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
    | Build tool | maven |
    | Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
    | git revision | master / 158568e |
    | Default Java | 1.7.0_79 |
    | Multi-JDK versions | /home/jenkins/tools/java/jdk1.8.0:1.8.0 /usr/local/jenkins/java/jdk1.7.0_79:1.7.0_79 |
    | findbugs | v3.0.0 |
    | Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/2240/testReport/ |
    | modules | C: hbase-server U: hbase-server |
    | Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/2240/console |
    | Powered by | Apache Yetus 0.2.1 http://yetus.apache.org |

    This message was automatically generated.


    Possible memory leak in StoreScanner
    ------------------------------------

    Key: HBASE-16032
    URL: https://issues.apache.org/jira/browse/HBASE-16032
    Project: HBase
    Issue Type: Bug
    Affects Versions: 1.2.1, 1.1.5, 0.98.20
    Reporter: Yu Li
    Assignee: Yu Li
    Fix For: 2.0.0, 1.2.2, 1.1.6, 1.3.1, 0.98.21

    Attachments: HBASE-16032.patch, HBASE-16032_v2.patch, HBASE-16032_v3.patch


    We observed frequent fullGC of RS in our production environment, and after analyzing the heapdump, we found large memory occupancy by HStore#changedReaderObservers, the map is surprisingly containing 7500w objects...
    After some debugging, I located some possible memory leak in StoreScanner constructor:
    {code}
    public StoreScanner(Store store, ScanInfo scanInfo, Scan scan, final NavigableSet<byte[]> columns,
        long readPt)
    throws IOException {
      this(store, scan, scanInfo, columns, readPt, scan.getCacheBlocks());
      if (columns != null && scan.isRaw()) {
        throw new DoNotRetryIOException("Cannot specify any column for a raw scan");
      }
      matcher = new ScanQueryMatcher(scan, scanInfo, columns,
          ScanType.USER_SCAN, Long.MAX_VALUE, HConstants.LATEST_TIMESTAMP,
          oldestUnexpiredTS, now, store.getCoprocessorHost());

      this.store.addChangedReaderObserver(this);

      // Pass columns to try to filter out unnecessary StoreFiles.
      List<KeyValueScanner> scanners = getScannersNoCompaction();
      ...
      seekScanners(scanners, matcher.getStartKey(), explicitColumnQuery
          && lazySeekEnabledGlobally, parallelSeekEnabled);
      ...
      resetKVHeap(scanners, store.getComparator());
    }
    {code}
    If any exception is thrown after {{this.store.addChangedReaderObserver(this)}}, the caller never receives the scanner (its reference stays null), so there is no chance to remove the scanner from changedReaderObservers, as in {{HRegion#get}}:
    {code}
    RegionScanner scanner = null;
    try {
      scanner = getScanner(scan);
      scanner.next(results);
    } finally {
      if (scanner != null)
        scanner.close();
    }
    {code}
    What's more, any exception thrown in the {{HRegion#getScanner}} path leaves scanner == null and therefore leaks memory, so we also need to handle that path.
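
The leak pattern described above can be reproduced in miniature. The sketch below is illustrative only: {{DemoStore}}, {{LeakyScanner}} and {{SafeScanner}} are hypothetical stand-ins, not HBase classes, and the method names on {{DemoStore}} are chosen for this demo. It shows one possible remedy (deregistering inside the constructor when later initialization fails); the attached patches may take a different approach.

```java
import java.io.IOException;
import java.util.Collections;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class ObserverLeakSketch {

  /** Hypothetical stand-in for the store: it only tracks registered observers. */
  static class DemoStore {
    private final Set<Object> observers =
        Collections.newSetFromMap(new ConcurrentHashMap<Object, Boolean>());

    void addChangedReaderObserver(Object o) { observers.add(o); }
    void deleteChangedReaderObserver(Object o) { observers.remove(o); }
    int observerCount() { return observers.size(); }
  }

  /** Registers itself, then fails: the registration is never undone. */
  static class LeakyScanner {
    LeakyScanner(DemoStore store, boolean failDuringSeek) throws IOException {
      store.addChangedReaderObserver(this);
      if (failDuringSeek) {
        throw new IOException("simulated seek failure");
      }
    }
  }

  /** Same flow, but undoes the registration before rethrowing. */
  static class SafeScanner {
    SafeScanner(DemoStore store, boolean failDuringSeek) throws IOException {
      store.addChangedReaderObserver(this);
      try {
        if (failDuringSeek) {
          throw new IOException("simulated seek failure");
        }
      } catch (IOException e) {
        // The caller will never see this object, so clean up here.
        store.deleteChangedReaderObserver(this);
        throw e;
      }
    }
  }

  /** Returns how many observers remain registered after a failed construction. */
  static int observersAfterFailure(boolean safe) {
    DemoStore store = new DemoStore();
    try {
      if (safe) {
        new SafeScanner(store, true);
      } else {
        new LeakyScanner(store, true);
      }
    } catch (IOException expected) {
      // No scanner reference was returned, so the caller cannot call close().
    }
    return store.observerCount();
  }

  public static void main(String[] args) {
    System.out.println("leaky: " + observersAfterFailure(false));
    System.out.println("safe:  " + observersAfterFailure(true));
  }
}
```

In this toy setup the leaky variant leaves one observer behind per failed construction, which is how the set can grow without bound under a steady stream of failing scans.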


    --
    This message was sent by Atlassian JIRA
    (v6.3.4#6332)
  • Yu Li (JIRA) at Jun 16, 2016 at 5:57 am
    [ https://issues.apache.org/jira/browse/HBASE-16032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15333169#comment-15333169 ]

    Yu Li commented on HBASE-16032:
    -------------------------------

    I guess some configured server is missing the JAVA_HOME setting.
  • Mikhail Antonov (JIRA) at Jun 16, 2016 at 9:10 am
    [ https://issues.apache.org/jira/browse/HBASE-16032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15333413#comment-15333413 ]

    Mikhail Antonov commented on HBASE-16032:
    -----------------------------------------

    I changed the fix version to 1.3.1; stabilizing 1.3.0 now.
  • Yu Li (JIRA) at Jun 16, 2016 at 11:44 am
    [ https://issues.apache.org/jira/browse/HBASE-16032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15333621#comment-15333621 ]

    Yu Li commented on HBASE-16032:
    -------------------------------

    Ok, thanks for the message [~mantonov]
  • Sean Busbey (JIRA) at Jun 16, 2016 at 12:56 pm
    [ https://issues.apache.org/jira/browse/HBASE-16032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Sean Busbey updated HBASE-16032:
    --------------------------------
         Fix Version/s: (was: 1.2.2)
                        1.2.3
  • Hadoop QA (JIRA) at Jun 16, 2016 at 1:03 pm
    [ https://issues.apache.org/jira/browse/HBASE-16032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15333726#comment-15333726 ]

    Hadoop QA commented on HBASE-16032:
    -----------------------------------
    (x) *{color:red}-1 overall{color}* |
    \\
    \\
    Vote || Subsystem || Runtime || Comment ||
    {color:red}-1{color} | {color:red} pre-patch {color} | {color:red} 0m 0s {color} | {color:red} JAVA_HOME is not defined. {color} |
    \\
    \\
    Subsystem || Report/Notes ||
    JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12810997/HBASE-16032_v3.patch |
    JIRA Issue | HBASE-16032 |
    Optional Tests | asflicense javac javadoc unit findbugs hadoopcheck hbaseanti checkstyle compile |
    uname | Linux asf910.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
    Build tool | maven |
    Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
    git revision | master / 6d02f36 |
    Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/2252/console |
    Powered by | Apache Yetus 0.2.1 http://yetus.apache.org |

    This message was automatically generated.


  • Sean Busbey (JIRA) at Jun 16, 2016 at 1:03 pm
    [ https://issues.apache.org/jira/browse/HBASE-16032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15333729#comment-15333729 ]

    Sean Busbey commented on HBASE-16032:
    -------------------------------------

    ...never mind, I see that it did.
  • Sean Busbey (JIRA) at Jun 16, 2016 at 1:03 pm
    [ https://issues.apache.org/jira/browse/HBASE-16032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15333727#comment-15333727 ]

    Sean Busbey commented on HBASE-16032:
    -------------------------------------

    looks like the worker node ubuntu-1 didn't have JDK 1.7 where we expect it. I had jenkins export a jdk1.7 variable and switched to that.

    I had it restart the test ([#2252|https://builds.apache.org/job/PreCommit-HBASE-Build/2252/]), which should pick up v3 again. let me know if it fails.
  • Sean Busbey (JIRA) at Jun 16, 2016 at 1:23 pm
    [ https://issues.apache.org/jira/browse/HBASE-16032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15333750#comment-15333750 ]

    Sean Busbey commented on HBASE-16032:
    -------------------------------------

    okay, [#2256|https://builds.apache.org/job/PreCommit-HBASE-Build/2256/] is going now. I really get HBASE-15882 taken care of.
  • Sean Busbey (JIRA) at Jun 16, 2016 at 1:23 pm
    [ https://issues.apache.org/jira/browse/HBASE-16032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15333750#comment-15333750 ]

    Sean Busbey edited comment on HBASE-16032 at 6/16/16 1:22 PM:
    --------------------------------------------------------------

    okay, [#2256|https://builds.apache.org/job/PreCommit-HBASE-Build/2256/] is going now. I really need to get HBASE-15882 taken care of.


    was (Author: busbey):
    okay, [#2256|https://builds.apache.org/job/PreCommit-HBASE-Build/2256/] is going now. I really get HBASE-15882 taken care of.
  • Hadoop QA (JIRA) at Jun 16, 2016 at 3:32 pm
    [ https://issues.apache.org/jira/browse/HBASE-16032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15334000#comment-15334000 ]

    Hadoop QA commented on HBASE-16032:
    -----------------------------------
    (x) *{color:red}-1 overall{color}* |
    \\
    \\
    Vote || Subsystem || Runtime || Comment ||
    {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s {color} | {color:green} Patch does not have any anti-patterns. {color} |
    {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} |
    {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s {color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
    {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 57s {color} | {color:green} master passed {color} |
    {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s {color} | {color:green} master passed with JDK v1.8.0 {color} |
    {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 37s {color} | {color:green} master passed with JDK v1.7.0_80 {color} |
    {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 58s {color} | {color:green} master passed {color} |
    {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 16s {color} | {color:green} master passed {color} |
    {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 0s {color} | {color:green} master passed {color} |
    {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 28s {color} | {color:green} master passed with JDK v1.8.0 {color} |
    {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 35s {color} | {color:green} master passed with JDK v1.7.0_80 {color} |
    {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 46s {color} | {color:green} the patch passed {color} |
    {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 39s {color} | {color:green} the patch passed with JDK v1.8.0 {color} |
    {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 39s {color} | {color:green} the patch passed {color} |
    {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 35s {color} | {color:green} the patch passed with JDK v1.7.0_80 {color} |
    {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 35s {color} | {color:green} the patch passed {color} |
    {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 54s {color} | {color:green} the patch passed {color} |
    {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 16s {color} | {color:green} the patch passed {color} |
    {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} |
    {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 29m 5s {color} | {color:green} Patch does not cause any errors with Hadoop 2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. {color} |
    {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 32s {color} | {color:green} the patch passed {color} |
    {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 38s {color} | {color:green} the patch passed with JDK v1.8.0 {color} |
    {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 39s {color} | {color:green} the patch passed with JDK v1.7.0_80 {color} |
    {color:green}+1{color} | {color:green} unit {color} | {color:green} 85m 43s {color} | {color:green} hbase-server in the patch passed. {color} |
    {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 16s {color} | {color:green} Patch does not generate ASF License warnings. {color} |
    {color:black}{color} | {color:black} {color} | {color:black} 130m 56s {color} | {color:black} {color} |
    \\
    \\
    Subsystem || Report/Notes ||
    JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12810997/HBASE-16032_v3.patch |
    JIRA Issue | HBASE-16032 |
    Optional Tests | asflicense javac javadoc unit findbugs hadoopcheck hbaseanti checkstyle compile |
    uname | Linux asf900.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
    Build tool | maven |
    Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build@2/component/dev-support/hbase-personality.sh |
    git revision | master / 6d02f36 |
    Default Java | 1.7.0_80 |
    Multi-JDK versions | /home/jenkins/tools/java/jdk1.8.0:1.8.0 /home/jenkins/jenkins-slave/tools/hudson.model.JDK/JDK_1.7_latest_:1.7.0_80 |
    findbugs | v3.0.0 |
    Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/2256/testReport/ |
    modules | C: hbase-server U: hbase-server |
    Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/2256/console |
    Powered by | Apache Yetus 0.2.1 http://yetus.apache.org |

    This message was automatically generated.


    Possible memory leak in StoreScanner
    ------------------------------------

    Key: HBASE-16032
    URL: https://issues.apache.org/jira/browse/HBASE-16032
    Project: HBase
    Issue Type: Bug
    Affects Versions: 1.2.1, 1.1.5, 0.98.20
    Reporter: Yu Li
    Assignee: Yu Li
    Fix For: 2.0.0, 1.1.6, 1.3.1, 0.98.21, 1.2.3

    Attachments: HBASE-16032.patch, HBASE-16032_v2.patch, HBASE-16032_v3.patch


    We observed frequent full GCs on the RegionServers (RS) in our production environment. After analyzing the heap dump, we found that HStore#changedReaderObservers occupied a large amount of memory: the map surprisingly contained 7500w (75 million) objects.
    After some debugging, I located a possible memory leak in the StoreScanner constructor:
    {code}
    public StoreScanner(Store store, ScanInfo scanInfo, Scan scan, final NavigableSet<byte[]> columns,
        long readPt)
    throws IOException {
      this(store, scan, scanInfo, columns, readPt, scan.getCacheBlocks());
      if (columns != null && scan.isRaw()) {
        throw new DoNotRetryIOException("Cannot specify any column for a raw scan");
      }
      matcher = new ScanQueryMatcher(scan, scanInfo, columns,
          ScanType.USER_SCAN, Long.MAX_VALUE, HConstants.LATEST_TIMESTAMP,
          oldestUnexpiredTS, now, store.getCoprocessorHost());

      this.store.addChangedReaderObserver(this);

      // Pass columns to try to filter out unnecessary StoreFiles.
      List<KeyValueScanner> scanners = getScannersNoCompaction();
      ...
      seekScanners(scanners, matcher.getStartKey(), explicitColumnQuery
          && lazySeekEnabledGlobally, parallelSeekEnabled);
      ...
      resetKVHeap(scanners, store.getComparator());
    }
    {code}
    If any exception is thrown after {{this.store.addChangedReaderObserver(this)}}, the constructor never returns, so the caller's scanner reference stays null and there is no chance to remove the scanner from changedReaderObservers, as in {{HRegion#get}}:
    {code}
    RegionScanner scanner = null;
    try {
      scanner = getScanner(scan);
      scanner.next(results);
    } finally {
      if (scanner != null)
        scanner.close();
    }
    {code}
    What's more, any exception thrown in the {{HRegion#getScanner}} path leaves scanner == null and leaks memory in the same way, so we also need to handle this part.
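
To make the failure mode concrete, here is a minimal, self-contained sketch of the pattern described above. It is not HBase code: `ToyStore`, `ToyScanner`, and `leakedAfter` are hypothetical stand-ins, and the simulated `IOException` only models a failure somewhere after the observer has been registered. It assumes Java 8 or later.

```java
import java.io.IOException;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Toy model of the leak; none of these classes are HBase classes. */
public class ObserverLeakSketch {

  /** Stand-in for HStore: it only tracks registered observers. */
  static class ToyStore {
    final Set<Object> changedReaderObservers = ConcurrentHashMap.newKeySet();

    void addChangedReaderObserver(Object o) {
      changedReaderObservers.add(o);
    }

    void deleteChangedReaderObserver(Object o) {
      changedReaderObservers.remove(o);
    }
  }

  /** Stand-in for StoreScanner: registers itself, then may fail. */
  static class ToyScanner {
    final ToyStore store;

    ToyScanner(ToyStore store, boolean failDuringInit) throws IOException {
      this.store = store;
      store.addChangedReaderObserver(this); // 'this' escapes to the store here
      if (failDuringInit) {
        // Models a failure in a later initialization step.
        throw new IOException("simulated failure during initialization");
      }
    }

    void close() {
      store.deleteChangedReaderObserver(this);
    }
  }

  /** Mirrors the caller pattern quoted above; returns how many observers remain. */
  static int leakedAfter(int attempts) {
    ToyStore store = new ToyStore();
    for (int i = 0; i < attempts; i++) {
      ToyScanner scanner = null;
      try {
        scanner = new ToyScanner(store, true);
      } catch (IOException e) {
        // The request fails, but the store still references the scanner.
      } finally {
        if (scanner != null) {
          scanner.close(); // not reached: the constructor threw, scanner is null
        }
      }
    }
    return store.changedReaderObservers.size();
  }

  public static void main(String[] args) {
    System.out.println("leaked observers: " + leakedAfter(1000));
  }
}
```

Every failed construction leaves one entry behind, because the only code that could remove it needs a reference the caller never obtained.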


    --
    This message was sent by Atlassian JIRA
    (v6.3.4#6332)
  • Ted Yu (JIRA) at Jun 17, 2016 at 3:17 am
    [ https://issues.apache.org/jira/browse/HBASE-16032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15335299#comment-15335299 ]

    Ted Yu commented on HBASE-16032:
    --------------------------------

    Patch v3 looks good.
  • Yu Li (JIRA) at Jun 17, 2016 at 5:15 am
    [ https://issues.apache.org/jira/browse/HBASE-16032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15335429#comment-15335429 ]

    Yu Li commented on HBASE-16032:
    -------------------------------

    Thanks [~busbey]!
  • Yu Li (JIRA) at Jun 17, 2016 at 5:17 am
    [ https://issues.apache.org/jira/browse/HBASE-16032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15335432#comment-15335432 ]

    Yu Li commented on HBASE-16032:
    -------------------------------

    Thanks for the review [~ted_yu], will commit soon if there are no objections.
  • Enis Soztutar (JIRA) at Jun 18, 2016 at 12:55 am
    [ https://issues.apache.org/jira/browse/HBASE-16032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15337365#comment-15337365 ]

    Enis Soztutar commented on HBASE-16032:
    ---------------------------------------

    With the following:
    {code}
      + // add observer at last to avoid memory leak when exception occurs during initialization
    {code}

    What happens if a flush/compaction comes in between the heap reset and this? Will we miss the new files or not?
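
The concern can be illustrated with a small, single-threaded simulation. This is only a sketch under stated assumptions: `ToyStore`, `ToyScanner`, `flush`, and `filesSeen` are hypothetical names, the file list stands in for the scanner's heap, and the `Runnable` simply forces a flush to land between building the heap and registering the observer.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of registering an observer too late; not HBase code. */
public class LateRegistrationSketch {

  interface Observer {
    void updateReaders(List<String> files);
  }

  /** Stand-in for a store that notifies observers when its file set changes. */
  static class ToyStore {
    final List<String> files = new ArrayList<>();
    final List<Observer> observers = new ArrayList<>();

    void addObserver(Observer o) {
      observers.add(o);
    }

    /** Models a flush: a new file appears and current observers are told. */
    void flush(String newFile) {
      files.add(newFile);
      List<String> snapshot = new ArrayList<>(files);
      for (Observer o : observers) {
        o.updateReaders(snapshot);
      }
    }
  }

  /** Stand-in for a scanner that snapshots the store's files at construction. */
  static class ToyScanner implements Observer {
    List<String> visibleFiles;

    ToyScanner(ToyStore store, boolean registerFirst, Runnable betweenSteps) {
      if (registerFirst) {
        store.addObserver(this);
      }
      visibleFiles = new ArrayList<>(store.files); // models building the heap
      betweenSteps.run();                          // a flush lands here
      if (!registerFirst) {
        store.addObserver(this);                   // too late for that flush
      }
    }

    @Override
    public void updateReaders(List<String> files) {
      visibleFiles = files;
    }
  }

  /** Returns the files the scanner knows about after one interleaved flush. */
  static List<String> filesSeen(boolean registerFirst) {
    ToyStore store = new ToyStore();
    store.files.add("file-1");
    ToyScanner scanner =
        new ToyScanner(store, registerFirst, () -> store.flush("file-2"));
    return scanner.visibleFiles;
  }

  public static void main(String[] args) {
    System.out.println("register first: " + filesSeen(true));
    System.out.println("register last:  " + filesSeen(false));
  }
}
```

With late registration the scanner keeps working from a file list that no longer matches the store, which is the "miss the new files" case raised in the question.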
  • Yu Li (JIRA) at Jun 18, 2016 at 9:43 am
    [ https://issues.apache.org/jira/browse/HBASE-16032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15337686#comment-15337686 ]

    Yu Li commented on HBASE-16032:
    -------------------------------

    Yes, this is something I neglected; thanks for pointing it out, [~enis]. I will update the patch soon and wait for your +1 before committing.
  • Yu Li (JIRA) at Jun 18, 2016 at 9:59 am
    [ https://issues.apache.org/jira/browse/HBASE-16032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Yu Li updated HBASE-16032:
    --------------------------
         Attachment: HBASE-16032_v4.patch

    Updated the patch to resolve the issue Enis pointed out: rolled back to the try-catch approach in the StoreScanner constructor, to avoid missing the StoreFile update in resetKVHeap when a flush/bulkload happens.
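
For readers who have not opened the attachment, the general shape of a try-catch approach can be sketched as follows. This is not the content of HBASE-16032_v4.patch; it is a toy illustration in which `ToyStore`, `ToyScanner`, `initialize`, and `observersAfterFailures` are hypothetical names. The observer is still registered early, so no update is missed, and it is removed again if the rest of the constructor fails.

```java
import java.io.IOException;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Toy model of "register early, unregister on failure"; not HBase code. */
public class TryCatchCleanupSketch {

  /** Stand-in for a store that tracks registered observers. */
  static class ToyStore {
    final Set<Object> changedReaderObservers = ConcurrentHashMap.newKeySet();

    void addChangedReaderObserver(Object o) {
      changedReaderObservers.add(o);
    }

    void deleteChangedReaderObserver(Object o) {
      changedReaderObservers.remove(o);
    }
  }

  /** Stand-in for a scanner whose constructor cleans up after itself. */
  static class ToyScanner {
    final ToyStore store;

    ToyScanner(ToyStore store, boolean failDuringInit) throws IOException {
      this.store = store;
      store.addChangedReaderObserver(this); // register first so no update is missed
      try {
        initialize(failDuringInit);
      } catch (IOException | RuntimeException e) {
        // Undo the registration before the exception reaches the caller.
        store.deleteChangedReaderObserver(this);
        throw e;
      }
    }

    /** Models the remaining initialization steps, which may fail. */
    private void initialize(boolean fail) throws IOException {
      if (fail) {
        throw new IOException("simulated failure during initialization");
      }
    }

    void close() {
      store.deleteChangedReaderObserver(this);
    }
  }

  /** Returns how many observers remain after repeated failed constructions. */
  static int observersAfterFailures(int attempts) {
    ToyStore store = new ToyStore();
    for (int i = 0; i < attempts; i++) {
      try {
        new ToyScanner(store, true);
      } catch (IOException e) {
        // Expected: construction failed, registration was already undone.
      }
    }
    return store.changedReaderObservers.size();
  }

  public static void main(String[] args) {
    System.out.println("observers left: " + observersAfterFailures(1000));
  }
}
```

The trade-off is that cleanup lives inside the constructor rather than relying on the caller, which never sees the half-built object.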
  • Hadoop QA (JIRA) at Jun 18, 2016 at 11:54 am
    [ https://issues.apache.org/jira/browse/HBASE-16032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15337786#comment-15337786 ]

    Hadoop QA commented on HBASE-16032:
    -----------------------------------
    (x) *{color:red}-1 overall{color}* |
    \\
    \\
    Vote || Subsystem || Runtime || Comment ||
    {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s {color} | {color:green} Patch does not have any anti-patterns. {color} |
    {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} |
    {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s {color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
    {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 58s {color} | {color:green} master passed {color} |
    {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 39s {color} | {color:green} master passed with JDK v1.8.0 {color} |
    {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 34s {color} | {color:green} master passed with JDK v1.7.0_80 {color} |
    {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 51s {color} | {color:green} master passed {color} |
    {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 16s {color} | {color:green} master passed {color} |
    {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 55s {color} | {color:red} hbase-server in master has 1 extant Findbugs warnings. {color} |
    {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 26s {color} | {color:green} master passed with JDK v1.8.0 {color} |
    {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 34s {color} | {color:green} master passed with JDK v1.7.0_80 {color} |
    {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 44s {color} | {color:green} the patch passed {color} |
    {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 37s {color} | {color:green} the patch passed with JDK v1.8.0 {color} |
    {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 37s {color} | {color:green} the patch passed {color} |
    {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 34s {color} | {color:green} the patch passed with JDK v1.7.0_80 {color} |
    {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 34s {color} | {color:green} the patch passed {color} |
    {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 52s {color} | {color:green} the patch passed {color} |
    {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 15s {color} | {color:green} the patch passed {color} |
    {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} |
    {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 26m 6s {color} | {color:green} Patch does not cause any errors with Hadoop 2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. {color} |
    {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 11s {color} | {color:green} the patch passed {color} |
    {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 26s {color} | {color:green} the patch passed with JDK v1.8.0 {color} |
    {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 33s {color} | {color:green} the patch passed with JDK v1.7.0_80 {color} |
    {color:green}+1{color} | {color:green} unit {color} | {color:green} 73m 26s {color} | {color:green} hbase-server in the patch passed. {color} |
    {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 18s {color} | {color:green} Patch does not generate ASF License warnings. {color} |
    {color:black}{color} | {color:black} {color} | {color:black} 114m 38s {color} | {color:black} {color} |
    \\
    \\
    Subsystem || Report/Notes ||
    JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12811577/HBASE-16032_v4.patch |
    JIRA Issue | HBASE-16032 |
    Optional Tests | asflicense javac javadoc unit findbugs hadoopcheck hbaseanti checkstyle compile |
    uname | Linux asf907.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
    Build tool | maven |
    Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
    git revision | master / 6717f0e |
    Default Java | 1.7.0_80 |
    Multi-JDK versions | /home/jenkins/tools/java/jdk1.8.0:1.8.0 /home/jenkins/jenkins-slave/tools/hudson.model.JDK/JDK_1.7_latest_:1.7.0_80 |
    findbugs | v3.0.0 |
    findbugs | https://builds.apache.org/job/PreCommit-HBASE-Build/2291/artifact/patchprocess/branch-findbugs-hbase-server-warnings.html |
    Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/2291/testReport/ |
    modules | C: hbase-server U: hbase-server |
    Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/2291/console |
    Powered by | Apache Yetus 0.2.1 http://yetus.apache.org |

    This message was automatically generated.



Discussion Overview
group: issues @
categories: hbase, hadoop
posted: Jun 15, '16 at 2:51p
active: Jun 18, '16 at 11:54a
posts: 36
users: 1
website: hbase.apache.org
