FAQ
[ https://issues.apache.org/jira/browse/HBASE-69?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12565594#action_12565594 ]

Billy Pearson commented on HBASE-69:
------------------------------------

Currently flushes and compactions work well from what I can tell on my setup and tests.

There are two areas I have concerns about and have not had a chance to test.

1. hlogs: If I have a column family that gets only, say, 1 out of every 100-250 updates, is that region going to hold up the removal of old hlogs while waiting for a flush from that column family? If so, one column family could make recovery take a long time if the region server fails. This is one of the reasons, besides memory usage, that I think we need to leave/add back the optional flusher to flush every 30-60 mins.

2. Splits: If a large region splits in two, compaction starts on reload of the new splits. But say the columns take 50 mins to compact; if in those 50 mins I get enough updates to cause another split, will that split fail if the region has not finished compacting all of the reference files from the original split?

Outside of the above concerns I have not noticed any bugs in the patch while flushing or compacting; all seems OK in that area.
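To make concern 1 concrete, here is a minimal sketch (my own illustration with hypothetical names, not the HBase implementation) of why one rarely-flushed column family can pin every old hlog: a log file can be removed only once every store has flushed past that log's highest sequence id, so the store with the smallest flushed sequence id gates removal of all older logs.

```java
// Illustrative sketch only (not HBase code): the store with the smallest
// last-flushed sequence id determines how many old hlogs can be removed.
public class HlogRetentionSketch {

    /** Count how many of the oldest logs are removable, given each log's max
     *  sequence id (oldest first, ascending) and each store's last-flushed
     *  sequence id. */
    public static int removableLogs(long[] logMaxSeqIds, long[] storeFlushedSeqIds) {
        long minFlushed = Long.MAX_VALUE;
        for (long s : storeFlushedSeqIds) {
            minFlushed = Math.min(minFlushed, s);
        }
        int removable = 0;
        for (long logMax : logMaxSeqIds) {
            if (logMax < minFlushed) {
                removable++;   // every store has flushed past this log
            } else {
                break;         // this log and all newer ones must be kept
            }
        }
        return removable;
    }

    public static void main(String[] args) {
        // Two busy stores flushed far ahead; one rarely-updated store stuck at 50.
        long[] logs = {100, 200, 300};
        assert removableLogs(logs, new long[]{5000, 4800, 50}) == 0;  // idle store pins all logs
        // Once the idle store flushes (sequence id 250), the two oldest logs can go.
        assert removableLogs(logs, new long[]{5000, 4800, 250}) == 2;
        System.out.println("ok");
    }
}
```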

[hbase] Make cache flush triggering less simplistic
---------------------------------------------------

Key: HBASE-69
URL: https://issues.apache.org/jira/browse/HBASE-69
Project: Hadoop HBase
Issue Type: Improvement
Components: regionserver
Reporter: stack
Assignee: Jim Kellerman
Attachments: patch.txt, patch.txt, patch.txt, patch.txt, patch.txt, patch.txt, patch.txt, patch.txt, patch.txt


When the flusher runs -- it's triggered when the sum of all Stores in a Region > a configurable max size -- we flush all Stores even though a Store's memcache might hold but a few bytes.
I would think Stores should only dump their memcache to disk if they have some substance.
The problem becomes more acute, the more families you have in a Region.
Possible behaviors would be to dump only the biggest Store, or only those Stores > 50% of max memcache size. Behavior would vary depending on the prompt that provoked the flush. Would also log why the flush is running: optional or > max size.
This issue comes out of HADOOP-2621.
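A rough sketch (hypothetical names, my own illustration rather than the attached patch) of the two selective-flush policies the description proposes, i.e. choosing which stores to flush instead of flushing all of them:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Illustrative sketch (not the patch): pick stores to flush selectively.
public class SelectiveFlushSketch {

    /** Policy 1: flush only the single biggest store. */
    public static String biggestStore(Map<String, Long> memcacheSizes) {
        String biggest = null;
        long max = -1;
        for (Map.Entry<String, Long> e : memcacheSizes.entrySet()) {
            if (e.getValue() > max) {
                max = e.getValue();
                biggest = e.getKey();
            }
        }
        return biggest;
    }

    /** Policy 2: flush every store holding more than half the max memcache size. */
    public static List<String> storesOverHalf(Map<String, Long> memcacheSizes,
                                              long maxMemcacheSize) {
        List<String> toFlush = new ArrayList<>();
        for (Map.Entry<String, Long> e : memcacheSizes.entrySet()) {
            if (e.getValue() > maxMemcacheSize / 2) {
                toFlush.add(e.getKey());
            }
        }
        return toFlush;
    }

    public static void main(String[] args) {
        // Three families: two with substance, one holding only a few KB.
        Map<String, Long> sizes =
            Map.of("anchor", 60L << 20, "contents", 3L << 10, "meta", 40L << 20);
        long max = 64L << 20;  // 64MB flush threshold
        assert "anchor".equals(biggestStore(sizes));
        assert storesOverHalf(sizes, max).contains("anchor");
        assert storesOverHalf(sizes, max).contains("meta");
        assert !storesOverHalf(sizes, max).contains("contents");  // too small to bother
        System.out.println("ok");
    }
}
```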
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Search Discussions

  • Billy Pearson (JIRA) at Feb 4, 2008 at 11:17 pm
    [ https://issues.apache.org/jira/browse/HBASE-69?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12565596#action_12565596 ]

    Billy Pearson commented on HBASE-69:
    ------------------------------------

Answer to the above: yes, it works without the blocking.


  • Billy Pearson (JIRA) at Feb 5, 2008 at 1:55 am
    [ https://issues.apache.org/jira/browse/HBASE-69?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12565624#action_12565624 ]

    Billy Pearson commented on HBASE-69:
    ------------------------------------

Example of the hlog issue mentioned above:

    Region server log
    {code}
    2008-02-04 19:36:50,813 FATAL org.apache.hadoop.hbase.HRegionServer: unable to report to master for 34439 milliseconds - aborting server
    {code}

    Master Server Log
    {code}
    2008-02-04 19:47:45,456 DEBUG org.apache.hadoop.hbase.HLog: Applied 15820000 edits
    {code}

    The file for the region
    oldlogfile.log = 3.46GB
  • Billy Pearson (JIRA) at Feb 6, 2008 at 6:23 pm
    [ https://issues.apache.org/jira/browse/HBASE-69?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12566252#action_12566252 ]

    Billy Pearson commented on HBASE-69:
    ------------------------------------

Might check out the compaction queue: I had 1 of 2 regions on one server split and reload the splits on the same server, and my job finished within a few mins after the split.
Only two more columns compacted after the split before the job finished. I now have column families that have more than 3 map files and no compaction happening.

Not sure if the stop of updates coming in, or the split itself, messed up the compaction queue, or whether the split cleared it out.
Before the split happened, several columns had requested compaction for both regions.

And on reload of the split region we should queue up compactions so compaction can clear out the reference files from the split.


  • Billy Pearson (JIRA) at Feb 6, 2008 at 10:45 pm
    [ https://issues.apache.org/jira/browse/HBASE-69?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12566384#action_12566384 ]

    Billy Pearson commented on HBASE-69:
    ------------------------------------

I think what is happening is that while a compaction is working on a column, any request to add a compaction to the queue is rejected; but after the compaction starts, new map files may exceed the threshold of 3 map files. This leaves the extra map files waiting for the next memcache flush to add a new compaction to the queue after the current compaction completes on the region.

I do not think this is much of a problem; there will be a memcache flush some time down the road that starts a new compaction.
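The queue behavior described above can be sketched like this (an illustration under my reading of the comment, with hypothetical names, not the actual HBase compaction queue): duplicate requests for a store that is already queued or compacting are dropped, so files created during a compaction wait for the next flush to re-request one.

```java
import java.util.ArrayDeque;
import java.util.HashSet;
import java.util.Queue;
import java.util.Set;

// Illustrative sketch (not HBase code): a compaction queue that rejects
// duplicate requests for a store until its in-flight compaction finishes.
public class CompactionQueueSketch {
    private final Queue<String> queue = new ArrayDeque<>();
    private final Set<String> pending = new HashSet<>();

    /** Returns true if the request was enqueued, false if rejected. */
    public boolean request(String store) {
        if (!pending.add(store)) {
            return false;          // already queued or compacting: rejected
        }
        queue.add(store);
        return true;
    }

    /** Take the next store and mark its compaction finished. */
    public String compactNext() {
        String store = queue.poll();
        if (store != null) {
            pending.remove(store); // only now may the store be queued again
        }
        return store;
    }

    public static void main(String[] args) {
        CompactionQueueSketch q = new CompactionQueueSketch();
        assert q.request("anchor");            // queued
        assert !q.request("anchor");           // new files during compaction: request dropped
        assert "anchor".equals(q.compactNext());
        assert q.request("anchor");            // the next memcache flush can re-queue it
        System.out.println("ok");
    }
}
```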
  • Billy Pearson (JIRA) at Feb 10, 2008 at 10:55 pm
    [ https://issues.apache.org/jira/browse/HBASE-69?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12567487#action_12567487 ]

    Billy Pearson commented on HBASE-69:
    ------------------------------------

With the latest patch, only the first column family is getting flushed. I have 3 heavily inserted columns and only one is getting flushed, and the region's memory usage is rising fast.
  • Billy Pearson (JIRA) at Feb 10, 2008 at 11:03 pm
    [ https://issues.apache.org/jira/browse/HBASE-69?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12567489#action_12567489 ]

    Billy Pearson commented on HBASE-69:
    ------------------------------------

I shut down the region server before it could crash; the flushes of the 2 other columns were in the 200MB range, and the flush size was set at 64MB.
  • Jim Kellerman (JIRA) at Feb 11, 2008 at 5:14 pm
    [ https://issues.apache.org/jira/browse/HBASE-69?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12567713#action_12567713 ]

    Jim Kellerman commented on HBASE-69:
    ------------------------------------

    Billy,

    Thanks for the feedback. Clearly there is a bug here.
  • Jim Kellerman (JIRA) at Feb 11, 2008 at 5:16 pm
    [ https://issues.apache.org/jira/browse/HBASE-69?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12567714#action_12567714 ]

    Jim Kellerman commented on HBASE-69:
    ------------------------------------

    Other things to do for this issue:
- HStores need to inform the region about cache sizes, etc., so we can do region-level memory usage accounting
- Update the migration tool so that it will check for unrecovered region server logs
  • Billy Pearson (JIRA) at Feb 11, 2008 at 10:02 pm
    [ https://issues.apache.org/jira/browse/HBASE-69?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12567836#action_12567836 ]

    Billy Pearson commented on HBASE-69:
    ------------------------------------

I get "Blocking updates" messages but no flushing happens, so everything goes on hold: no new transactions happen and all my threads get blocked.
  • Jim Kellerman (JIRA) at Feb 11, 2008 at 10:20 pm
    [ https://issues.apache.org/jira/browse/HBASE-69?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12567846#action_12567846 ]

    Jim Kellerman commented on HBASE-69:
    ------------------------------------
Billy Pearson - 11/Feb/08 02:00 PM
I get Blocking updates messages but no flushing happens so everything gos on hold and no new transaction happen all my threads get blocked

This logic is currently very unsophisticated. I'll try to come up with something better in the next patch. In the meantime, setting hbase.hregion.memcache.block.multiplier to (at least) the number of families should make things work.
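For reference, such an override would go in hbase-site.xml. The property name comes from the comment above; the value 3 below is only an example for a hypothetical table with three column families:

```xml
<!-- hbase-site.xml: example only; set the value to at least the number of
     column families in the table, per the workaround suggested above. -->
<property>
  <name>hbase.hregion.memcache.block.multiplier</name>
  <value>3</value>
</property>
```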
  • Billy Pearson (JIRA) at Feb 12, 2008 at 5:00 am
    [ https://issues.apache.org/jira/browse/HBASE-69?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12567957#action_12567957 ]

    Billy Pearson commented on HBASE-69:
    ------------------------------------

I am still seeing an hlog build-up problem.

For example, I see these a lot after a job is done and the servers are idle:

{code}
2008-02-11 22:46:44,032 INFO org.apache.hadoop.hbase.HStore: Not flushing cache for 519281761/anchor because it has 0 entries
2008-02-11 22:46:44,032 DEBUG org.apache.hadoop.hbase.HRegion: Finished memcache flush for store 519281761/anchor in 1ms, sequenceid=131598230
{code}

I assume this is an optional flush, which is good to have, but if there are no entries, can we update that column's current sequence id to our current max sequence id so we can remove the old logs after the next hlog roll?

What I am seeing is that columns with low to no updates never get a memcache flush, so their sequence id never changes (unless there is a split) and the old hlogs never get removed.

  • Jim Kellerman (JIRA) at Feb 12, 2008 at 5:46 am
    [ https://issues.apache.org/jira/browse/HBASE-69?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12567968#action_12567968 ]

    Jim Kellerman commented on HBASE-69:
    ------------------------------------
Billy Pearson - 11/Feb/08 08:59 PM
I am still seeing a hlog build up problem

example I see these a lot after a job is done and the server are idle

2008-02-11 22:46:44,032 INFO org.apache.hadoop.hbase.HStore: Not flushing cache for 519281761/anchor because it has 0 entries
2008-02-11 22:46:44,032 DEBUG org.apache.hadoop.hbase.HRegion: Finished memcache flush for store 519281761/anchor in 1ms, sequenceid=131598230

I assume this is a optional flush which is good to have but if there is no entries can we update the columns current sequence id for that column to our current max sequence id so we can remove the old logs after the next hlog ?

Yes, this is an optional cache flush. The column's max sequence id is updated after a cache flush:

{code}
long sequenceId = log.startCacheFlush();
...
this.log.completeCacheFlush(store.storeName, getTableDesc().getName(),
    sequenceId);
{code}

What I am seeing is low to no updated columns never get a memcache flush so there sequence id never changes (unless there is a split) and the old hlogs never get removed.

Columns with low to no updates will only get an optional cache flush, which will set their sequence number. However, log files are not garbage collected until the log fills up and is rolled, which does not happen if the region server is idle. The only other time log files are cleaned up is when the region server shuts down.

    I have created issue HBASE-440 to add optional log rolling so that idle region servers will garbage collect old log files.


  • Billy Pearson (JIRA) at Feb 12, 2008 at 9:57 am
    [ https://issues.apache.org/jira/browse/HBASE-69?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12568027#action_12568027 ]

    Billy Pearson commented on HBASE-69:
    ------------------------------------

I think we are still missing something. When I talk about hlog build-up, I mean there is never a removal of the hlogs unless there is a split or a shutdown.

I have over 106 hlogs on one server, with
hbase.regionserver.maxlogentries = 250000
The modification times from the oldest to the newest log files span much more than the 30 min optional flush interval set in hbase-default.xml.

  • Jim Kellerman (JIRA) at Feb 12, 2008 at 2:25 pm
    [ https://issues.apache.org/jira/browse/HBASE-69?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12568140#action_12568140 ]

    Jim Kellerman commented on HBASE-69:
    ------------------------------------
Billy Pearson - 12/Feb/08 01:56 AM
I thank we are still missing something when I talk about hlog build up I mean there is never a removal of the hlogs unless there is a split or a shutdown.

As I explained previously, the only time HLogs are garbage collected is when either the log fills or the region server shuts down. Since there are not many updates coming in in the situation you described, the log will not fill and consequently old logs will not be garbage collected.

HBASE-440 will add optional log rolling, and log rolling when a region is closed, in addition to the current log rolling when the log fills. Both of these new events will cause logs to be garbage collected. Region server shutdown will delete all the old logs.

  • Billy Pearson (JIRA) at Feb 12, 2008 at 4:33 pm
    [ https://issues.apache.org/jira/browse/HBASE-69?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12568200#action_12568200 ]

    Billy Pearson commented on HBASE-69:
    ------------------------------------

Sorry, in my last post I am talking about while the cluster is busy. The hlogs have > 4 hours between them; that would mean I am rolling the hlog many times, after several optional memcache flushes, while jobs are running.

Might check to make sure the optional flush is updating the sequence id even if there are 0 entries and the flush is skipped.
The problem could be in code this patch removed: before, with debug turned on, I used to see "0 logs removed" messages during hlog rolling, but I do not see these messages after these patches.
  • stack (JIRA) at Feb 12, 2008 at 5:07 pm
    [ https://issues.apache.org/jira/browse/HBASE-69?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12568219#action_12568219 ]

    stack commented on HBASE-69:
    ----------------------------

    Why is blocking disabled when there is only one store? (What's to stop the update rate from overwhelming the flusher in the one-store case?)
  • Billy Pearson (JIRA) at Feb 12, 2008 at 6:13 pm
    [ https://issues.apache.org/jira/browse/HBASE-69?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12568264#action_12568264 ]

    Billy Pearson commented on HBASE-69:
    ------------------------------------

    OK, I am seeing the rolling messages now and old logs are getting removed. Going to update to the latest patch and try again.
  • Jim Kellerman (JIRA) at Feb 12, 2008 at 6:23 pm
    [ https://issues.apache.org/jira/browse/HBASE-69?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12568270#action_12568270 ]

    Jim Kellerman commented on HBASE-69:
    ------------------------------------
    Billy Pearson - 12/Feb/08 08:32 AM:
    > Sorry, in my last post I was talking about while the cluster is busy. The hlogs have > 4 hours
    > between them, which means I am rolling the hlog many times after several optional memcache
    > flushes while jobs are running.

    If the hlogs have > 4 hours between them, then you can only expect a garbage collection once every 4 hours.

    > Might check to make sure the optional flush is updating the sequence id even if there are
    > 0 entries and the flush is skipped. The problem could be in the code that this patch removed;
    > before, I used to see 0 logs removed with debug turned on and hlog rolling, but I do not see
    > these messages after these patches.

    I can assure you that the sequence id is getting updated and flushed to the log with an optional cache flush, even with no entries.

    In this patch and in trunk, optional flushes are put on the flush queue just like requested flushes. (See HRegionServer$Flusher.run.)

    When their queue entry triggers:
    - in trunk, HRegion.flushCache() is called.
    - in the patch, HRegion.flushCache(HStore) is called.

    They both end up in HRegion.internalFlushcache, which:
    - first obtains a sequence id for the log, then
    - calls HStore.flushCache(sequenceId).

    Even if the cache is not flushed, HRegion.internalFlushcache in both trunk and the patch calls HLog.completeCacheFlush, which writes the new sequence id (for the region in the case of trunk, or for the store in the case of the patch).

    However, no log files are removed until the current log is rolled (closed and a new one opened).
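The sequence-id bookkeeping described here can be sketched roughly as follows (hypothetical names, not the real HLog/HRegion code): every flush, even a skipped one, advances the store's recorded sequence id, but a rolled log file only becomes removable once every store has flushed past the highest sequence id that log contains — which is exactly why one rarely-updated store can hold old logs hostage.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of the bookkeeping described above; names do not
// match the actual HBase classes.
public class HLogSketch {
    // Highest sequence id flushed, per store.
    private final Map<String, Long> flushedSeqIds = new HashMap<>();
    // Rolled (closed) log files, keyed by the highest sequence id they contain.
    private final TreeMap<Long, String> oldLogs = new TreeMap<>();
    private long nextSeqId = 0;

    public long startCacheFlush() { return nextSeqId++; }

    // Called even when the flush was skipped: the sequence id still advances.
    public void completeCacheFlush(String store, long seqId) {
        flushedSeqIds.put(store, seqId);
    }

    // Rolling closes the current log; only then can older logs be considered.
    public void rollLog(String closedLogName) {
        oldLogs.put(nextSeqId - 1, closedLogName);
    }

    // Old logs are removable only once every store has flushed past them.
    public int removeOldLogs() {
        long minFlushed = flushedSeqIds.values().stream()
                .min(Long::compare).orElse(-1L);
        int removed = 0;
        while (!oldLogs.isEmpty() && oldLogs.firstKey() <= minFlushed) {
            oldLogs.pollFirstEntry();
            removed++;
        }
        return removed;
    }
}
```

In this sketch, a store that last flushed long ago keeps minFlushed low, so later rolled logs stay pinned until it flushes again — matching Billy's concern in point 1.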

  • Jim Kellerman (JIRA) at Feb 12, 2008 at 6:35 pm
    [ https://issues.apache.org/jira/browse/HBASE-69?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12568275#action_12568275 ]

    Jim Kellerman commented on HBASE-69:
    ------------------------------------
    stack - 12/Feb/08 09:07 AM:
    > Why is blocking disabled when there is only one store? (What's to stop the update rate from
    > overwhelming the flusher in the one-store case?)

    If we block based on the number of outstanding cache flush requests (which is currently how it works for more than one store), then we cannot take updates while that single store is flushed, which didn't seem like a good idea.

    What I wanted to do was put a "good enough" policy in place for this patch until more sophisticated policies (which balance memory across the region server) are done in HBASE-70, which will address region server memory management.
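The interim policy being described might look roughly like this (an illustrative sketch with made-up names and parameters, not the patch itself): updates are blocked once enough flush requests are outstanding, except in the single-store case, where blocking would stop all updates while that one store flushes.

```java
// Hypothetical sketch of the interim blocking policy described above;
// the class name and threshold handling are illustrative, not the real code.
public class FlushBlockingPolicy {
    private final int maxFlushRequests;

    public FlushBlockingPolicy(int maxFlushRequests) {
        this.maxFlushRequests = maxFlushRequests;
    }

    // Block updates once outstanding flush requests reach the limit --
    // unless the region has only one store, in which case blocking would
    // halt all updates while that single store is flushed.
    public boolean shouldBlockUpdates(int storeCount, int outstandingFlushRequests) {
        if (storeCount <= 1) {
            return false;
        }
        return outstandingFlushRequests >= maxFlushRequests;
    }
}
```

With a limit of 8, this is the condition behind log lines like "cache flushes requested 8 is >= max flush request count 8" reported later in the thread.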

  • stack (JIRA) at Feb 12, 2008 at 6:51 pm
    [ https://issues.apache.org/jira/browse/HBASE-69?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12568283#action_12568283 ]

    stack commented on HBASE-69:
    ----------------------------

    If memcache(s) are full, we should block updates. Otherwise, this patch is a step backward.
  • Billy Pearson (JIRA) at Feb 13, 2008 at 6:02 am
    [ https://issues.apache.org/jira/browse/HBASE-69?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12568440#action_12568440 ]

    Billy Pearson commented on HBASE-69:
    ------------------------------------

    I get these:

    {code}
    INFO org.apache.hadoop.hbase.HRegion: Blocking updates for 'IPC Server handler 21 on 60020': cache flushes requested 8 is >= max flush request count 8
    {code}

    Then everything stops not long after a job starts; no more processing, everything is blocked. I waited 3 hours to see if the optional flush would unblock things, but it does not seem to be happening.
    Might take a look at the memcache blocker again.
  • Jim Kellerman (JIRA) at Feb 13, 2008 at 6:07 pm
    [ https://issues.apache.org/jira/browse/HBASE-69?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12568651#action_12568651 ]

    Jim Kellerman commented on HBASE-69:
    ------------------------------------

    Billy,

    Clearly there is a problem. Can you post a bit more of that region server's log (before the blocking message)? Maybe that will point me in the right direction.
  • Billy Pearson (JIRA) at Feb 14, 2008 at 9:59 am
    [ https://issues.apache.org/jira/browse/HBASE-69?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12568849#action_12568849 ]

    Billy Pearson commented on HBASE-69:
    ------------------------------------

    I will have to run the test again with the patch; I remove logs often, as the debug output takes up a lot of GB when running large jobs.
  • Jim Kellerman (JIRA) at Feb 21, 2008 at 3:14 am
    [ https://issues.apache.org/jira/browse/HBASE-69?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12570930#action_12570930 ]

    Jim Kellerman commented on HBASE-69:
    ------------------------------------

    OK, I finally have a patch for this issue that works!

    It started at:
    2008-02-21 01:17:22,401 DEBUG [regionserver/0:0:0:0:0:0:0:0:8020] hbase.HRegionServer(1070): Done telling master we are up

    and ended with:

    2008-02-21 02:50:45,783 INFO [Thread-8] hbase.HRegionServer$ShutdownThread(158): Starting shutdown thread.
    2008-02-21 02:50:45,783 INFO [Thread-8] hbase.HRegionServer$ShutdownThread(163): Shutdown thread complete

  • stack (JIRA) at Feb 22, 2008 at 8:18 pm
    [ https://issues.apache.org/jira/browse/HBASE-69?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12571574#action_12571574 ]

    stack commented on HBASE-69:
    ----------------------------

    I'd suggest making a new patch that contains the improvements from this patch, and let's commit that (probably in a new issue). Here are some comments:

    + +1 on FSUtils changes.
    + On migrations, would suggest that you emit logging of what step you are currently in: e.g. "moving to version 2". Will help should a step fail.
    + This is not a change made by you in Migrate, but I wonder if making the path fully qualified works with our new hbase.rootdir format, where you need to specify the filesystem to use:
    {code}
    - Path rootdir =
    - fs.makeQualified(new Path(this.conf.get(HConstants.HBASE_DIR)));
    + rootdir = fs.makeQualified(new Path(this.conf.get(HConstants.HBASE_DIR)));
    {code}

    + +1 on new names of methods in Migrate.java
    + +1 on IdentityTableReduce.java changes
    + What about the HSK changes? Should we add them? They get rid of TextSequence. Do you think this change is for the better?
    + What's going on here?
    {code}
    - final AtomicInteger activeScanners = new AtomicInteger(0);
    + volatile AtomicInteger activeScanners = new AtomicInteger(0);
    {code}
    Why make the reference to 'activeScanners' volatile rather than final? Once created as part of class construction, it's not going to change. Volatile doesn't affect the contents of the AtomicInteger, does it?
    + +1 on the change to TestMigrate.java. Fixes the currently broken migration scripts (they are failing on Hudson).

    Rest of the patch seems like all or nothing.
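To illustrate the point about 'activeScanners' above: thread-safe mutation comes from the AtomicInteger itself; volatile would only govern visibility of reassigning the reference, which never happens if the field is set once at construction, so final is the idiomatic choice. A small sketch (hypothetical class, not the actual HBase code):

```java
import java.util.concurrent.atomic.AtomicInteger;

// Sketch illustrating why 'final' fits here: the AtomicInteger provides
// thread-safe increments/decrements on its own, and a final field is
// safely published after construction. 'volatile' would only matter if
// the reference itself were reassigned, which it never is.
public class ScannerCounter {
    private final AtomicInteger activeScanners = new AtomicInteger(0);

    public int scannerOpened() { return activeScanners.incrementAndGet(); }
    public int scannerClosed() { return activeScanners.decrementAndGet(); }
    public int active()        { return activeScanners.get(); }
}
```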


Discussion overview: group dev @ hbase, hadoop; posted Feb 4, '08 at 11:15p; active Feb 22, '08 at 8:18p; 26 posts.