Slowly changing column family or table could cause accumulation of logs & substantially increase recovery times
---------------------------------------------------------------------------------------------------------------

Key: HBASE-2477
URL: https://issues.apache.org/jira/browse/HBASE-2477
Project: Hadoop HBase
Issue Type: Bug
Reporter: Kannan Muthukkaruppan


Memstore flushes are triggered today if a memstore exceeds a certain size or there is memory pressure. However, there is no timer-based flush for a memstore. This means a single column family or table getting a very slow rate of writes could hold up old HLogs from getting reclaimed for long periods of time, which in turn increases recovery time for a failed region server since there are a lot more logs to process.
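
To make the effect concrete, here is a minimal sketch of the reclamation rule described above. It is a simplified model, not HBase code: the class name, method name, and the idea of tracking one "oldest unflushed sequence id" per memstore are assumptions made for illustration. A log file is treated as reclaimable only when every edit in it has already been flushed by every memstore.

```java
import java.util.List;
import java.util.Map;

// Simplified model for illustration only; all names are hypothetical, not HBase APIs.
final class LogRetentionSketch {

    // logMaxSeqIds: the highest edit sequence id contained in each log file.
    // oldestUnflushedSeqId: per memstore, the sequence id of its oldest edit
    //                       that has not been flushed to disk yet.
    // Returns the number of log files that still have to be kept for recovery.
    static long logsRetained(List<Long> logMaxSeqIds,
                             Map<String, Long> oldestUnflushedSeqId) {
        // The oldest unflushed edit across all memstores decides what can go.
        long oldestNeeded = oldestUnflushedSeqId.values().stream()
                .mapToLong(Long::longValue)
                .min()
                .orElse(Long.MAX_VALUE); // no unflushed edits: nothing is needed

        // A log must be kept if it may contain an edit at or after that point.
        return logMaxSeqIds.stream()
                .filter(maxSeqId -> maxSeqId >= oldestNeeded)
                .count();
    }

    public static void main(String[] args) {
        List<Long> logs = List.of(100L, 200L, 300L, 400L);
        // A slow memstore still holding an edit with sequence id 50 pins every log.
        System.out.println(logsRetained(logs, Map.of("fast", 390L, "slow", 50L)));
        // Once the slow memstore is flushed, only the newest log is needed.
        System.out.println(logsRetained(logs, Map.of("fast", 390L)));
    }
}
```

In this model one rarely written memstore is enough to keep all four logs alive, even though the busy memstore has flushed almost everything.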

META is an example of a table which is likely to get very few writes. But even if we special-cased META somehow, it wouldn't be good enough, since an application could genuinely have a mix of slow- and fast-changing tables or column families.

What about also triggering flushes on a timer (in addition to the current mechanism) to bound recovery times?
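
A rough sketch of what such a combined check could look like follows. It is only an illustration of the proposal, not an existing HBase implementation: the class, the method, its parameters, and the threshold values used in `main` are all hypothetical.

```java
// Illustrative sketch only; names and thresholds are hypothetical, not HBase APIs or defaults.
final class FlushPolicySketch {

    // Decides whether a memstore should be flushed.
    // memstoreBytes:   current size of the memstore
    // oldestEditAgeMs: age of the oldest unflushed edit in the memstore
    // memoryPressure:  true if the server as a whole is short on memory
    // flushSizeBytes:  size threshold (the existing trigger)
    // maxEditAgeMs:    age threshold (the proposed timer-based trigger)
    static boolean shouldFlush(long memstoreBytes, long oldestEditAgeMs,
                               boolean memoryPressure,
                               long flushSizeBytes, long maxEditAgeMs) {
        if (memstoreBytes == 0) {
            return false; // empty memstore: nothing to flush, no logs pinned
        }
        boolean tooBig = memstoreBytes >= flushSizeBytes;   // existing size trigger
        boolean tooOld = oldestEditAgeMs >= maxEditAgeMs;   // proposed timer trigger
        return tooBig || memoryPressure || tooOld;
    }

    public static void main(String[] args) {
        long flushSize = 64L * 1024 * 1024; // example value
        long maxAge = 60L * 60 * 1000;      // example value: one hour
        // A tiny memstore with a two-hour-old edit is flushed only by the timer.
        System.out.println(shouldFlush(1024, 2 * maxAge, false, flushSize, maxAge));
        // The same memstore with a fresh edit is left alone.
        System.out.println(shouldFlush(1024, 10 * 60 * 1000, false, flushSize, maxAge));
    }
}
```

With an age bound like this, the oldest unflushed edit on a server is never older than roughly the configured maximum, which in turn bounds how many logs have to be replayed on recovery.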


--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

  • Kannan Muthukkaruppan (JIRA) at Apr 22, 2010 at 9:07 pm
    [ https://issues.apache.org/jira/browse/HBASE-2477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Kannan Muthukkaruppan resolved HBASE-2477.
    ------------------------------------------

    Resolution: Not A Problem

Discussion Overview
group: dev @
categories: hbase, hadoop
posted: Apr 22, '10 at 6:02p
active: Apr 22, '10 at 9:07p
posts: 2
users: 1
website: hbase.apache.org
