Set index interval at flush time based off count of keys and key attributes
---------------------------------------------------------------------------

Key: HBASE-1071
URL: https://issues.apache.org/jira/browse/HBASE-1071
Project: Hadoop HBase
Issue Type: Improvement
Reporter: stack

From Andrew Purtell's note on the list:
"Later, maybe it would make sense to dynamically set the index
interval based on the distribution of cell sizes in the
mapfile at some future time, according to some parameterized
formula that could be adjusted with config variable(s). This
could be done during compaction. Would make sense to also
consider the distribution of key lengths. Or there could be
other similar tricks implemented to keep index sizes down."

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

  • Andrew Purtell (JIRA) at Dec 20, 2008 at 12:55 pm
    [ https://issues.apache.org/jira/browse/HBASE-1071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12658282#action_12658282 ]

    Andrew Purtell commented on HBASE-1071:
    ---------------------------------------

    One way to approach this is to estimate the size of the index on the heap from the key count and key lengths. Then pick a limit and increase the index interval as necessary until the estimated index size falls below that threshold. This is simple and gives only one knob -- easy enough to tweak -- that gets directly at the effect wanted. A suitable default can then be found by testing some educated guesses with PE (PerformanceEvaluation).
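
The approach in the comment above could be sketched roughly as follows. This is only an illustration, not code from HBase: the class name, the method names, the 64-byte per-entry overhead, the starting interval of 32, and the choice to double the interval on each step are all assumptions made for the example. The one knob is the maximum estimated index size in bytes.

```java
public final class IndexIntervalSketch {

    // Hypothetical per-entry heap overhead (object headers, references, file offset).
    // An assumption for illustration, not a measured HBase value.
    static final long PER_ENTRY_OVERHEAD_BYTES = 64;

    /** Estimated heap bytes for an index that keeps one entry per 'interval' keys. */
    public static long estimateIndexBytes(long keyCount, double avgKeyLength, int interval) {
        long entries = (keyCount + interval - 1) / interval; // ceiling division
        return (long) Math.ceil(entries * (avgKeyLength + PER_ENTRY_OVERHEAD_BYTES));
    }

    /**
     * Grows the interval until the estimated index size is at or below maxIndexBytes.
     * Doubling is an assumed growth policy; the comment only says "increase as necessary".
     */
    public static int chooseInterval(long keyCount, double avgKeyLength,
                                     int startInterval, long maxIndexBytes) {
        if (startInterval < 1 || maxIndexBytes <= 0) {
            throw new IllegalArgumentException("interval and limit must be positive");
        }
        int interval = startInterval;
        // The second condition guards against int overflow when the limit is unreachable.
        while (estimateIndexBytes(keyCount, avgKeyLength, interval) > maxIndexBytes
                && interval <= Integer.MAX_VALUE / 2) {
            interval *= 2;
        }
        return interval;
    }

    public static void main(String[] args) {
        // 10 million keys averaging 36 bytes, with a 4 MiB limit on the index estimate.
        int interval = chooseInterval(10_000_000L, 36.0, 32, 4L * 1024 * 1024);
        System.out.println("chosen index interval: " + interval); // prints 256
    }
}
```

With these made-up numbers each index entry is estimated at 100 bytes, so intervals of 32, 64, and 128 all exceed the 4 MiB limit, and 256 is the first interval whose estimate (3,906,300 bytes) fits.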
  • stack (JIRA) at May 20, 2009 at 6:26 pm
    [ https://issues.apache.org/jira/browse/HBASE-1071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    stack resolved HBASE-1071.
    --------------------------

    Resolution: Invalid

    Resolving as invalid. We no longer have intervals in our size-based index.

Discussion Overview
group: dev @
categories: hbase, hadoop
posted: Dec 20, '08 at 4:00a
active: May 20, '09 at 6:26p
posts: 3
users: 1
website: hbase.apache.org