Client should cache region locations in an LRU structure
--------------------------------------------------------

Key: HBASE-407
URL: https://issues.apache.org/jira/browse/HBASE-407
Project: Hadoop HBase
Issue Type: Improvement
Components: client
Reporter: Bryan Duxbury
Priority: Minor


Instead of keeping the region locations cached client side in a TreeMap, we should use an LRU mechanism to help manage memory more dynamically.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
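For illustration only, the LRU mechanism proposed in the description could be sketched with the JDK's `LinkedHashMap` in access order. The class name `LruLocationCacheSketch`, the `String` keys and values, and the capacity of 2 are assumptions made for this example; none of them come from the issue or from any attached patch. Note that a hash-ordered structure like this does not support sorted lookups.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Hypothetical LRU cache sketch, not code from HBase.
 * Relies on LinkedHashMap's access-order mode and removeEldestEntry hook.
 */
class LruLocationCacheSketch<K, V> extends LinkedHashMap<K, V> {
  private static final long serialVersionUID = 1L;
  private final int maxEntries;

  LruLocationCacheSketch(int maxEntries) {
    // Third argument 'true' orders entries by last access rather than insertion.
    super(16, 0.75f, true);
    this.maxEntries = maxEntries;
  }

  @Override
  protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
    // Called after each insertion; returning true evicts the least recently used entry.
    return size() > maxEntries;
  }

  public static void main(String[] args) {
    LruLocationCacheSketch<String, String> cache = new LruLocationCacheSketch<>(2);
    cache.put("region-a", "server-1");
    cache.put("region-b", "server-2");
    cache.get("region-a");             // touch region-a so region-b becomes eldest
    cache.put("region-c", "server-3"); // exceeds capacity, evicts region-b
    System.out.println(cache.keySet());
  }
}
```

A fixed capacity is the main design choice here: memory use is bounded by entry count, not by how much heap is actually free.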


  • Bryan Duxbury (JIRA) at Feb 8, 2008 at 8:01 pm
    [ https://issues.apache.org/jira/browse/HBASE-407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12567179#action_12567179 ]

    Bryan Duxbury commented on HBASE-407:
    -------------------------------------

    Talking with Stack: instead of an LRU mechanism, we could use Java soft references in the existing map structure. This would be a really easy way to make the cached region locations disappear when memory is short.
  • stack (JIRA) at Feb 8, 2008 at 8:02 pm
    [ https://issues.apache.org/jira/browse/HBASE-407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    stack updated HBASE-407:
    ------------------------

    Priority: Major (was: Minor)

    Made this issue major priority rather than minor. Since HBASE-406 was committed, this issue becomes more important. Before 406, folks had a coarse means of managing the cache. Now they have none.
  • Bryan Duxbury (JIRA) at Feb 8, 2008 at 8:51 pm
    [ https://issues.apache.org/jira/browse/HBASE-407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Bryan Duxbury reassigned HBASE-407:
    -----------------------------------

    Assignee: Bryan Duxbury
  • Bryan Duxbury (JIRA) at Feb 8, 2008 at 8:51 pm
    [ https://issues.apache.org/jira/browse/HBASE-407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Bryan Duxbury updated HBASE-407:
    --------------------------------

    Attachment: 407.patch

    First shot. Tests pass locally, but almost certainly do not cover the nulled-references code.
  • Bryan Duxbury (JIRA) at Feb 8, 2008 at 9:48 pm
    [ https://issues.apache.org/jira/browse/HBASE-407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Bryan Duxbury updated HBASE-407:
    --------------------------------

    Status: Patch Available (was: Open)
  • Jim Kellerman (JIRA) at Feb 9, 2008 at 2:43 am
    [ https://issues.apache.org/jira/browse/HBASE-407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12567273#action_12567273 ]

    Jim Kellerman commented on HBASE-407:
    -------------------------------------

    Code review:

    Unused imports, remove:
    java.util.ArrayList
    java.util.Set

    Potential null pointer access at line 295

    Missing javadoc for public methods at lines 303, 308

    Line 629: unchecked conversion. Can this be fixed using Java generics?

  • Bryan Duxbury (JIRA) at Feb 9, 2008 at 4:12 am
    [ https://issues.apache.org/jira/browse/HBASE-407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Bryan Duxbury updated HBASE-407:
    --------------------------------

    Attachment: 407-v2.patch

    Used a generic when creating my SoftReference on 629, and removed extra imports.

    As far as the unchecked null, do you mean on the server.close(scannerId) line? I don't agree. It's not going to try and close the scanner unless it managed to open one, which means that server has to be not null to get the scannerId in the first place.
  • stack (JIRA) at Feb 11, 2008 at 5:58 pm
    [ https://issues.apache.org/jira/browse/HBASE-407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12567728#action_12567728 ]

    stack commented on HBASE-407:
    -----------------------------

    You can't use ReferenceMap from commons collections because you want a SortedMap and ReferenceMap is not one? Want me to make a little test to ensure this SoftReference value is actually working? (I'd force the JVM to head for an OOME, pass a ReferenceQueue to the SoftReference constructor, and log its dequeues.)

    Otherwise, for the future, would suggest that you put together declarations and assignments as in:

    {code}
    @@ -116,7 +115,8 @@

    private HRegionLocation rootRegionLocation;

    - private Map<Text, SortedMap<Text, HRegionLocation>> cachedRegionLocations;
    + private Map<Text, SortedMap<Text,
    + SoftReference<HRegionLocation>>> cachedRegionLocations;

    /**
    * constructor
    @@ -147,7 +147,8 @@
    this.masterChecked = false;

    this.cachedRegionLocations =
    - new ConcurrentHashMap<Text, SortedMap<Text, HRegionLocation>>();
    + new ConcurrentHashMap<Text, SortedMap<Text,
    + SoftReference<HRegionLocation>>>();
    {code}

    i.e. do private Map<Text, SortedMap<Text, SoftReference<HRegionLocation>>> cachedRegionLocations = new ConcurrentHashMap<Text, SortedMap<Text, SoftReference<HRegionLocation>>>();

    You might also be able to make cachedRegionLocations 'final' too.

    Are these for sure safe?

    {code}
    + HRegionLocation rl = tableLocations.get(row).get();
    {code}

    and

    {code}
    + matchingRegions.get(matchingRegions.lastKey()).get();
    {code}

    Otherwise, patch looks great.


  • Bryan Duxbury (JIRA) at Feb 11, 2008 at 6:12 pm
    [ https://issues.apache.org/jira/browse/HBASE-407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12567733#action_12567733 ]

    Bryan Duxbury commented on HBASE-407:
    -------------------------------------

    Yes, I need a SortedMap, not a HashMap, to be able to search for the correct regions by name.

    I don't know if we need to spend very much time verifying that SoftReference works as we expect it to, but it might make sense to push the client to an OOM situation so that we can make sure the logic of nulled references works correctly. I wouldn't call it a priority.

    I chose to declare the cachedRegionLocations and instantiate it separately for cosmetic reasons - who wants to see that many generics in one blob? Is there an advantage to making it final?

    Are you asking if it's definitely safe to pull references out of the tableLocations map? I'd say yes, because we check if the map contains them first. The SoftReference itself will always be not null at that point, it's the get() result that can be null.
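A minimal sketch of the point made above: the `SoftReference` stored in a sorted map is never null, but the object it refers to may have been cleared. The class and method names, the use of `String` for both row keys and locations, and the assumption that regions are keyed by their start row are all hypothetical; this is not the code from the patch.

```java
import java.lang.ref.SoftReference;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Hypothetical lookup helper, for illustration only. */
class SoftLocationLookupSketch {

  /**
   * Finds the cached location whose start key is the greatest key less than
   * or equal to the row. Returns null on a cache miss or a cleared reference.
   */
  static String locate(NavigableMap<String, SoftReference<String>> tableLocations,
                       String row) {
    // A sorted map is needed for this: a hash map cannot answer "closest key at or before".
    Map.Entry<String, SoftReference<String>> entry = tableLocations.floorEntry(row);
    if (entry == null) {
      return null; // nothing cached at or before this row
    }
    // The reference object is present, but its referent may have been collected.
    String location = entry.getValue().get();
    if (location == null) {
      // Drop the stale entry so the caller can look the region up again.
      tableLocations.remove(entry.getKey());
    }
    return location;
  }

  public static void main(String[] args) {
    NavigableMap<String, SoftReference<String>> cache = new TreeMap<>();
    cache.put("a", new SoftReference<>("server-1"));
    cache.put("m", new SoftReference<>("server-2"));
    System.out.println(locate(cache, "k"));
  }
}
```

Holding the result of `get()` in a local variable before testing it matters: once a strong reference exists, the collector can no longer clear the value mid-use.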
  • stack (JIRA) at Feb 11, 2008 at 6:24 pm
    [ https://issues.apache.org/jira/browse/HBASE-407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12567738#action_12567738 ]

    stack commented on HBASE-407:
    -----------------------------

    I think we should spend the time verifying SoftReferences are working the way you expect because then we can use them elsewhere with some confidence. I'll take a look B.

    Hey, if you're going to have a mess -- have it in one place rather than distributed (smile).

    Yeah, finals are good. They serve as hints to the compiler and allow it to take optimal paths that it would otherwise be wary of taking (I have no numbers -- just citing "the literature").
  • Bryan Duxbury (JIRA) at Feb 12, 2008 at 6:23 pm
    [ https://issues.apache.org/jira/browse/HBASE-407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Bryan Duxbury updated HBASE-407:
    --------------------------------

    Attachment: 407-v3.patch

    This patch adds a new utility class, SoftSortedMap, and uses that instead of SoftReferences directly. Otherwise, the functionality is unchanged.
  • stack (JIRA) at Feb 12, 2008 at 6:43 pm
    [ https://issues.apache.org/jira/browse/HBASE-407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12568277#action_12568277 ]

    stack commented on HBASE-407:
    -----------------------------

    Remove the unnecessary SoftReference import in HConnectionManager.

    Put the declaration and assignment of cachedRegionLocations on one line, with the final qualifier if possible (maybe it's not; I can't tell just from looking at this patch).

    Does this comment and the nulling code that follows still apply --> // since we're using SoftReferences now, it's possible this possible ....
    Doesn't your new fancy SoftSortedMap handle all that stuff internally? (This comment and code is in two places -- make a method?)

    In the article on a cache map, there was also a test. Did you try your new SSM against the test?

    I wonder if this safe:

    + public V get(Object key) {
    +   checkReferences();
    +   return internalMap.get(key).get();
    + }

    What's to stop the GC running between the call to checkReferences and the execution of internalMap.get?

    Same for the remove, contains, etc.

    Here is how its done in the apache commons collections ReferenceMap:

    {code}
    public Object get(Object key) {
        purgeBeforeRead();
        Entry entry = getEntry(key);
        if (entry == null) {
            return null;
        }
        return entry.getValue();
    }
    {code}

    This logging will only prove annoying, I predict: + LOG.debug("Done cleaning up references."); ... and the one before it. At least output some stats?

    Otherwise, patch looks great


  • Bryan Duxbury (JIRA) at Feb 12, 2008 at 6:59 pm
    [ https://issues.apache.org/jira/browse/HBASE-407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Bryan Duxbury updated HBASE-407:
    --------------------------------

    Attachment: 407-v4.patch

    Removed unneeded imports and comments from HConnectionManager, moved declaration of cachedRegionLocations all to one place.

    Cleaned bookend debug logging from SoftSortedMap, and improved get and remove so that they can handle nulls better.

    If the GC nulls a reference between checkReferences and the actual get, it doesn't matter; the SoftValue will still be in the map, and get() will return null. However, I wasn't taking into account the situation where the key being checked for wasn't actually in the map, so that could have led to some NPEs.
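The two cases described above (a missing key, and a reference cleared between the purge and the read) can be sketched roughly as follows. This is an assumed reconstruction for illustration, not the SoftSortedMap from the attached patches: the class name `SoftValueSortedMapSketch`, its fields, and the use of a `ReferenceQueue` for purging are all hypothetical choices; only the names `checkReferences`, `internalMap`, and `SoftValue` are borrowed from the discussion.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.util.SortedMap;
import java.util.TreeMap;

/** Hypothetical sketch of a sorted map with softly referenced values. */
class SoftValueSortedMapSketch<K extends Comparable<K>, V> {

  /** Soft reference that remembers its key so a purge can find the entry. */
  private static final class SoftValue<K, V> extends SoftReference<V> {
    final K key;

    SoftValue(K key, V value, ReferenceQueue<V> queue) {
      super(value, queue);
      this.key = key;
    }
  }

  private final SortedMap<K, SoftValue<K, V>> internalMap = new TreeMap<>();
  private final ReferenceQueue<V> queue = new ReferenceQueue<>();

  /** Removes entries whose values the collector has cleared and enqueued. */
  private void checkReferences() {
    Reference<? extends V> ref;
    while ((ref = queue.poll()) != null) {
      SoftValue<?, ?> cleared = (SoftValue<?, ?>) ref;
      // Two-argument remove: only drop the entry if it still holds this reference.
      internalMap.remove(cleared.key, cleared);
    }
  }

  synchronized V put(K key, V value) {
    checkReferences();
    SoftValue<K, V> old = internalMap.put(key, new SoftValue<>(key, value, queue));
    return old == null ? null : old.get();
  }

  synchronized V get(K key) {
    checkReferences();
    SoftValue<K, V> ref = internalMap.get(key);
    if (ref == null) {
      return null; // key absent: avoids the NPE from calling get() on null
    }
    V value = ref.get(); // may be null if cleared after checkReferences ran
    if (value == null) {
      internalMap.remove(key, ref);
    }
    return value;
  }

  synchronized int size() {
    checkReferences();
    return internalMap.size();
  }
}
```

Either way the caller sees null for a cleared value, so a collection racing with the read behaves like an ordinary cache miss.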
  • Bryan Duxbury (JIRA) at Feb 12, 2008 at 9:39 pm
    [ https://issues.apache.org/jira/browse/HBASE-407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Bryan Duxbury updated HBASE-407:
    --------------------------------

    Attachment: 407-v5.patch

    This version adds the missing license header to SoftSortedMap.java and a javadoc class comment.
  • stack (JIRA) at Feb 12, 2008 at 9:49 pm
    [ https://issues.apache.org/jira/browse/HBASE-407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12568328#action_12568328 ]

    stack commented on HBASE-407:
    -----------------------------

    +1 on this patch (On IRC you said you'd try running the little test to see that things are working right). After that, commit it yourself -- now you can.
  • Bryan Duxbury (JIRA) at Feb 12, 2008 at 11:33 pm
    [ https://issues.apache.org/jira/browse/HBASE-407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Bryan Duxbury updated HBASE-407:
    --------------------------------

    Attachment: 407-v6.patch

    This version contains a little test to show that SoftSortedMap actually decreases in size under memory pressure. The test probably needs to be tuned to the heap size of your system.
  • stack (JIRA) at Feb 12, 2008 at 11:35 pm
    [ https://issues.apache.org/jira/browse/HBASE-407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12568360#action_12568360 ]

    stack commented on HBASE-407:
    -----------------------------

    +1 on final version of patch (includes test).
  • Bryan Duxbury (JIRA) at Feb 13, 2008 at 1:23 am
    [ https://issues.apache.org/jira/browse/HBASE-407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Bryan Duxbury updated HBASE-407:
    --------------------------------

    Fix Version/s: 0.2.0
  • Bryan Duxbury (JIRA) at Feb 13, 2008 at 7:33 pm
    [ https://issues.apache.org/jira/browse/HBASE-407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Bryan Duxbury updated HBASE-407:
    --------------------------------

    Resolution: Fixed
    Status: Resolved (was: Patch Available)

    I just committed this.

Discussion Overview
group: dev@
categories: hbase, hadoop
posted: Feb 5, '08 at 1:19a
active: Feb 13, '08 at 7:33p
posts: 20
users: 1
website: hbase.apache.org
