In o.a.h.io.Text, the clear() method currently just resets length to 0
and does nothing about the internal bytes.

I'm curious about the thinking behind that decision: letting the internal
bytes be reused for future appends versus the memory leaks that can result
from not clearing them. Thanks.

$ svn diff
Index: src/java/org/apache/hadoop/io/Text.java
===================================================================
--- src/java/org/apache/hadoop/io/Text.java (revision 894545)
+++ src/java/org/apache/hadoop/io/Text.java (working copy)
@@ -224,6 +224,7 @@
    */
   public void clear() {
     length = 0;
+    bytes = EMPTY_BYTES;
   }
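To make the trade-off in the question concrete, here is a minimal sketch of the two behaviors. It does not use the real Text class: SimpleText, append, capacity and clearAndRelease are hypothetical names chosen for illustration, and only the idea (a byte array plus a logical length) is taken from the post and the patch above.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical stand-in for Text: a growable byte array plus a logical length. */
final class SimpleText {
    private static final byte[] EMPTY_BYTES = new byte[0];

    private byte[] bytes = EMPTY_BYTES;
    private int length = 0;

    /** Appends src, growing the backing array only when it is too small. */
    void append(byte[] src) {
        int needed = length + src.length;
        if (needed > bytes.length) {
            bytes = Arrays.copyOf(bytes, Math.max(needed, bytes.length * 2));
        }
        System.arraycopy(src, 0, bytes, length, src.length);
        length += src.length;
    }

    /** Behavior described in the post: only the length is reset, the array is kept. */
    void clear() {
        length = 0;
    }

    /** Behavior of the proposed patch: the array reference is dropped as well. */
    void clearAndRelease() {
        length = 0;
        bytes = EMPTY_BYTES;
    }

    int getLength() {
        return length;
    }

    /** Size of the backing array, which can exceed the logical length. */
    int capacity() {
        return bytes.length;
    }

    public static void main(String[] args) {
        SimpleText t = new SimpleText();
        t.append("hello".getBytes(StandardCharsets.UTF_8));

        t.clear();
        System.out.println("after clear(): length=" + t.getLength()
                + " capacity=" + t.capacity());

        // The next append fits in the retained array, so nothing is allocated.
        t.append("hi".getBytes(StandardCharsets.UTF_8));

        t.clearAndRelease();
        System.out.println("after clearAndRelease(): length=" + t.getLength()
                + " capacity=" + t.capacity());
    }
}
```

In this sketch the retained array is what makes repeated clear-and-append cycles cheap; the cost is that a buffer which once held a large value stays allocated for as long as the object is alive.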


  • Owen O'Malley at Jan 1, 2010 at 7:04 am

    On Dec 30, 2009, at 12:36 AM, Kay Kay wrote:

    > In o.a.h.io.Text - the clear method currently just resets length to
    > 0, while not doing anything about the bytes internally.
    >
    > Curious to know the thoughts behind the decision (to let the
    > internal bytes to be reused for future appends vs. memory leaks due
    > to not clearing them ) ?

    The byte array that backs the Text object is always reused. It
    might make sense to have a setCapacity method on Text that is similar
    to BytesWritable's. With such a method, it would be possible to shrink
    the size of the backing array.

    -- Owen
  • Kay Kay at Jan 1, 2010 at 10:40 am

    On Thu, Dec 31, 2009 at 11:03 PM, Owen O'Malley wrote:

    > On Dec 30, 2009, at 12:36 AM, Kay Kay wrote:
    >
    >> In o.a.h.io.Text - the clear method currently just resets length to 0,
    >> while not doing anything about the bytes internally.
    >>
    >> Curious to know the thoughts behind the decision (to let the internal
    >> bytes to be reused for future appends vs. memory leaks due to not clearing
    >> them ) ?
    >
    > The byte array that backs up the Text object is always reused.

    I believe that behavior would be surprising to the user if they were
    expecting the object resources to be released entirely, by calling the
    clear() method.

    Maybe clear() could reset the internal byte buffer, with another method,
    called reset() / rewind(), provided to reuse the existing internal
    buffer while resetting only the length variable.

    > It might make sense to have a setCapacity method on Text that is similar to
    > BytesWritable's. With such a method, it would be possible to shrink the size
    > of the backing array.

    HADOOP-6476 is in place for this.

    > -- Owen
  • Owen O'Malley at Jan 1, 2010 at 5:35 pm

    On Jan 1, 2010, at 2:39 AM, Kay Kay wrote:

    > I believe that behavior would be surprising to the user if they were
    > expecting the object resources to be released entirely, by calling the
    > clear() method.

    I disagree. clear() only promises to reset to the empty string. It
    doesn't imply freeing resources.

    > Maybe clear() could reset the internal byte buffer, with another method,
    > called reset() / rewind(), provided to reuse the existing internal
    > buffer while resetting only the length variable.

    Changing the semantics of Text methods is very difficult. Clear is exactly
    the right verb for what it does. A patch that makes the Javadoc clear
    would be appreciated.

    Once we have setCapacity, a lot of these issues go away.

    txt.setCapacity(0)

    makes it very clear what your intent is.

    -- Owen
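As a rough illustration of the setCapacity idea discussed in the thread, the sketch below separates "reset to the empty string" from "release the buffer". CapacityText is a hypothetical class, not the real Text or BytesWritable, and the choice to truncate the content when shrinking is an assumption of this sketch, not something the thread specifies.

```java
/** Hypothetical Text-like holder with an explicit setCapacity, for illustration only. */
final class CapacityText {
    private static final byte[] EMPTY_BYTES = new byte[0];

    private byte[] bytes = EMPTY_BYTES;
    private int length = 0;

    /** Replaces the content, growing the backing array only if it is too small. */
    void set(byte[] src) {
        if (src.length > bytes.length) {
            setCapacity(src.length);
        }
        System.arraycopy(src, 0, bytes, 0, src.length);
        length = src.length;
    }

    /**
     * Resets the content to the empty string. The backing array is kept
     * for reuse; call setCapacity(0) to release it.
     */
    void clear() {
        length = 0;
    }

    /**
     * Resizes the backing array to exactly newCapacity bytes.
     * Assumption of this sketch: shrinking below the current length
     * truncates the content to fit.
     */
    void setCapacity(int newCapacity) {
        if (newCapacity < 0) {
            throw new IllegalArgumentException("negative capacity: " + newCapacity);
        }
        if (newCapacity == bytes.length) {
            return;
        }
        byte[] next = (newCapacity == 0) ? EMPTY_BYTES : new byte[newCapacity];
        length = Math.min(length, newCapacity);
        System.arraycopy(bytes, 0, next, 0, length);
        bytes = next;
    }

    int getLength() {
        return length;
    }

    int getCapacity() {
        return bytes.length;
    }

    public static void main(String[] args) {
        CapacityText txt = new CapacityText();
        txt.set(new byte[1024]);

        txt.clear();          // empty string, buffer still allocated
        System.out.println("capacity after clear(): " + txt.getCapacity());

        txt.setCapacity(0);   // explicit request to drop the buffer
        System.out.println("capacity after setCapacity(0): " + txt.getCapacity());
    }
}
```

With this split, clear() keeps its existing meaning, and a caller who wants the memory back says so explicitly.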

Discussion Overview
group: common-dev @
categories: hadoop
posted: Dec 30, '09 at 8:36a
active: Jan 1, '10 at 5:35p
posts: 4
users: 3
website: hadoop.apache.org...
irc: #hadoop