FAQ
Stringifier
-----------

Key: HADOOP-3048
URL: https://issues.apache.org/jira/browse/HADOOP-3048
Project: Hadoop Core
Issue Type: New Feature
Affects Versions: 0.17.0
Reporter: Enis Soztutar
Assignee: Enis Soztutar
Fix For: 0.17.0


Storing arbitrary objects in the configuration has been discussed before in HADOOP-449 and HADOOP-1873. Although enabling such functionality has the risk of encouraging people to put big binary objects in the configuration, for some use cases passing objects to tasks is absolutely necessary.

This issue will track the implementation of a Stringifier interface which stringifies and destringifies objects. Using this implementation, developers can store objects in the configuration and restore them later.

Any thoughts ?

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Search Discussions

  • Enis Soztutar (JIRA) at Mar 19, 2008 at 8:14 am
    [ https://issues.apache.org/jira/browse/HADOOP-3048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Enis Soztutar updated HADOOP-3048:
    ----------------------------------

    Attachment: stringifier_v1.patch

    Here is an initial patch which was part of the patch for HADOOP-449. We can use this as a starting point.
    Stringifier
    -----------

    Key: HADOOP-3048
    URL: https://issues.apache.org/jira/browse/HADOOP-3048
    Project: Hadoop Core
    Issue Type: New Feature
    Affects Versions: 0.17.0
    Reporter: Enis Soztutar
    Assignee: Enis Soztutar
    Fix For: 0.17.0

    Attachments: stringifier_v1.patch


    Storing arbitrary objects in the configuration has been discussed before in HADOOP-449 and HADOOP-1873. Although enabling such functionality has the risk of encouraging people to put big binary objects in the configuration, for some use cases passing objects to tasks is absolutely necessary.
    This issue will track the implementation of a Stringifier interface which stringifies and destringifies objects. Using this implementation, developers can store objects in the configuration and restore them later.
    Any thoughts ?
    --
    This message is automatically generated by JIRA.
    -
    You can reply to this email to add a comment to the issue online.
  • Enis Soztutar (JIRA) at Mar 19, 2008 at 8:26 am
    [ https://issues.apache.org/jira/browse/HADOOP-3048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12580268#action_12580268 ]

    Enis Soztutar commented on HADOOP-3048:
    ---------------------------------------

    For completeness, I will repeat Chris Douglas' comments about the implementation here :

    * DefaultStringifier could reuse more of the objects it creates. The serializer will hold a reference to the ByteArray(Input/Output)Buffers you've defined, so re-opening and closing them every time is unnecessary if you use a resetable stream like o.a.h.io.Data(Input/Output)Buffer.
    * The static convenience methods (load, store, storeArray) that create new Stringifiers in DefaultStringifier are probably unnecessary. Adding a stringified object to a config doesn't need that much support.
    * Further, it's not clear to me how one would register a Stringifier that isn't a DefaultStringifier. Would it make sense to follow the pattern in the Serialization framework from HADOOP-1986- or the WritableComparator static initialization- and register custom Stringifiers for classes?

    Stringifier
    -----------

    Key: HADOOP-3048
    URL: https://issues.apache.org/jira/browse/HADOOP-3048
    Project: Hadoop Core
    Issue Type: New Feature
    Affects Versions: 0.17.0
    Reporter: Enis Soztutar
    Assignee: Enis Soztutar
    Fix For: 0.17.0

    Attachments: stringifier_v1.patch


    Storing arbitrary objects in the configuration has been discussed before in HADOOP-449 and HADOOP-1873. Although enabling such functionality has the risk of encouraging people to put big binary objects in the configuration, for some use cases passing objects to tasks is absolutely necessary.
    This issue will track the implementation of a Stringifier interface which stringifies and destringifies objects. Using this implementation, developers can store objects in the configuration and restore them later.
    Any thoughts ?
    --
    This message is automatically generated by JIRA.
    -
    You can reply to this email to add a comment to the issue online.
  • Enis Soztutar (JIRA) at Mar 19, 2008 at 5:46 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Enis Soztutar updated HADOOP-3048:
    ----------------------------------

    Attachment: stringifier_v2.patch

    Here is the updated patch with the following comments :
    bq. DefaultStringifier could reuse more of the objects it creates. The serializer will hold a reference to the ByteArray(Input/Output)Buffers you've defined, so re-opening and closing them every time is unnecessary if you use a resetable stream like o.a.h.io.Data(Input/Output)Buffer.
    Nod, i have changed the code to use DataInput/OutputBuffer.
    bq. The static convenience methods (load, store, storeArray) that create new Stringifiers in DefaultStringifier are probably unnecessary. Adding a stringified object to a config doesn't need that much support.
    In my opinion the main use case for the Stringifiers would be to store/load objects in the configuration, so the static convenience methods are helpful. Besides I do not see why would not want them.
    bq. Further, it's not clear to me how one would register a Stringifier that isn't a DefaultStringifier. Would it make sense to follow the pattern in the Serialization framework from HADOOP-1986- or the WritableComparator static initialization- and register custom Stringifiers for classes?
    Well, i first designed Stringifier to store the object in the configuration and DefaultStringifier just does the job. i do not foresee any other implementation other than Default but I just extracted the interface out of the class so that new stringifiers can be implemented, for whatever reason. In the current patch(HADOOP-449) everybody who wants to use a Stringifier just uses the Default one, and I think there is no need to develop a framework such as the Serialization until a need for another Stringifier emerges.


    Stringifier
    -----------

    Key: HADOOP-3048
    URL: https://issues.apache.org/jira/browse/HADOOP-3048
    Project: Hadoop Core
    Issue Type: New Feature
    Affects Versions: 0.17.0
    Reporter: Enis Soztutar
    Assignee: Enis Soztutar
    Fix For: 0.17.0

    Attachments: stringifier_v1.patch, stringifier_v2.patch


    Storing arbitrary objects in the configuration has been discussed before in HADOOP-449 and HADOOP-1873. Although enabling such functionality has the risk of encouraging people to put big binary objects in the configuration, for some use cases passing objects to tasks is absolutely necessary.
    This issue will track the implementation of a Stringifier interface which stringifies and destringifies objects. Using this implementation, developers can store objects in the configuration and restore them later.
    Any thoughts ?
    --
    This message is automatically generated by JIRA.
    -
    You can reply to this email to add a comment to the issue online.
  • Enis Soztutar (JIRA) at Mar 19, 2008 at 6:14 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Enis Soztutar updated HADOOP-3048:
    ----------------------------------

    Status: Patch Available (was: Open)
    Stringifier
    -----------

    Key: HADOOP-3048
    URL: https://issues.apache.org/jira/browse/HADOOP-3048
    Project: Hadoop Core
    Issue Type: New Feature
    Affects Versions: 0.17.0
    Reporter: Enis Soztutar
    Assignee: Enis Soztutar
    Fix For: 0.17.0

    Attachments: stringifier_v1.patch, stringifier_v2.patch


    Storing arbitrary objects in the configuration has been discussed before in HADOOP-449 and HADOOP-1873. Although enabling such functionality has the risk of encouraging people to put big binary objects in the configuration, for some use cases passing objects to tasks is absolutely necessary.
    This issue will track the implementation of a Stringifier interface which stringifies and destringifies objects. Using this implementation, developers can store objects in the configuration and restore them later.
    Any thoughts ?
    --
    This message is automatically generated by JIRA.
    -
    You can reply to this email to add a comment to the issue online.
  • Tsz Wo (Nicholas), SZE (JIRA) at Mar 19, 2008 at 6:52 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12580501#action_12580501 ]

    Tsz Wo (Nicholas), SZE commented on HADOOP-3048:
    ------------------------------------------------

    I like the idea of Stringifier interface.

    In the patch, there are quite a few @SuppressWarnings("unchecked") tag in the codes. Is it possible to make the generics right and get ride of the tags?
    Stringifier
    -----------

    Key: HADOOP-3048
    URL: https://issues.apache.org/jira/browse/HADOOP-3048
    Project: Hadoop Core
    Issue Type: New Feature
    Affects Versions: 0.17.0
    Reporter: Enis Soztutar
    Assignee: Enis Soztutar
    Fix For: 0.17.0

    Attachments: stringifier_v1.patch, stringifier_v2.patch


    Storing arbitrary objects in the configuration has been discussed before in HADOOP-449 and HADOOP-1873. Although enabling such functionality has the risk of encouraging people to put big binary objects in the configuration, for some use cases passing objects to tasks is absolutely necessary.
    This issue will track the implementation of a Stringifier interface which stringifies and destringifies objects. Using this implementation, developers can store objects in the configuration and restore them later.
    Any thoughts ?
    --
    This message is automatically generated by JIRA.
    -
    You can reply to this email to add a comment to the issue online.
  • Hadoop QA (JIRA) at Mar 20, 2008 at 8:57 am
    [ https://issues.apache.org/jira/browse/HADOOP-3048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12580699#action_12580699 ]

    Hadoop QA commented on HADOOP-3048:
    -----------------------------------

    +1 overall. Here are the results of testing the latest attachment
    http://issues.apache.org/jira/secure/attachment/12378258/stringifier_v2.patch
    against trunk revision 619744.

    @author +1. The patch does not contain any @author tags.

    tests included +1. The patch appears to include 4 new or modified tests.

    javadoc +1. The javadoc tool did not generate any warning messages.

    javac +1. The applied patch does not generate any new javac compiler warnings.

    release audit +1. The applied patch does not generate any new release audit warnings.

    findbugs +1. The patch does not introduce any new Findbugs warnings.

    core tests +1. The patch passed core unit tests.

    contrib tests +1. The patch passed contrib unit tests.

    Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2008/testReport/
    Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2008/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
    Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2008/artifact/trunk/build/test/checkstyle-errors.html
    Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2008/console

    This message is automatically generated.
    Stringifier
    -----------

    Key: HADOOP-3048
    URL: https://issues.apache.org/jira/browse/HADOOP-3048
    Project: Hadoop Core
    Issue Type: New Feature
    Affects Versions: 0.17.0
    Reporter: Enis Soztutar
    Assignee: Enis Soztutar
    Fix For: 0.17.0

    Attachments: stringifier_v1.patch, stringifier_v2.patch


    Storing arbitrary objects in the configuration has been discussed before in HADOOP-449 and HADOOP-1873. Although enabling such functionality has the risk of encouraging people to put big binary objects in the configuration, for some use cases passing objects to tasks is absolutely necessary.
    This issue will track the implementation of a Stringifier interface which stringifies and destringifies objects. Using this implementation, developers can store objects in the configuration and restore them later.
    Any thoughts ?
    --
    This message is automatically generated by JIRA.
    -
    You can reply to this email to add a comment to the issue online.
  • Enis Soztutar (JIRA) at Mar 20, 2008 at 5:59 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Enis Soztutar updated HADOOP-3048:
    ----------------------------------

    Attachment: stringifier_v3.patch

    Version 3 of the patch :
    * removed static load/store accepting properties objects(the need for them in the patch for HADOOP-449 is lifted).
    * extended test case
    * added a method, to create a generic array. creating arrays of generics is not possible.

    bq. In the patch, there are quite a few @SuppressWarnings("unchecked") tag in the codes. Is it possible to make the generics right and get ride of the tags?
    Well i guess not. The warnings result from obj.getClass() function returning Class<? extends X>} where X is the erasure of the static type of the expression on which getClass is called. If you can find a better way of not causing warnings i would gladly accept that.

    Stringifier
    -----------

    Key: HADOOP-3048
    URL: https://issues.apache.org/jira/browse/HADOOP-3048
    Project: Hadoop Core
    Issue Type: New Feature
    Affects Versions: 0.17.0
    Reporter: Enis Soztutar
    Assignee: Enis Soztutar
    Fix For: 0.17.0

    Attachments: stringifier_v1.patch, stringifier_v2.patch, stringifier_v3.patch


    Storing arbitrary objects in the configuration has been discussed before in HADOOP-449 and HADOOP-1873. Although enabling such functionality has the risk of encouraging people to put big binary objects in the configuration, for some use cases passing objects to tasks is absolutely necessary.
    This issue will track the implementation of a Stringifier interface which stringifies and destringifies objects. Using this implementation, developers can store objects in the configuration and restore them later.
    Any thoughts ?
    --
    This message is automatically generated by JIRA.
    -
    You can reply to this email to add a comment to the issue online.
  • Enis Soztutar (JIRA) at Mar 20, 2008 at 6:01 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Enis Soztutar updated HADOOP-3048:
    ----------------------------------

    Status: Open (was: Patch Available)
    Stringifier
    -----------

    Key: HADOOP-3048
    URL: https://issues.apache.org/jira/browse/HADOOP-3048
    Project: Hadoop Core
    Issue Type: New Feature
    Affects Versions: 0.17.0
    Reporter: Enis Soztutar
    Assignee: Enis Soztutar
    Fix For: 0.17.0

    Attachments: stringifier_v1.patch, stringifier_v2.patch, stringifier_v3.patch


    Storing arbitrary objects in the configuration has been discussed before in HADOOP-449 and HADOOP-1873. Although enabling such functionality has the risk of encouraging people to put big binary objects in the configuration, for some use cases passing objects to tasks is absolutely necessary.
    This issue will track the implementation of a Stringifier interface which stringifies and destringifies objects. Using this implementation, developers can store objects in the configuration and restore them later.
    Any thoughts ?
    --
    This message is automatically generated by JIRA.
    -
    You can reply to this email to add a comment to the issue online.
  • Enis Soztutar (JIRA) at Mar 20, 2008 at 6:01 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Enis Soztutar updated HADOOP-3048:
    ----------------------------------

    Status: Patch Available (was: Open)
    Stringifier
    -----------

    Key: HADOOP-3048
    URL: https://issues.apache.org/jira/browse/HADOOP-3048
    Project: Hadoop Core
    Issue Type: New Feature
    Affects Versions: 0.17.0
    Reporter: Enis Soztutar
    Assignee: Enis Soztutar
    Fix For: 0.17.0

    Attachments: stringifier_v1.patch, stringifier_v2.patch, stringifier_v3.patch


    Storing arbitrary objects in the configuration has been discussed before in HADOOP-449 and HADOOP-1873. Although enabling such functionality has the risk of encouraging people to put big binary objects in the configuration, for some use cases passing objects to tasks is absolutely necessary.
    This issue will track the implementation of a Stringifier interface which stringifies and destringifies objects. Using this implementation, developers can store objects in the configuration and restore them later.
    Any thoughts ?
    --
    This message is automatically generated by JIRA.
    -
    You can reply to this email to add a comment to the issue online.
  • Tsz Wo (Nicholas), SZE (JIRA) at Mar 20, 2008 at 8:39 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12580916#action_12580916 ]

    Tsz Wo (Nicholas), SZE commented on HADOOP-3048:
    ------------------------------------------------
    Well i guess not. The warnings result from obj.getClass() function returning Class<? extends X>} where X is the erasure of the static type of the expression on which getClass is called. If you can find a better way of not causing warnings i would gladly accept that.
    It seems to me that there is no way to do t.getClass() for generic t without a unchecked warning. i.e.
    {code}
    T t = ...;
    Class<XXX> c = t.getClass(); //there will be a unchecked warning, no matter XXX is (or it dose not compile).
    {code}

    I have the following suggestions:
    - Only add @SuppressWarnings("unchecked") right behind the required statements but not behind the entire method.

    - Create a static getClass(T) as following:
    {code}
    static public <T> Class<T> getClass(T t) {
    @SuppressWarnings("unchecked")
    Class<T> clazz = (Class<T>)t.getClass();
    return clazz;
    }
    {code}
    Use getClass(t) instead of t.getClass(). Then, I guess we only need to add one @SuppressWarnings("unchecked") tag.
    Stringifier
    -----------

    Key: HADOOP-3048
    URL: https://issues.apache.org/jira/browse/HADOOP-3048
    Project: Hadoop Core
    Issue Type: New Feature
    Affects Versions: 0.17.0
    Reporter: Enis Soztutar
    Assignee: Enis Soztutar
    Fix For: 0.17.0

    Attachments: stringifier_v1.patch, stringifier_v2.patch, stringifier_v3.patch


    Storing arbitrary objects in the configuration has been discussed before in HADOOP-449 and HADOOP-1873. Although enabling such functionality has the risk of encouraging people to put big binary objects in the configuration, for some use cases passing objects to tasks is absolutely necessary.
    This issue will track the implementation of a Stringifier interface which stringifies and destringifies objects. Using this implementation, developers can store objects in the configuration and restore them later.
    Any thoughts ?
    --
    This message is automatically generated by JIRA.
    -
    You can reply to this email to add a comment to the issue online.
  • Tsz Wo (Nicholas), SZE (JIRA) at Mar 20, 2008 at 9:33 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12580916#action_12580916 ]

    szetszwo edited comment on HADOOP-3048 at 3/20/08 2:31 PM:
    -------------------------------------------------------------------------
    Well i guess not. The warnings result from obj.getClass() function returning Class<? extends X>} where X is the erasure of the static type of the expression on which getClass is called. If you can find a better way of not causing warnings i would gladly accept that.
    +1: It seems to me that there is no way to do t.getClass() for generic t without a unchecked warning. i.e.
    {code}
    T t = ...;
    Class<XXX> c = t.getClass(); //there will be a unchecked warning, no matter XXX is (or it does not compile).
    {code}

    I have the following suggestions:
    - Only add @SuppressWarnings("unchecked") right before the required statements but not before the entire method.

    - Create a static getClass(T) as following:
    {code}
    static public <T> Class<T> getClass(T t) {
    @SuppressWarnings("unchecked")
    Class<T> clazz = (Class<T>)t.getClass();
    return clazz;
    }
    {code}
    Use getClass(t) instead of t.getClass(). Then, I guess we only need to add one @SuppressWarnings("unchecked") tag.

    was (Author: szetszwo):
    Well i guess not. The warnings result from obj.getClass() function returning Class<? extends X>} where X is the erasure of the static type of the expression on which getClass is called. If you can find a better way of not causing warnings i would gladly accept that.
    It seems to me that there is no way to do t.getClass() for generic t without a unchecked warning. i.e.
    {code}
    T t = ...;
    Class<XXX> c = t.getClass(); //there will be a unchecked warning, no matter XXX is (or it dose not compile).
    {code}

    I have the following suggestions:
    - Only add @SuppressWarnings("unchecked") right behind the required statements but not behind the entire method.

    - Create a static getClass(T) as following:
    {code}
    static public <T> Class<T> getClass(T t) {
    @SuppressWarnings("unchecked")
    Class<T> clazz = (Class<T>)t.getClass();
    return clazz;
    }
    {code}
    Use getClass(t) instead of t.getClass(). Then, I guess we only need to add one @SuppressWarnings("unchecked") tag.
    Stringifier
    -----------

    Key: HADOOP-3048
    URL: https://issues.apache.org/jira/browse/HADOOP-3048
    Project: Hadoop Core
    Issue Type: New Feature
    Affects Versions: 0.17.0
    Reporter: Enis Soztutar
    Assignee: Enis Soztutar
    Fix For: 0.17.0

    Attachments: stringifier_v1.patch, stringifier_v2.patch, stringifier_v3.patch


    Storing arbitrary objects in the configuration has been discussed before in HADOOP-449 and HADOOP-1873. Although enabling such functionality has the risk of encouraging people to put big binary objects in the configuration, for some use cases passing objects to tasks is absolutely necessary.
    This issue will track the implementation of a Stringifier interface which stringifies and destringifies objects. Using this implementation, developers can store objects in the configuration and restore them later.
    Any thoughts ?
    --
    This message is automatically generated by JIRA.
    -
    You can reply to this email to add a comment to the issue online.
  • Hadoop QA (JIRA) at Mar 21, 2008 at 6:07 am
    [ https://issues.apache.org/jira/browse/HADOOP-3048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12581007#action_12581007 ]

    Hadoop QA commented on HADOOP-3048:
    -----------------------------------

    +1 overall. Here are the results of testing the latest attachment
    http://issues.apache.org/jira/secure/attachment/12378328/stringifier_v3.patch
    against trunk revision 619744.

    @author +1. The patch does not contain any @author tags.

    tests included +1. The patch appears to include 4 new or modified tests.

    javadoc +1. The javadoc tool did not generate any warning messages.

    javac +1. The applied patch does not generate any new javac compiler warnings.

    release audit +1. The applied patch does not generate any new release audit warnings.

    findbugs +1. The patch does not introduce any new Findbugs warnings.

    core tests +1. The patch passed core unit tests.

    contrib tests +1. The patch passed contrib unit tests.

    Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2018/testReport/
    Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2018/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
    Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2018/artifact/trunk/build/test/checkstyle-errors.html
    Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2018/console

    This message is automatically generated.
    Stringifier
    -----------

    Key: HADOOP-3048
    URL: https://issues.apache.org/jira/browse/HADOOP-3048
    Project: Hadoop Core
    Issue Type: New Feature
    Affects Versions: 0.17.0
    Reporter: Enis Soztutar
    Assignee: Enis Soztutar
    Fix For: 0.17.0

    Attachments: stringifier_v1.patch, stringifier_v2.patch, stringifier_v3.patch


    Storing arbitrary objects in the configuration has been discussed before in HADOOP-449 and HADOOP-1873. Although enabling such functionality has the risk of encouraging people to put big binary objects in the configuration, for some use cases passing objects to tasks is absolutely necessary.
    This issue will track the implementation of a Stringifier interface which stringifies and destringifies objects. Using this implementation, developers can store objects in the configuration and restore them later.
    Any thoughts ?
    --
    This message is automatically generated by JIRA.
    -
    You can reply to this email to add a comment to the issue online.
  • Enis Soztutar (JIRA) at Mar 25, 2008 at 12:49 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Enis Soztutar updated HADOOP-3048:
    ----------------------------------

    Status: Open (was: Patch Available)
    Stringifier
    -----------

    Key: HADOOP-3048
    URL: https://issues.apache.org/jira/browse/HADOOP-3048
    Project: Hadoop Core
    Issue Type: New Feature
    Affects Versions: 0.17.0
    Reporter: Enis Soztutar
    Assignee: Enis Soztutar
    Priority: Blocker
    Fix For: 0.17.0

    Attachments: stringifier_v1.patch, stringifier_v2.patch, stringifier_v3.patch


    Storing arbitrary objects in the configuration has been discussed before in HADOOP-449 and HADOOP-1873. Although enabling such functionality has the risk of encouraging people to put big binary objects in the configuration, for some use cases passing objects to tasks is absolutely necessary.
    This issue will track the implementation of a Stringifier interface which stringifies and destringifies objects. Using this implementation, developers can store objects in the configuration and restore them later.
    Any thoughts ?
    --
    This message is automatically generated by JIRA.
    -
    You can reply to this email to add a comment to the issue online.
  • Enis Soztutar (JIRA) at Mar 25, 2008 at 12:51 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Enis Soztutar updated HADOOP-3048:
    ----------------------------------

    Attachment: stringifier_v4.patch

    I've updated the patch, added a GenericsUtil class and its test case, and moved the generics functions to it. Now the @SuppressWarnings("unchecked")'s are as localized as possible.


    Stringifier
    -----------

    Key: HADOOP-3048
    URL: https://issues.apache.org/jira/browse/HADOOP-3048
    Project: Hadoop Core
    Issue Type: New Feature
    Affects Versions: 0.17.0
    Reporter: Enis Soztutar
    Assignee: Enis Soztutar
    Priority: Blocker
    Fix For: 0.17.0

    Attachments: stringifier_v1.patch, stringifier_v2.patch, stringifier_v3.patch, stringifier_v4.patch


    Storing arbitrary objects in the configuration has been discussed before in HADOOP-449 and HADOOP-1873. Although enabling such functionality has the risk of encouraging people to put big binary objects in the configuration, for some use cases passing objects to tasks is absolutely necessary.
    This issue will track the implementation of a Stringifier interface which stringifies and destringifies objects. Using this implementation, developers can store objects in the configuration and restore them later.
    Any thoughts ?
    --
    This message is automatically generated by JIRA.
    -
    You can reply to this email to add a comment to the issue online.
  • Enis Soztutar (JIRA) at Mar 25, 2008 at 12:51 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Enis Soztutar updated HADOOP-3048:
    ----------------------------------

    Status: Patch Available (was: Open)
    Stringifier
    -----------

    Key: HADOOP-3048
    URL: https://issues.apache.org/jira/browse/HADOOP-3048
    Project: Hadoop Core
    Issue Type: New Feature
    Affects Versions: 0.17.0
    Reporter: Enis Soztutar
    Assignee: Enis Soztutar
    Priority: Blocker
    Fix For: 0.17.0

    Attachments: stringifier_v1.patch, stringifier_v2.patch, stringifier_v3.patch, stringifier_v4.patch


    Storing arbitrary objects in the configuration has been discussed before in HADOOP-449 and HADOOP-1873. Although enabling such functionality has the risk of encouraging people to put big binary objects in the configuration, for some use cases passing objects to tasks is absolutely necessary.
    This issue will track the implementation of a Stringifier interface which stringifies and destringifies objects. Using this implementation, developers can store objects in the configuration and restore them later.
    Any thoughts ?
    --
    This message is automatically generated by JIRA.
    -
    You can reply to this email to add a comment to the issue online.
  • Hadoop QA (JIRA) at Mar 25, 2008 at 8:09 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12582048#action_12582048 ]

    Hadoop QA commented on HADOOP-3048:
    -----------------------------------

    -1 overall. Here are the results of testing the latest attachment
    http://issues.apache.org/jira/secure/attachment/12378563/stringifier_v4.patch
    against trunk revision 619744.

    @author +1. The patch does not contain any @author tags.

    tests included +1. The patch appears to include 11 new or modified tests.

    javadoc -1. The javadoc tool appears to have generated 1 warning messages.

    javac +1. The applied patch does not generate any new javac compiler warnings.

    release audit +1. The applied patch does not generate any new release audit warnings.

    findbugs +1. The patch does not introduce any new Findbugs warnings.

    core tests +1. The patch passed core unit tests.

    contrib tests +1. The patch passed contrib unit tests.

    Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2051/testReport/
    Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2051/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
    Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2051/artifact/trunk/build/test/checkstyle-errors.html
    Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2051/console

    This message is automatically generated.
    Stringifier
    -----------

    Key: HADOOP-3048
    URL: https://issues.apache.org/jira/browse/HADOOP-3048
    Project: Hadoop Core
    Issue Type: New Feature
    Affects Versions: 0.17.0
    Reporter: Enis Soztutar
    Assignee: Enis Soztutar
    Priority: Blocker
    Fix For: 0.17.0

    Attachments: stringifier_v1.patch, stringifier_v2.patch, stringifier_v3.patch, stringifier_v4.patch


    Storing arbitrary objects in the configuration has been discussed before in HADOOP-449 and HADOOP-1873. Although enabling such functionality has the risk of encouraging people to put big binary objects in the configuration, for some use cases passing objects to tasks is absolutely necessary.
    This issue will track the implementation of a Stringifier interface which stringifies and destringifies objects. Using this implementation, developers can store objects in the configuration and restore them later.
    Any thoughts ?
    --
    This message is automatically generated by JIRA.
    -
    You can reply to this email to add a comment to the issue online.
  • Enis Soztutar (JIRA) at Mar 26, 2008 at 12:17 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Enis Soztutar updated HADOOP-3048:
    ----------------------------------

    Attachment: stringifier_v5.patch

    Fixed the javadoc problem, I guess there is no need to retrigger hudson.
    Stringifier
    -----------

    Key: HADOOP-3048
    URL: https://issues.apache.org/jira/browse/HADOOP-3048
    Project: Hadoop Core
    Issue Type: New Feature
    Affects Versions: 0.17.0
    Reporter: Enis Soztutar
    Assignee: Enis Soztutar
    Priority: Blocker
    Fix For: 0.17.0

    Attachments: stringifier_v1.patch, stringifier_v2.patch, stringifier_v3.patch, stringifier_v4.patch, stringifier_v5.patch


    Storing arbitrary objects in the configuration has been discussed before in HADOOP-449 and HADOOP-1873. Although enabling such functionality has the risk of encouraging people to put big binary objects in the configuration, for some use cases passing objects to tasks is absolutely necessary.
    This issue will track the implementation of a Stringifier interface which stringifies and destringifies objects. Using this implementation, developers can store objects in the configuration and restore them later.
    Any thoughts ?
    --
    This message is automatically generated by JIRA.
    -
    You can reply to this email to add a comment to the issue online.
  • Tsz Wo (Nicholas), SZE (JIRA) at Mar 28, 2008 at 7:44 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12583164#action_12583164 ]

    Tsz Wo (Nicholas), SZE commented on HADOOP-3048:
    ------------------------------------------------

    +1 for GenericsUtil. I am sorry that I am not familiar with the other parts of the patch.
    Stringifier
    -----------

    Key: HADOOP-3048
    URL: https://issues.apache.org/jira/browse/HADOOP-3048
    Project: Hadoop Core
    Issue Type: New Feature
    Affects Versions: 0.17.0
    Reporter: Enis Soztutar
    Assignee: Enis Soztutar
    Priority: Blocker
    Fix For: 0.17.0

    Attachments: stringifier_v1.patch, stringifier_v2.patch, stringifier_v3.patch, stringifier_v4.patch, stringifier_v5.patch


    Storing arbitrary objects in the configuration has been discussed before in HADOOP-449 and HADOOP-1873. Although enabling such functionality has the risk of encouraging people to put big binary objects in the configuration, for some use cases passing objects to tasks is absolutely necessary.
    This issue will track the implementation of a Stringifier interface which stringifies and destringifies objects. Using this implementation, developers can store objects in the configuration and restore them later.
    Any thoughts ?
    --
    This message is automatically generated by JIRA.
    -
    You can reply to this email to add a comment to the issue online.
  • Chris Douglas (JIRA) at Mar 29, 2008 at 1:02 am
    [ https://issues.apache.org/jira/browse/HADOOP-3048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12583257#action_12583257 ]

    Chris Douglas commented on HADOOP-3048:
    ---------------------------------------

    * I'm uncomfortable with GenericsUtil. It can't be denied that Java generics have some fiercely annoying properties, but factoring this out is probably the wrong solution. Where GenericsUtil::getClass(T) is called, Class::asSubclass(Class) should be preferred. Similarly, GenericsUtil::toArray is called from a function that's destringifying and deserializing objects. It's not statically-checkable code, the casts are wholly justified, and the warning should be suppressed. Contexts where toArray might be useful should typically prefer the equally typesafe implementation in ArrayList (this case included).
    * Since {{separator}} is a constant, it should be in all caps

    Other than that, this looks great.
    Stringifier
    -----------

    Key: HADOOP-3048
    URL: https://issues.apache.org/jira/browse/HADOOP-3048
    Project: Hadoop Core
    Issue Type: New Feature
    Affects Versions: 0.17.0
    Reporter: Enis Soztutar
    Assignee: Enis Soztutar
    Priority: Blocker
    Fix For: 0.17.0

    Attachments: stringifier_v1.patch, stringifier_v2.patch, stringifier_v3.patch, stringifier_v4.patch, stringifier_v5.patch


    Storing arbitrary objects in the configuration has been discussed before in HADOOP-449 and HADOOP-1873. Although enabling such functionality has the risk of encouraging people to put big binary objects in the configuration, for some use cases passing objects to tasks is absolutely necessary.
    This issue will track the implementation of a Stringifier interface which stringifies and destringifies objects. Using this implementation, developers can store objects in the configuration and restore them later.
    Any thoughts ?
    --
    This message is automatically generated by JIRA.
    -
    You can reply to this email to add a comment to the issue online.
  • Chris Douglas (JIRA) at Mar 29, 2008 at 1:04 am
    [ https://issues.apache.org/jira/browse/HADOOP-3048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12583257#action_12583257 ]

    chris.douglas edited comment on HADOOP-3048 at 3/28/08 6:00 PM:
    ----------------------------------------------------------------

    * I'm uncomfortable with GenericsUtil. It can't be denied that Java generics have some fiercely annoying properties, but factoring this out is probably the wrong solution. Where GenericsUtil::getClass(T) is called, Class::asSubclass(Class) should be preferred. Similarly, GenericsUtil::toArray is called from a function that's destringifying and deserializing objects. It's not statically-checkable code, the casts are wholly justified, and the warning should be suppressed. Contexts where toArray might be useful should typically prefer the equally typesafe implementation in ArrayList (this case included).
    * Since {{separator}} is a constant, it should be in all caps

    Other than that, this looks great.

    bq. I think there is no need to develop a framework such as the Serialization until a need for another Stringifier emerges.

    Agreed.

    was (Author: chris.douglas):
    * I'm uncomfortable with GenericsUtil. It can't be denied that Java generics have some fiercely annoying properties, but factoring this out is probably the wrong solution. Where GenericsUtil::getClass(T) is called, Class::asSubclass(Class) should be preferred. Similarly, GenericsUtil::toArray is called from a function that's destringifying and deserializing objects. It's not statically-checkable code, the casts are wholly justified, and the warning should be suppressed. Contexts where toArray might be useful should typically prefer the equally typesafe implementation in ArrayList (this case included).
    * Since {{separator}} is a constant, it should be in all caps

    Other than that, this looks great.
    Stringifier
    -----------

    Key: HADOOP-3048
    URL: https://issues.apache.org/jira/browse/HADOOP-3048
    Project: Hadoop Core
    Issue Type: New Feature
    Affects Versions: 0.17.0
    Reporter: Enis Soztutar
    Assignee: Enis Soztutar
    Priority: Blocker
    Fix For: 0.17.0

    Attachments: stringifier_v1.patch, stringifier_v2.patch, stringifier_v3.patch, stringifier_v4.patch, stringifier_v5.patch


    Storing arbitrary objects in the configuration has been discussed before in HADOOP-449 and HADOOP-1873. Although enabling such functionality has the risk of encouraging people to put big binary objects in the configuration, for some use cases passing objects to tasks is absolutely necessary.
    This issue will track the implementation of a Stringifier interface which stringifies and destringifies objects. Using this implementation, developers can store objects in the configuration and restore them later.
    Any thoughts ?
    --
    This message is automatically generated by JIRA.
    -
    You can reply to this email to add a comment to the issue online.
  • Enis Soztutar (JIRA) at Apr 1, 2008 at 3:02 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12584171#action_12584171 ]

    Enis Soztutar commented on HADOOP-3048:
    ---------------------------------------

    bq. I'm uncomfortable with GenericsUtil. It can't be denied that Java generics have some fiercely annoying properties, but factoring this out is probably the wrong solution.
    I cannot understand why we do not want to separate the functionality to deal with generics. The proposed functions are fairly general and can be used by others, and it seems that we cannot get rid of them(please see below), so extracting this out is better than introducing ad hoc solutions.

    bq. Where GenericsUtil::getClass(T) is called, Class::asSubclass(Class) should be preferred.
    we cannot use Class::asSubclass(Class) get a Class<T> object where T is a generic type, because then we would use :
    {code}
    T item;
    Class<T> c = item.getClass().asSubclass( <cannot_fill> );
    {code}
    in the above code, we cannot write anything as an argument to getSubclass, since we do not have any object of type Class<T>. (if we had then why would we bother getting an object of the same type). GenericsUtil#getClass(T) is introduced to handle the repetitive pattern of
    {code}
    @SuppressWarnings("unchecked")
    Class<T> c = (Class<T>)t.getClass();
    {code}

    bq. Similarly, GenericsUtil::toArray is called from a function that's destringifying and deserializing objects. It's not statically-checkable code, the casts are wholly justified, and the warning should be suppressed. Contexts where toArray might be useful should typically prefer the equally typesafe implementation in ArrayList (this case included).

    Unfortunatelly, we cannot use ArrayList#toArray(T[] a), since creating a generic array is not allowed, similarly ArrayList#toArray() will not work either, since it returns an Object[] and casting Object[] to T[] is not possible. Thus we needed a method to do the job of returning a generic array from a List, that's why we have GenericsUtil::toArray(). The above cases are clearly demonstrated in TestGenericsUtil#testWithGenericClass().

    bq. Since separator is a constant, it should be in all caps
    I would change that, if it matters.
    Stringifier
    -----------

    Key: HADOOP-3048
    URL: https://issues.apache.org/jira/browse/HADOOP-3048
    Project: Hadoop Core
    Issue Type: New Feature
    Affects Versions: 0.17.0
    Reporter: Enis Soztutar
    Assignee: Enis Soztutar
    Priority: Blocker
    Fix For: 0.17.0

    Attachments: stringifier_v1.patch, stringifier_v2.patch, stringifier_v3.patch, stringifier_v4.patch, stringifier_v5.patch


    Storing arbitrary objects in the configuration has been discussed before in HADOOP-449 and HADOOP-1873. Although enabling such functionality has the risk of encouraging people to put big binary objects in the configuration, for some use cases passing objects to tasks is absolutely necessary.
    This issue will track the implementation of a Stringifier interface which stringifies and destringifies objects. Using this implementation, developers can store objects in the configuration and restore them later.
    Any thoughts ?
    --
    This message is automatically generated by JIRA.
    -
    You can reply to this email to add a comment to the issue online.
  • Enis Soztutar (JIRA) at Apr 2, 2008 at 2:25 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Enis Soztutar updated HADOOP-3048:
    ----------------------------------

    Attachment: stringifier_v6.patch

    updated patch refactors the constant variable into uppercase. If there is no further issues with the patch, I will commit this before the feature freeze. (04/04/08)
    Stringifier
    -----------

    Key: HADOOP-3048
    URL: https://issues.apache.org/jira/browse/HADOOP-3048
    Project: Hadoop Core
    Issue Type: New Feature
    Affects Versions: 0.17.0
    Reporter: Enis Soztutar
    Assignee: Enis Soztutar
    Priority: Blocker
    Fix For: 0.17.0

    Attachments: stringifier_v1.patch, stringifier_v2.patch, stringifier_v3.patch, stringifier_v4.patch, stringifier_v5.patch, stringifier_v6.patch


    Storing arbitrary objects in the configuration has been discussed before in HADOOP-449 and HADOOP-1873. Although enabling such functionality has the risk of encouraging people to put big binary objects in the configuration, for some use cases passing objects to tasks is absolutely necessary.
    This issue will track the implementation of a Stringifier interface which stringifies and destringifies objects. Using this implementation, developers can store objects in the configuration and restore them later.
    Any thoughts ?
    --
    This message is automatically generated by JIRA.
    -
    You can reply to this email to add a comment to the issue online.
  • Chris Douglas (JIRA) at Apr 2, 2008 at 8:27 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Chris Douglas updated HADOOP-3048:
    ----------------------------------

    Attachment: 3048-alt.patch

    Attached is an example of your patch that avoids GenericsUtil. It contains only a single, commented SuppressWarnings annotation for loadArray, necessary because Array.newInstance returns Object. Most of what it does is move the load/store functionality from static methods- that create instances of DefaultSerializer- to instance methods that maintain the modest type safety that generics can guarantee. It also keeps the class object one passes in to the constructor. Other than those minor tweaks, the rest of the functionality remains as you had it in your last patch.

    You've convinced me of the utility of getClass(T t), though. Do you think it could go into ReflectionUtils?
    Stringifier
    -----------

    Key: HADOOP-3048
    URL: https://issues.apache.org/jira/browse/HADOOP-3048
    Project: Hadoop Core
    Issue Type: New Feature
    Affects Versions: 0.17.0
    Reporter: Enis Soztutar
    Assignee: Enis Soztutar
    Priority: Blocker
    Fix For: 0.17.0

    Attachments: 3048-alt.patch, stringifier_v1.patch, stringifier_v2.patch, stringifier_v3.patch, stringifier_v4.patch, stringifier_v5.patch, stringifier_v6.patch


    Storing arbitrary objects in the configuration has been discussed before in HADOOP-449 and HADOOP-1873. Although enabling such functionality has the risk of encouraging people to put big binary objects in the configuration, for some use cases passing objects to tasks is absolutely necessary.
    This issue will track the implementation of a Stringifier interface which stringifies and destringifies objects. Using this implementation, developers can store objects in the configuration and restore them later.
    Any thoughts ?
    --
    This message is automatically generated by JIRA.
    -
    You can reply to this email to add a comment to the issue online.
  • Enis Soztutar (JIRA) at Apr 3, 2008 at 12:50 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12585088#action_12585088 ]

    Enis Soztutar commented on HADOOP-3048:
    ---------------------------------------

    There is NO difference between
    {code}
    public static <T> T[] toArray(Class<T> c, List<T> list)
    {
    @SuppressWarnings("unchecked")
    T[] ta= (T[])Array.newInstance(c, list.size());
    for (int i= 0; i<list.size(); i++)
    ta[i]= list.get(i);
    return ta;
    }
    {code}
    and
    {code}
    list.toArray((T[])Array.newInstance(clazz, parts.length));
    {code}
    except that the former is encapsulated in a reusable, well documented utility function, that others can freely use, so in that case I am sorry but I have to insist on keeping GenericsUtil. Here is the Sun's implementation of ArrayList#toArray() for reference :
    {code}
    public <T> T[] toArray(T[] a) {
    if (a.length < size)
    a = (T[])java.lang.reflect.Array.
    newInstance(a.getClass().getComponentType(), size);
    System.arraycopy(elementData, 0, a, 0, size);
    if (a.length > size)
    a[size] = null;
    return a;
    }
    {code}

    One the issue about load/store methods being static or not, I can say that having one or two-liner non-static functions will not hurt, but the static versions are "convenience" methods (as stated before, their main usage will be to store/load objects to/from the configuration, and there is no need to keep a DefaultStringifier reference for a one-time operation.), thus I propose we introduce both static and non-static versions.
    Stringifier
    -----------

    Key: HADOOP-3048
    URL: https://issues.apache.org/jira/browse/HADOOP-3048
    Project: Hadoop Core
    Issue Type: New Feature
    Affects Versions: 0.17.0
    Reporter: Enis Soztutar
    Assignee: Enis Soztutar
    Priority: Blocker
    Fix For: 0.17.0

    Attachments: 3048-alt.patch, stringifier_v1.patch, stringifier_v2.patch, stringifier_v3.patch, stringifier_v4.patch, stringifier_v5.patch, stringifier_v6.patch


    Storing arbitrary objects in the configuration has been discussed before in HADOOP-449 and HADOOP-1873. Although enabling such functionality has the risk of encouraging people to put big binary objects in the configuration, for some use cases passing objects to tasks is absolutely necessary.
    This issue will track the implementation of a Stringifier interface which stringifies and destringifies objects. Using this implementation, developers can store objects in the configuration and restore them later.
    Any thoughts ?
    --
    This message is automatically generated by JIRA.
    -
    You can reply to this email to add a comment to the issue online.
  • Enis Soztutar (JIRA) at Apr 3, 2008 at 3:24 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12585088#action_12585088 ]

    enis edited comment on HADOOP-3048 at 4/3/08 8:21 AM:
    ---------------------------------------------------------------

    There is NO difference between
    {code}
    public static <T> T[] toArray(Class<T> c, List<T> list)
    {
    @SuppressWarnings("unchecked")
    T[] ta= (T[])Array.newInstance(c, list.size());
    for (int i= 0; i<list.size(); i++)
    ta[i]= list.get(i);
    return ta;
    }
    {code}
    and
    {code}
    list.toArray((T[])Array.newInstance(clazz, parts.length));
    {code}
    except that the former is encapsulated in a reusable, well documented utility function, that others can freely use, so in that case I am sorry but I have to insist on keeping GenericsUtil. Here is the Sun's implementation of ArrayList#toArray() for reference :
    {code}
    public <T> T[] toArray(T[] a) {
    if (a.length < size)
    a = (T[])java.lang.reflect.Array.
    newInstance(a.getClass().getComponentType(), size);
    System.arraycopy(elementData, 0, a, 0, size);
    if (a.length > size)
    a[size] = null;
    return a;
    }
    {code}

    On the issue about load/store methods being static or not, I can say that having one or two-liner non-static functions will not hurt, but the static versions are "convenience" methods (as stated before, their main usage will be to store/load objects to/from the configuration, and there is no need to keep a DefaultStringifier reference for a one-time operation.), thus I propose we introduce both static and non-static versions. Static methods are not useless, since they are already being used in the patch for HADOOP-449.

    was (Author: enis):
    There is NO difference between
    {code}
    public static <T> T[] toArray(Class<T> c, List<T> list)
    {
    @SuppressWarnings("unchecked")
    T[] ta= (T[])Array.newInstance(c, list.size());
    for (int i= 0; i<list.size(); i++)
    ta[i]= list.get(i);
    return ta;
    }
    {code}
    and
    {code}
    list.toArray((T[])Array.newInstance(clazz, parts.length));
    {code}
    except that the former is encapsulated in a reusable, well documented utility function, that others can freely use, so in that case I am sorry but I have to insist on keeping GenericsUtil. Here is the Sun's implementation of ArrayList#toArray() for reference :
    {code}
    public <T> T[] toArray(T[] a) {
    if (a.length < size)
    a = (T[])java.lang.reflect.Array.
    newInstance(a.getClass().getComponentType(), size);
    System.arraycopy(elementData, 0, a, 0, size);
    if (a.length > size)
    a[size] = null;
    return a;
    }
    {code}

    One the issue about load/store methods being static or not, I can say that having one or two-liner non-static functions will not hurt, but the static versions are "convenience" methods (as stated before, their main usage will be to store/load objects to/from the configuration, and there is no need to keep a DefaultStringifier reference for a one-time operation.), thus I propose we introduce both static and non-static versions.
    Stringifier
    -----------

    Key: HADOOP-3048
    URL: https://issues.apache.org/jira/browse/HADOOP-3048
    Project: Hadoop Core
    Issue Type: New Feature
    Affects Versions: 0.17.0
    Reporter: Enis Soztutar
    Assignee: Enis Soztutar
    Priority: Blocker
    Fix For: 0.17.0

    Attachments: 3048-alt.patch, stringifier_v1.patch, stringifier_v2.patch, stringifier_v3.patch, stringifier_v4.patch, stringifier_v5.patch, stringifier_v6.patch


    Storing arbitrary objects in the configuration has been discussed before in HADOOP-449 and HADOOP-1873. Although enabling such functionality has the risk of encouraging people to put big binary objects in the configuration, for some use cases passing objects to tasks is absolutely necessary.
    This issue will track the implementation of a Stringifier interface which stringifies and destringifies objects. Using this implementation, developers can store objects in the configuration and restore them later.
    Any thoughts ?
    --
    This message is automatically generated by JIRA.
    -
    You can reply to this email to add a comment to the issue online.
  • Enis Soztutar (JIRA) at Apr 3, 2008 at 3:30 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Enis Soztutar updated HADOOP-3048:
    ----------------------------------

    Attachment: stringifier_v7.patch

    Updated patch, includes both static and non-static methods for load/store, keeps GenericsUtil as it is.
    Chris are you OK with this version?
    Stringifier
    -----------

    Key: HADOOP-3048
    URL: https://issues.apache.org/jira/browse/HADOOP-3048
    Project: Hadoop Core
    Issue Type: New Feature
    Affects Versions: 0.17.0
    Reporter: Enis Soztutar
    Assignee: Enis Soztutar
    Priority: Blocker
    Fix For: 0.17.0

    Attachments: 3048-alt.patch, stringifier_v1.patch, stringifier_v2.patch, stringifier_v3.patch, stringifier_v4.patch, stringifier_v5.patch, stringifier_v6.patch, stringifier_v7.patch


    Storing arbitrary objects in the configuration has been discussed before in HADOOP-449 and HADOOP-1873. Although enabling such functionality has the risk of encouraging people to put big binary objects in the configuration, for some use cases passing objects to tasks is absolutely necessary.
    This issue will track the implementation of a Stringifier interface which stringifies and destringifies objects. Using this implementation, developers can store objects in the configuration and restore them later.
    Any thoughts ?
    --
    This message is automatically generated by JIRA.
    -
    You can reply to this email to add a comment to the issue online.
  • Chris Douglas (JIRA) at Apr 3, 2008 at 6:41 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12585227#action_12585227 ]

    Chris Douglas commented on HADOOP-3048:
    ---------------------------------------

    bq. There is NO difference between X and Y except that the former is encapsulated in a reusable, well documented utility function, that others can freely use...

    That said functionality is encapsulated in a reusable, well documented utility function that others can freely use is the substance of my objection. I don't think it's necessary and while the compiler (and Hudson) can flag an unchecked cast and committers can spot-check for wanton (and uncommented) SuppressWarnings annotations, less fastidious uses of GenericUtils functions can slip in unnoted, unscrutinized, and unjustified. I fully agree with your reasoning, but come to the opposite conclusion: that there is no difference between the one line of code (that generates a warning that must be justified) and a utility function in a new class recommends against the latter approach.

    All that said, I can live with GenericsUtil if you remain convinced that it has a role. Arguing against my self of three lines ago, most of its functionality could also fit in ReflectionUtils, but the distinction couldn't matter less.

    bq. On the issue about load/store methods being static or not, I can say that having one or two-liner non-static functions will not hurt, but the static versions are "convenience" methods (as stated before, their main usage will be to store/load objects to/from the configuration, and there is no need to keep a DefaultStringifier reference for a one-time operation.), thus I propose we introduce both static and non-static versions.

    OK. If you're going to leave the statics in, then you might as well drop the instance methods (v6 instead of v7). If someone creates an instance and keeps it around, you're right, they're probably dealing with volumes of data that won't go into the configuration (we hope).

    +1 for v6. Thanks for all your work- and rework- on this.
    Stringifier
    -----------

    Key: HADOOP-3048
    URL: https://issues.apache.org/jira/browse/HADOOP-3048
    Project: Hadoop Core
    Issue Type: New Feature
    Affects Versions: 0.17.0
    Reporter: Enis Soztutar
    Assignee: Enis Soztutar
    Priority: Blocker
    Fix For: 0.17.0

    Attachments: 3048-alt.patch, stringifier_v1.patch, stringifier_v2.patch, stringifier_v3.patch, stringifier_v4.patch, stringifier_v5.patch, stringifier_v6.patch, stringifier_v7.patch


    Storing arbitrary objects in the configuration has been discussed before in HADOOP-449 and HADOOP-1873. Although enabling such functionality has the risk of encouraging people to put big binary objects in the configuration, for some use cases passing objects to tasks is absolutely necessary.
    This issue will track the implementation of a Stringifier interface which stringifies and destringifies objects. Using this implementation, developers can store objects in the configuration and restore them later.
    Any thoughts ?
    --
    This message is automatically generated by JIRA.
    -
    You can reply to this email to add a comment to the issue online.
  • Enis Soztutar (JIRA) at Apr 4, 2008 at 7:38 am
    [ https://issues.apache.org/jira/browse/HADOOP-3048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Enis Soztutar updated HADOOP-3048:
    ----------------------------------

    Resolution: Fixed
    Release Note: A new Interface and a default implementation to convert and restore serializations of objects to strings.
    Hadoop Flags: [Reviewed]
    Status: Resolved (was: Patch Available)

    After coming to terms, I have committed the *version 6* of the patch. Thanks to the reviewers.
    Stringifier
    -----------

    Key: HADOOP-3048
    URL: https://issues.apache.org/jira/browse/HADOOP-3048
    Project: Hadoop Core
    Issue Type: New Feature
    Affects Versions: 0.17.0
    Reporter: Enis Soztutar
    Assignee: Enis Soztutar
    Priority: Blocker
    Fix For: 0.17.0

    Attachments: 3048-alt.patch, stringifier_v1.patch, stringifier_v2.patch, stringifier_v3.patch, stringifier_v4.patch, stringifier_v5.patch, stringifier_v6.patch, stringifier_v7.patch


    Storing arbitrary objects in the configuration has been discussed before in HADOOP-449 and HADOOP-1873. Although enabling such functionality has the risk of encouraging people to put big binary objects in the configuration, for some use cases passing objects to tasks is absolutely necessary.
    This issue will track the implementation of a Stringifier interface which stringifies and destringifies objects. Using this implementation, developers can store objects in the configuration and restore them later.
    Any thoughts ?
    --
    This message is automatically generated by JIRA.
    -
    You can reply to this email to add a comment to the issue online.
  • Hudson (JIRA) at Apr 5, 2008 at 12:16 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12585971#action_12585971 ]

    Hudson commented on HADOOP-3048:
    --------------------------------

    Integrated in Hadoop-trunk #451 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/451/])
    Stringifier
    -----------

    Key: HADOOP-3048
    URL: https://issues.apache.org/jira/browse/HADOOP-3048
    Project: Hadoop Core
    Issue Type: New Feature
    Affects Versions: 0.17.0
    Reporter: Enis Soztutar
    Assignee: Enis Soztutar
    Priority: Blocker
    Fix For: 0.17.0

    Attachments: 3048-alt.patch, stringifier_v1.patch, stringifier_v2.patch, stringifier_v3.patch, stringifier_v4.patch, stringifier_v5.patch, stringifier_v6.patch, stringifier_v7.patch


    Storing arbitrary objects in the configuration has been discussed before in HADOOP-449 and HADOOP-1873. Although enabling such functionality has the risk of encouraging people to put big binary objects in the configuration, for some use cases passing objects to tasks is absolutely necessary.
    This issue will track the implementation of a Stringifier interface which stringifies and destringifies objects. Using this implementation, developers can store objects in the configuration and restore them later.
    Any thoughts ?
    --
    This message is automatically generated by JIRA.
    -
    You can reply to this email to add a comment to the issue online.
  • Nigel Daley (JIRA) at May 14, 2008 at 12:38 am
    [ https://issues.apache.org/jira/browse/HADOOP-3048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Nigel Daley updated HADOOP-3048:
    --------------------------------

    Component/s: io
    Stringifier
    -----------

    Key: HADOOP-3048
    URL: https://issues.apache.org/jira/browse/HADOOP-3048
    Project: Hadoop Core
    Issue Type: New Feature
    Components: io
    Affects Versions: 0.17.0
    Reporter: Enis Soztutar
    Assignee: Enis Soztutar
    Priority: Blocker
    Fix For: 0.17.0

    Attachments: 3048-alt.patch, stringifier_v1.patch, stringifier_v2.patch, stringifier_v3.patch, stringifier_v4.patch, stringifier_v5.patch, stringifier_v6.patch, stringifier_v7.patch


    Storing arbitrary objects in the configuration has been discussed before in HADOOP-449 and HADOOP-1873. Although enabling such functionality has the risk of encouraging people to put big binary objects in the configuration, for some use cases passing objects to tasks is absolutely necessary.
    This issue will track the implementation of a Stringifier interface which stringifies and destringifies objects. Using this implementation, developers can store objects in the configuration and restore them later.
    Any thoughts ?
    --
    This message is automatically generated by JIRA.
    -
    You can reply to this email to add a comment to the issue online.

Related Discussions

Discussion Navigation
viewthread | post
Discussion Overview
groupcommon-dev @
categorieshadoop
postedMar 19, '08 at 8:10a
activeMay 14, '08 at 12:38a
posts31
users1
websitehadoop.apache.org...
irc#hadoop

1 user in discussion

Nigel Daley (JIRA): 31 posts

People

Translate

site design / logo © 2022 Grokbase