Grokbase Groups Pig dev August 2010
Improve dynamic invokers to deal with no-arg methods and array parameters
-------------------------------------------------------------------------

Key: PIG-1551
URL: https://issues.apache.org/jira/browse/PIG-1551
Project: Pig
Issue Type: Improvement
Reporter: Dmitriy V. Ryaboy


PIG-1354 introduced a set of UDFs that can be used to dynamically wrap simple Java methods in a UDF, so that users don't need to create trivial wrappers if they are ok sacrificing some speed.

This issue is to extend the set of methods that can be wrapped this way to include methods that do not take any arguments, and methods that take arrays of {int,long,float,double,string} as arguments.
Arrays are expected to be represented by bags in Pig. Notably, this allows users to wrap statistical functions in o.a.commons.math.stat.StatUtils.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


  • Dmitriy V. Ryaboy (JIRA) at Aug 20, 2010 at 1:23 am
    [ https://issues.apache.org/jira/browse/PIG-1551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Dmitriy V. Ryaboy updated PIG-1551:
    -----------------------------------

    Attachment: PIG-1551.patch

    Patch attached.
  • Dmitriy V. Ryaboy (JIRA) at Aug 20, 2010 at 1:23 am
    [ https://issues.apache.org/jira/browse/PIG-1551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Dmitriy V. Ryaboy updated PIG-1551:
    -----------------------------------

    Status: Patch Available (was: Open)
    Affects Version/s: 0.8.0
    Fix Version/s: 0.8.0
  • Dmitriy V. Ryaboy (JIRA) at Aug 20, 2010 at 6:06 am
    [ https://issues.apache.org/jira/browse/PIG-1551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Dmitriy V. Ryaboy reassigned PIG-1551:
    --------------------------------------

    Assignee: Dmitriy V. Ryaboy
  • Richard Ding (JIRA) at Aug 23, 2010 at 11:04 pm
    [ https://issues.apache.org/jira/browse/PIG-1551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12901656#action_12901656 ]

    Richard Ding commented on PIG-1551:
    -----------------------------------


    In Invoker.java, there is a typo:

    {code}
    private static final Class<?> LONG_ARRAY_CLASS = new String[0].getClass();
    {code}

    Also, in the unPrimitivize method, this code seems unnecessary:

    {code}
    } else if (klass.equals(DOUBLE_ARRAY_CLASS)) {
        return DOUBLE_ARRAY_CLASS;
    {code}

    Otherwise the patch looks good.
  • Dmitriy V. Ryaboy (JIRA) at Aug 24, 2010 at 6:45 am
    [ https://issues.apache.org/jira/browse/PIG-1551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Dmitriy V. Ryaboy updated PIG-1551:
    -----------------------------------

    Status: Open (was: Patch Available)
  • Dmitriy V. Ryaboy (JIRA) at Aug 24, 2010 at 6:45 am
    [ https://issues.apache.org/jira/browse/PIG-1551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Dmitriy V. Ryaboy updated PIG-1551:
    -----------------------------------

    Attachment: PIG_1551.2.patch

    Attaching patch that fixes the two errors Richard pointed out.

  • Dmitriy V. Ryaboy (JIRA) at Aug 24, 2010 at 6:45 am
    [ https://issues.apache.org/jira/browse/PIG-1551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Dmitriy V. Ryaboy updated PIG-1551:
    -----------------------------------

    Status: Patch Available (was: Open)
  • Richard Ding (JIRA) at Aug 24, 2010 at 6:07 pm
    [ https://issues.apache.org/jira/browse/PIG-1551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12901992#action_12901992 ]

    Richard Ding commented on PIG-1551:
    -----------------------------------


    The typo is still there:

    {code}
    private static final Class<?> LONG_ARRAY_CLASS = new Long[0].getClass();
    {code}

    It seems what you want is

    {code}
    private static final Class<?> LONG_ARRAY_CLASS = new long[0].getClass();
    {code}

    so it's consistent with other array classes.

    This does raise a question about array parameters: the first form applies to methods like _amethod(Long[] nums)_, while the second supports methods like _amethod(long[] nums)_. The two are not interchangeable.
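The distinction between the two array classes can be seen with plain Java reflection. The following is a minimal sketch, not code from the patch: the class `ArrayClassDemo` and its `sum` method are hypothetical stand-ins for any static method that takes a primitive array.

```java
import java.lang.reflect.Method;

class ArrayClassDemo {
    // Hypothetical target method (an assumption for illustration only).
    public static long sum(long[] nums) {
        long total = 0;
        for (long n : nums) {
            total += n;
        }
        return total;
    }

    // Returns true if a public method named "sum" with exactly this
    // parameter type can be looked up on ArrayClassDemo.
    static boolean resolves(Class<?> paramType) {
        try {
            ArrayClassDemo.class.getMethod("sum", paramType);
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        // The primitive and boxed array classes are distinct Class objects.
        System.out.println(long[].class.equals(Long[].class)); // false

        // Reflection lookup matches parameter types exactly; no boxing is applied.
        System.out.println(resolves(long[].class)); // true
        System.out.println(resolves(Long[].class)); // false

        // Invoking with the matching primitive array works as expected.
        Method m = ArrayClassDemo.class.getMethod("sum", long[].class);
        System.out.println(m.invoke(null, (Object) new long[] {1L, 2L, 3L})); // 6
    }
}
```

Because `getMethod` matches parameter types exactly, a lookup built from `Long[].class` would not find a method declared with `long[]`, which is why the constant has to be derived from `new long[0]`.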
  • Dmitriy V. Ryaboy (JIRA) at Aug 24, 2010 at 7:21 pm
    [ https://issues.apache.org/jira/browse/PIG-1551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Dmitriy V. Ryaboy updated PIG-1551:
    -----------------------------------

    Attachment: PIG_1551.3.patch

    Ugh. Thank you for catching that -- fixed, and added a test to make sure it stays fixed.

    The particular set of methods I needed this for used primitives, so that's what I did. It's a bit tricky to add support for Long, Double, etc. arrays, as I would have to check all combinations of possible method signatures when seeing things like (int[], int[], int[]), and it becomes fairly ugly code. Do you think this is particularly compelling? I can't really think of methods that take arrays of Number classes; usually, if you start using Numbers, you are also using Collections, not plain arrays.
  • Richard Ding (JIRA) at Aug 24, 2010 at 7:56 pm
    [ https://issues.apache.org/jira/browse/PIG-1551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12902042#action_12902042 ]

    Richard Ding commented on PIG-1551:
    -----------------------------------

    +1.

    I'm fine with arrays of primitive types. I can't think of a Java method that uses an array of object Long as a parameter.
  • Dmitriy V. Ryaboy (JIRA) at Aug 24, 2010 at 9:17 pm
    [ https://issues.apache.org/jira/browse/PIG-1551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Dmitriy V. Ryaboy updated PIG-1551:
    -----------------------------------

    Status: Resolved (was: Patch Available)
    Release Note:
    The idea is simple: frequently, Pig users need to use a simple function that is already provided by standard Java libraries, but for which a UDF has not been written. Dynamic Invokers allow a Pig programmer to refer to Java functions without having to wrap them in custom Pig UDFs, at the cost of doing some Java reflection on every function call.

    {code}
    DEFINE UrlDecode InvokeForString('java.net.URLDecoder.decode', 'String String');
    encoded_strings = LOAD 'encoded_strings.txt' as (encoded:chararray);
    decoded_strings = FOREACH encoded_strings GENERATE UrlDecode(encoded, 'UTF-8');
    {code}

    Currently, Dynamic Invokers can be used for any static function that accepts no arguments or some combination of Strings, ints, longs, doubles, floats, or arrays of the same, and returns a String, an int, a long, a double, or a float. Numeric arguments must be primitives; the boxed classes (Long, Double, and so on) are not supported as argument types. Depending on the return type, a specific kind of Invoker must be used: InvokeForString, InvokeForInt, InvokeForLong, InvokeForDouble, or InvokeForFloat.

    The DEFINE keyword is used to bind a keyword to a Java method, as above. The first argument to the InvokeFor* constructor is the fully qualified name of the desired method. The second argument is a space-delimited ordered list of the classes of the method arguments. It can be omitted or given as an empty string if the method takes no arguments. Valid class names are String, Long, Float, Double, and Int. Invokers can also work with array arguments, represented in Pig as DataBags of single-element tuples. Simply refer to string[], for example. Class names are not case-sensitive.
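As a rough illustration of the argument list described above, the sketch below turns a space-delimited, case-insensitive type string into the `Class<?>[]` signature that Java reflection expects. It is not taken from Pig's Invoker.java: the class `TypeSpecParser` and its methods are hypothetical, and the mapping of numeric names to primitive classes is an assumption based on the "primitives only" rule in the release note.

```java
import java.lang.reflect.Method;
import java.util.Locale;

class TypeSpecParser {
    // Maps one case-insensitive type name to a Java class.
    // Numeric names map to primitives, matching the "primitives only" rule.
    static Class<?> toClass(String name) {
        switch (name.toLowerCase(Locale.ROOT)) {
            case "string":   return String.class;
            case "int":      return int.class;
            case "long":     return long.class;
            case "float":    return float.class;
            case "double":   return double.class;
            case "string[]": return String[].class;
            case "int[]":    return int[].class;
            case "long[]":   return long[].class;
            case "float[]":  return float[].class;
            case "double[]": return double[].class;
            default:
                throw new IllegalArgumentException("Unsupported type: " + name);
        }
    }

    // Parses a space-delimited list such as "String String".
    // A null or empty spec means the method takes no arguments.
    static Class<?>[] parse(String spec) {
        if (spec == null || spec.trim().isEmpty()) {
            return new Class<?>[0];
        }
        String[] parts = spec.trim().split("\\s+");
        Class<?>[] signature = new Class<?>[parts.length];
        for (int i = 0; i < parts.length; i++) {
            signature[i] = toClass(parts[i]);
        }
        return signature;
    }

    public static void main(String[] args) throws Exception {
        // Resolve java.net.URLDecoder.decode(String, String) from the spec
        // used in the release note's DEFINE example, then call it statically.
        Method decode = java.net.URLDecoder.class.getMethod("decode", parse("String String"));
        System.out.println(decode.invoke(null, "a%20b", "UTF-8")); // a b
    }
}
```

An empty spec yields a zero-length signature, which is how a no-argument static method would be looked up in this sketch.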

    The ability to use invokers on methods that take array arguments makes methods like those in org.apache.commons.math.stat.StatUtils available for processing the results of grouping your datasets, for example. This is very nice, but a word of caution: the resulting UDF will of course not be optimized for Hadoop, and the very significant benefits one gains from implementing the Algebraic and Accumulator interfaces are lost here. Be careful with this one.
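The array support described in the release note can be pictured as two steps: flatten a bag of single-element tuples into a primitive array, then call the static method reflectively. The sketch below uses only the Java standard library and makes several assumptions: a `List<List<Object>>` stands in for a Pig DataBag, and the local `mean` method stands in for a library method such as StatUtils.mean(double[]). None of the names come from the actual patch.

```java
import java.lang.reflect.Method;
import java.util.Arrays;
import java.util.List;

class BagInvokeSketch {
    // Stand-in for a library method taking a primitive array (hypothetical).
    public static double mean(double[] values) {
        double sum = 0;
        for (double v : values) {
            sum += v;
        }
        return values.length == 0 ? Double.NaN : sum / values.length;
    }

    // Flattens a "bag" of single-field "tuples" into a double[].
    // Each inner list is assumed to hold exactly one numeric value.
    static double[] toDoubleArray(List<List<Object>> bag) {
        double[] out = new double[bag.size()];
        int i = 0;
        for (List<Object> tuple : bag) {
            out[i++] = ((Number) tuple.get(0)).doubleValue();
        }
        return out;
    }

    // Looks up a static method taking double[] and invokes it.
    // The cast to Object passes the array as a single argument.
    static Object invokeStatic(Class<?> owner, String name, double[] arg) throws Exception {
        Method m = owner.getMethod(name, double[].class);
        return m.invoke(null, (Object) arg);
    }

    public static void main(String[] args) throws Exception {
        List<List<Object>> bag = Arrays.asList(
                Arrays.<Object>asList(1.0),
                Arrays.<Object>asList(2.0),
                Arrays.<Object>asList(6.0));
        System.out.println(invokeStatic(BagInvokeSketch.class, "mean", toDoubleArray(bag))); // 3.0
    }
}
```

Note that the whole bag is materialized into one array before the call, which is consistent with the caution above: a method invoked this way sees all values at once and cannot take advantage of partial aggregation.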
    Resolution: Fixed

    Committed.
