order by on single field with user defined comparator fails
-----------------------------------------------------------

Key: PIG-402
URL: https://issues.apache.org/jira/browse/PIG-402
Project: Pig
Issue Type: Bug
Affects Versions: types_branch
Reporter: Olga Natkovich
Fix For: types_branch


register udf.jar;
a = load 'data';
c = order a by $0 using MyOrderUDF();
store c into 'out';

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


  • Olga Natkovich (JIRA) at Sep 4, 2008 at 12:54 am
    [ https://issues.apache.org/jira/browse/PIG-402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Olga Natkovich reassigned PIG-402:
    ----------------------------------

    Assignee: Shravan Matthur Narayanamurthy
  • Shravan Matthur Narayanamurthy (JIRA) at Sep 9, 2008 at 7:04 pm
    [ https://issues.apache.org/jira/browse/PIG-402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12629574#action_12629574 ]

    Shravan Matthur Narayanamurthy commented on PIG-402:
    ----------------------------------------------------

    This is what I have understood: comparators in Hadoop assume that the key type (keyClass) is known beforehand and do not let you configure it dynamically. So I feel the only way out for us is to wrap the key inside a tuple whenever a user-defined comparison func is used.

    If any of you have other suggestions, please comment.
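
As a rough illustration of the idea, the sketch below uses plain Java collections rather than the Hadoop or Pig APIs. The class `WrappedKeySort`, the `wrap` helper, and the descending ordering are all hypothetical, and a `List<Object>` stands in for a Pig tuple. The point is only that a comparator written against one fixed key class can still order bare keys once each key is wrapped in a one-field tuple.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

public class WrappedKeySort {
    // Stand-in for a user-defined comparison function: it is written once,
    // against a single key class (List<Object> plays the role of a tuple).
    static final Comparator<List<Object>> USER_COMPARATOR = new Comparator<List<Object>>() {
        @Override
        public int compare(List<Object> a, List<Object> b) {
            // Hypothetical ordering: descending on the first field.
            Integer x = (Integer) a.get(0);
            Integer y = (Integer) b.get(0);
            return y.compareTo(x);
        }
    };

    // Wrap a bare key in a one-field "tuple" so it matches the comparator's key class.
    static List<Object> wrap(Object key) {
        List<Object> tuple = new ArrayList<Object>();
        tuple.add(key);
        return tuple;
    }

    // Wrap every key, sort with the tuple comparator, then unwrap the keys again.
    static List<Integer> sortWithUserComparator(List<Integer> keys) {
        List<List<Object>> wrapped = new ArrayList<List<Object>>();
        for (Integer k : keys) {
            wrapped.add(wrap(k));
        }
        Collections.sort(wrapped, USER_COMPARATOR);
        List<Integer> out = new ArrayList<Integer>();
        for (List<Object> tuple : wrapped) {
            out.add((Integer) tuple.get(0));
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(sortWithUserComparator(Arrays.asList(3, 1, 2))); // prints [3, 2, 1]
    }
}
```

The extra allocation per key in `wrap` is the kind of overhead that wrapping implies.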
  • Alan Gates (JIRA) at Sep 9, 2008 at 7:18 pm
    [ https://issues.apache.org/jira/browse/PIG-402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12629578#action_12629578 ]

    Alan Gates commented on PIG-402:
    --------------------------------

    I think that's perfectly reasonable. There is a performance penalty for wrapping in a tuple. But user defined comparators are expected to be the exception, especially now that we provide numeric and descending order sort (the only two reasons we added the user defined comparators to begin with).
  • Shravan Matthur Narayanamurthy (JIRA) at Sep 10, 2008 at 5:32 pm
    [ https://issues.apache.org/jira/browse/PIG-402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Shravan Matthur Narayanamurthy updated PIG-402:
    -----------------------------------------------

    Attachment: 402.patch
  • Shravan Matthur Narayanamurthy (JIRA) at Sep 10, 2008 at 5:32 pm
    [ https://issues.apache.org/jira/browse/PIG-402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Shravan Matthur Narayanamurthy updated PIG-402:
    -----------------------------------------------

    Status: Patch Available (was: Open)

    The solution I am providing is to wrap the key inside a tuple whenever a user defined comparison func is used. For that I have the following in the patch.

    1) Created a new Mapper class MapWithComparator in PigMapReduce, which will be used whenever a user-defined comparator is used. The assumption is that keyType and keyClass will be appropriately set to Tuple, and the collect here wraps the key in a Tuple. This was done to avoid an if branch in the earlier Mapper class.

    2) JobControlCompiler: To meet the assumptions in 1 above, the changes to the job control compiler ensure consistency.

    3) PigMapBase: Incidental. Introduced a tuple factory instance into the base class.

    4) TestEvalPipeline: Added a new unit test to test Sort with UDF.
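
The shape of item 1 can be sketched in plain Java, without the real Pig or Hadoop classes. This is not the contents of 402.patch: `MapCollectSketch`, `Output`, `PlainMap`, `MapWithComparatorSketch`, and `chooseMapper` are hypothetical names, and a one-element `List` stands in for a Tuple. It only shows how choosing a wrapping mapper once at job setup can avoid a per-record if branch in `collect`.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class MapCollectSketch {
    // Records the (key, value) pairs a mapper would emit.
    static class Output {
        final List<Object> keys = new ArrayList<Object>();
        final List<Object> values = new ArrayList<Object>();

        void emit(Object key, Object value) {
            keys.add(key);
            values.add(value);
        }
    }

    // Default path: the key is emitted as is.
    static class PlainMap {
        void collect(Output out, Object key, Object value) {
            out.emit(key, value);
        }
    }

    // Path used when a user-defined comparator is present: the key is
    // always wrapped in a one-field tuple (a singleton List here).
    static class MapWithComparatorSketch extends PlainMap {
        @Override
        void collect(Output out, Object key, Object value) {
            out.emit(Collections.singletonList(key), value);
        }
    }

    // Job-setup stand-in: pick the mapper once, so collect needs no branch.
    static PlainMap chooseMapper(boolean hasUserComparator) {
        return hasUserComparator ? new MapWithComparatorSketch() : new PlainMap();
    }

    public static void main(String[] args) {
        Output out = new Output();
        chooseMapper(true).collect(out, "k1", "row1");
        System.out.println(out.keys.get(0)); // prints [k1]
    }
}
```

In this sketch, the job setup would also have to declare the wrapped type as the key class whenever it selects the wrapping mapper, which is the consistency item 2 refers to.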
  • Olga Natkovich (JIRA) at Sep 10, 2008 at 6:58 pm
    [ https://issues.apache.org/jira/browse/PIG-402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Olga Natkovich updated PIG-402:
    -------------------------------

    Resolution: Fixed
    Status: Resolved (was: Patch Available)

    Patch committed. Shravan, thanks!
