cross is broken
---------------

Key: PIG-422
URL: https://issues.apache.org/jira/browse/PIG-422
Project: Pig
Issue Type: Bug
Affects Versions: types_branch
Reporter: Olga Natkovich
Assignee: Shravan Matthur Narayanamurthy
Fix For: types_branch


The following script fails:

a = load 'data1' as (name, age, gpa);
b = load 'data2' as (name, age, registration, contributions);
c = filter a by age < 19 and gpa < 1.0;
d = filter b by age < 19;
e = cross c, d;
store e into 'output';

produces the following stack:

0808261638_3210_r_000000 java.lang.ClassCastException: org.apache.pig.data.DefaultDataBag cannot be cast to org.apache.pig.data.Tuple
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:264)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:220)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:231)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:220)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.PODistinct.getNext(PODistinct.java:76)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:270)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.processInputBag(POProject.java:351)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:158)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.processInput(POUserFunc.java:123)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:175)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:241)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:217)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:156)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.runPipeline(PigMapReduce.java:206)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.reduce(PigMapReduce.java:176)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.reduce(PigMapReduce.java:87)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:391)
at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2124)
/Cross
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:158)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.processInput(POUserFunc.java:123)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:175)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:241)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:217)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:156)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.runPipeline(PigMapReduce.java:206)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.reduce(PigMapReduce.java:176)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.reduce(PigMapReduce.java:87)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:391)
at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2124)

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

  • Shravan Matthur Narayanamurthy (JIRA) at Sep 10, 2008 at 6:16 pm
    [ https://issues.apache.org/jira/browse/PIG-422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Shravan Matthur Narayanamurthy updated PIG-422:
    -----------------------------------------------

    Status: Patch Available (was: Open)

    This one broke because of the fix that made POUserFunc adhere to trunk behavior: we removed the Tuple-inside-a-Tuple check. The initial fix used a single constant expression that was a Tuple and relied on POUserFunc to remove the nesting before sending it to GFCross.

    So I split the list of objects inside the constant tuple into two constant expressions. However, that did not work because of our unordered plan structure: the two constants were accessed in random order, and GFCross does not work if we pass (1,2) instead of (2,1).

    I think we need to be careful about this one. If a UDF is given constant expressions, as in UDF('2','1'), we create constant expressions and attach them to the UDF as inputs. However, I am not sure there is any guarantee that the two constant expressions will be pulled in the same order, since our plan doesn't support ordering.

    I was able to fix this one because, luckily, the POUserFunc operator relies on its own inputs and not on the ones obtained by calling getPredecessors() on the plan. I think most of the operators that were created earlier did that, since we did not have a handle to the plan the operator is a part of. So, after connecting all the operators in the plan, I explicitly initialized the inputs of POUserFunc to the list of constant expressions, created in the right order. I think we need to take a look at the code and see whether we can hit such problems elsewhere.
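
The ordering concern described above can be illustrated with a small standalone sketch. It is not Pig code: the class `InputOrderSketch`, the nested `FuncOp`, and the methods `connect`, `setInputs`, and `build` are hypothetical stand-ins, with a `HashSet` playing the role of the unordered plan and a `List` playing the role of the operator's own input list. It only shows why an explicitly ordered input list is safe where an unordered predecessor lookup is not.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class InputOrderSketch {

    /** Hypothetical stand-in for an operator such as POUserFunc. */
    public static final class FuncOp {
        // Unordered view, analogous to asking the plan for predecessors.
        private final Set<String> predecessors = new HashSet<>();
        // Ordered view, analogous to the operator's own list of inputs.
        private List<String> inputs = new ArrayList<>();

        public void connect(String constant) {
            predecessors.add(constant);
        }

        public void setInputs(List<String> ordered) {
            inputs = new ArrayList<>(ordered);
        }

        public Set<String> getPredecessors() {
            return predecessors;
        }

        public List<String> getInputs() {
            return inputs;
        }
    }

    /** Connects the constants first, then sets the inputs explicitly. */
    public static FuncOp build() {
        FuncOp op = new FuncOp();
        // Argument order as in the UDF('2','1') example.
        List<String> constants = Arrays.asList("2", "1");
        for (String c : constants) {
            op.connect(c);
        }
        // The explicit list keeps the order in which it was created.
        op.setInputs(constants);
        return op;
    }

    public static void main(String[] args) {
        FuncOp op = build();
        // A List preserves insertion order, so "2" always precedes "1".
        System.out.println("inputs: " + op.getInputs());
        // A HashSet makes no promise about iteration order.
        System.out.println("predecessors: " + op.getPredecessors());
    }
}
```

Code that reads `getInputs()` sees the arguments in the order they were written, while code that iterates `getPredecessors()` must not assume any order.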
  • Shravan Matthur Narayanamurthy (JIRA) at Sep 10, 2008 at 6:16 pm
    [ https://issues.apache.org/jira/browse/PIG-422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Shravan Matthur Narayanamurthy updated PIG-422:
    -----------------------------------------------

    Attachment: 422.patch

    Included in this patch are the changes to visit(LOCross) in LogToPhyTranslationVisitor and a very simple new unit test for cross.
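
For readers unfamiliar with the operator, the property a simple cross test would typically check can be sketched as follows. This is not the test from 422.patch: the class `CrossSketch`, its methods, and the sample rows are hypothetical, and the sketch models tuples as plain Java lists rather than using Pig classes. It assumes only that cross yields the Cartesian product, pairing every tuple of the first input with every tuple of the second.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CrossSketch {

    /** Builds a tuple, modeled here as a plain list of fields. */
    public static List<Object> tuple(Object... fields) {
        return Arrays.asList(fields);
    }

    /** Cartesian product: each left tuple concatenated with each right tuple. */
    public static List<List<Object>> cross(List<List<Object>> left,
                                           List<List<Object>> right) {
        List<List<Object>> out = new ArrayList<>();
        for (List<Object> l : left) {
            for (List<Object> r : right) {
                List<Object> joined = new ArrayList<>(l);
                joined.addAll(r);
                out.add(joined);
            }
        }
        return out;
    }

    /** Made-up rows shaped like (name, age, gpa) and (name, age, registration, contributions). */
    public static List<List<Object>> sample() {
        List<List<Object>> c = Arrays.asList(
                tuple("alice", 18, 0.9),
                tuple("bob", 17, 0.5));
        List<List<Object>> d = Arrays.asList(
                tuple("carol", 18, "yes", 10.0),
                tuple("dave", 16, "no", 2.5));
        return cross(c, d);
    }

    public static void main(String[] args) {
        for (List<Object> t : sample()) {
            System.out.println(t);
        }
    }
}
```

With two rows on each side, the result has four tuples of seven fields each, which is the kind of size and shape check a minimal unit test might make.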
  • Olga Natkovich (JIRA) at Sep 12, 2008 at 6:20 pm
    [ https://issues.apache.org/jira/browse/PIG-422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Olga Natkovich updated PIG-422:
    -------------------------------

    Resolution: Fixed
    Status: Resolved (was: Patch Available)

    Patch committed. Thanks, Shravan!

Posted: Sep 8, '08 at 10:52p
Last active: Sep 12, '08 at 6:20p