Grokbase Groups Pig dev December 2008
Reducer plan generation fails when UDF contains integers as parameters
----------------------------------------------------------------------

Key: PIG-568
URL: https://issues.apache.org/jira/browse/PIG-568
Project: Pig
Issue Type: Bug
Components: impl
Affects Versions: types_branch
Reporter: Viraj Bhat
Priority: Critical
Fix For: types_branch


Consider the following Pig script, which calls a UDF named MYUDF. MYUDF is a dummy UDF that takes a bag E and a set of integer offsets.

{code}
register myudf.jar;

A = load 'visits.txt' using PigStorage() as ( name:chararray, url:chararray, timestamp:chararray);

B = filter A by (
    (name is not null) AND
    (timestamp is not null)
);

C = group B by (
    url
);

D = foreach C {
    E = order B by timestamp;
    generate E;
}

G = foreach D generate
    param.MYUDF(E, -1, 0, 1);
    --this works
    --param.MYUDF(E,'-1','0','1');

explain G;
dump G;
{code}

If you execute the above script, it fails during the reduce phase, at the point where POUserFunc(MYUDF)[bag] is being called. The MYUDF code is in fact never invoked; instead, the parameters passed to MYUDF somehow cause the exception in the reduce plan. If you replace the integer offsets -1, 0, 1 with the strings '-1', '0', '1', the UDF seems to get called and the script works fine.
=============================================================================================================================
java.io.IOException: Received Error while processing the reduce plan.
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.runPipeline(PigMapReduce.java:307)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.processOnePackageOutput(PigMapReduce.java:247)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.reduce(PigMapReduce.java:224)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.reduce(PigMapReduce.java:136)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:318)
at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2209)
=============================================================================================================================
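
The string workaround described above implies that the UDF has to turn '-1', '0', '1' back into integers itself. The following is a minimal sketch of that conversion, not code taken from the attached MYUDF.java. The class name OffsetArgs and the method toOffset are hypothetical names introduced only for illustration, and the sketch deliberately avoids Pig classes so that it runs on its own.

```java
// Hypothetical helper for illustration; it is not part of the attached MYUDF.java.
final class OffsetArgs {

    /**
     * Converts one UDF argument to an int offset.
     * Accepts a numeric value (the form used in the failing script) or a
     * numeric string (the form used in the workaround).
     */
    static int toOffset(Object arg) {
        if (arg instanceof Number) {
            return ((Number) arg).intValue();
        }
        if (arg instanceof String) {
            return Integer.parseInt(((String) arg).trim());
        }
        throw new IllegalArgumentException("Unsupported offset argument: " + arg);
    }

    public static void main(String[] args) {
        // Both argument styles from the script yield the same offsets.
        System.out.println(toOffset("-1"));
        System.out.println(toOffset(1));
    }
}
```

A UDF written this way would accept either call form, so the script would not need further changes once the underlying bug is fixed.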

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

  • Viraj Bhat (JIRA) at Dec 17, 2008 at 2:25 am
    [ https://issues.apache.org/jira/browse/PIG-568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Viraj Bhat updated PIG-568:
    ---------------------------

    Attachment: visits.txt

    Test data
  • Viraj Bhat (JIRA) at Dec 17, 2008 at 2:25 am
    [ https://issues.apache.org/jira/browse/PIG-568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Viraj Bhat updated PIG-568:
    ---------------------------

    Attachment: myudfint.pig

    Pig Script causing the exception
  • Viraj Bhat (JIRA) at Dec 17, 2008 at 2:27 am
    [ https://issues.apache.org/jira/browse/PIG-568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Viraj Bhat updated PIG-568:
    ---------------------------

    Attachment: MYUDF.java

    Dummy UDF MYUDF.java used in the Pig script
  • Pradeep Kamath (JIRA) at Dec 18, 2008 at 6:28 pm
    [ https://issues.apache.org/jira/browse/PIG-568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Pradeep Kamath resolved PIG-568.
    --------------------------------

    Resolution: Duplicate

    Marking as a duplicate of https://issues.apache.org/jira/browse/PIG-522, as explained in the previous comment.
  • Pradeep Kamath (JIRA) at Dec 18, 2008 at 6:28 pm
    [ https://issues.apache.org/jira/browse/PIG-568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Pradeep Kamath updated PIG-568:
    -------------------------------

    Assignee: Pradeep Kamath

    Though the script has a UDF and the error seems to arise from the use of the UDF, the root cause is a bug in PONegative, which is used to represent the -1 argument: -1 is modelled as Constant(1) passed as the input expression to PONegative. This is a duplicate of issue https://issues.apache.org/jira/browse/PIG-522
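
The modelling described in this comment can be pictured with a toy expression tree. This is only a sketch of the idea that -1 is a negation node wrapping the constant 1; the classes NegationSketch, Constant, and Negative below are hypothetical stand-ins and are not Pig's actual physical operators or their APIs.

```java
import java.util.Arrays;
import java.util.List;

// Toy expression tree; all names are hypothetical and are not Pig classes.
final class NegationSketch {

    interface Expr {
        int eval();
    }

    /** A literal integer, standing in for the Constant(1) node. */
    static final class Constant implements Expr {
        private final int value;
        Constant(int value) { this.value = value; }
        public int eval() { return value; }
    }

    /** Unary minus over a child expression, standing in for the role of PONegative. */
    static final class Negative implements Expr {
        private final Expr child;
        Negative(Expr child) { this.child = child; }
        public int eval() { return -child.eval(); }
    }

    /** Evaluates every argument expression, as must happen before a UDF receives its inputs. */
    static int[] evalArgs(List<Expr> args) {
        int[] out = new int[args.size()];
        for (int i = 0; i < out.length; i++) {
            out[i] = args.get(i).eval();
        }
        return out;
    }

    public static void main(String[] unused) {
        // The offsets -1, 0, 1 from the script: only -1 needs the negation node.
        List<Expr> args = Arrays.<Expr>asList(
                new Negative(new Constant(1)), new Constant(0), new Constant(1));
        System.out.println(Arrays.toString(evalArgs(args)));
    }
}
```

In this picture the arguments are evaluated before the UDF body runs, so a failure inside the negation node would surface at the UDF call site without the UDF code ever executing, which matches the behaviour reported in the issue. The string arguments '-1', '0', '1' are plain constants with no negation node, which would explain why that form works.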
