Grokbase Groups Pig dev August 2008
Filter does not allow udf as the filter operator and only allows ComparisonOperators
------------------------------------------------------------------------------------

Key: PIG-369
URL: https://issues.apache.org/jira/browse/PIG-369
Project: Pig
Issue Type: Bug
Affects Versions: types_branch
Reporter: Pradeep Kamath
Fix For: types_branch


The following Pig script does not work:
{code}
register util.jar;
define MyFilterSet util.FilterUdf('filter.txt');
A = load 'simpletest' using PigStorage() as ( x, y );
B = filter A by MyFilterSet(x);
dump B;
{code}

The following error is seen:
{noformat}

java -cp pig.jar:$localc org.apache.pig.Main filter.pig
2008-08-07 17:59:37,663 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: localhost:9000
2008-08-07 17:59:37,748 [main] WARN org.apache.hadoop.fs.FileSystem - "localhost:9000" is a deprecated filesystem name. Use "hdfs://localhost:9000/" instead.
2008-08-07 17:59:38,035 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to map-reduce job tracker at: localhost:9001
2008-08-07 17:59:38,166 [main] WARN org.apache.hadoop.fs.FileSystem - "localhost:9000" is a deprecated filesystem name. Use "hdfs://localhost:9000/" instead.
java.io.IOException: Unable to open iterator for alias: B [org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc]
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POFilter.setPlan(POFilter.java:179)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.LogToPhyTranslationVisitor.visit(LogToPhyTranslationVisitor.java:592)
at org.apache.pig.impl.logicalLayer.LOFilter.visit(LOFilter.java:102)
at org.apache.pig.impl.logicalLayer.LOFilter.visit(LOFilter.java:31)
at org.apache.pig.impl.plan.DependencyOrderWalker.walk(DependencyOrderWalker.java:68)
at org.apache.pig.impl.plan.PlanVisitor.visit(PlanVisitor.java:51)
at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.compile(HExecutionEngine.java:245)
at org.apache.pig.PigServer.compilePp(PigServer.java:590)
at org.apache.pig.PigServer.execute(PigServer.java:516)
at org.apache.pig.PigServer.openIterator(PigServer.java:307)
at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:258)
at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:175)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:82)
at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:64)
at org.apache.pig.Main.main(Main.java:302)
Caused by: java.lang.ClassCastException: org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc
... 15 more

{noformat}

I looked further, and the issue seems to be in POFilter, which treats the leaf operator of the filter plan only as a ComparisonOperator and therefore doesn't allow a UDF for filtering:
{code}
public void setPlan(PhysicalPlan plan) {
    this.plan = plan;
    comOp = (ComparisonOperator) (plan.getLeaves()).get(0);
    compOperandType = comOp.getOperandType();
}
{code}
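
The failure mode described above can be reproduced in isolation. The following is a minimal sketch only: the three operator classes are hypothetical stand-ins that share names with Pig's classes for readability, and they are not the real Pig implementations. It shows that an unconditional downcast of the first leaf succeeds for a comparison operator and throws ClassCastException for a UDF operator.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical stand-ins for Pig's physical operators; not the real classes.
abstract class PhysicalOperator {}
class ComparisonOperator extends PhysicalOperator {}
class POUserFunc extends PhysicalOperator {}

class CastDemo {
    // Mirrors the unconditional downcast of the first leaf in setPlan.
    static ComparisonOperator leafAsComparison(List<PhysicalOperator> leaves) {
        return (ComparisonOperator) leaves.get(0);
    }

    // Returns true if the downcast throws ClassCastException for these leaves.
    static boolean castFails(List<PhysicalOperator> leaves) {
        try {
            leafAsComparison(leaves);
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<PhysicalOperator> comparison =
            Collections.<PhysicalOperator>singletonList(new ComparisonOperator());
        List<PhysicalOperator> udf =
            Collections.<PhysicalOperator>singletonList(new POUserFunc());
        System.out.println(castFails(comparison)); // prints false
        System.out.println(castFails(udf));        // prints true
    }
}
```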

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


  • Olga Natkovich (JIRA) at Aug 11, 2008 at 4:29 pm
    [ https://issues.apache.org/jira/browse/PIG-369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Olga Natkovich reassigned PIG-369:
    ----------------------------------

    Assignee: Shravan Matthur Narayanamurthy
  • Shravan Matthur Narayanamurthy (JIRA) at Aug 13, 2008 at 12:47 pm
    [ https://issues.apache.org/jira/browse/PIG-369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Shravan Matthur Narayanamurthy updated PIG-369:
    -----------------------------------------------

    Status: Patch Available (was: Open)

    Implemented the visit(LOCross) method in LogToPhyTranslationVisitor. This mimics what we were doing in Pig-1.0. To summarize, the following script with Cross will be converted as shown below:

    {noformat}
    A1 = load 'f1';
    A2 = load 'f2';
    .
    .
    .
    An = load 'fn';
    B = cross A1,A2,...,An;
    {noformat}

    {noformat}
    A1 = load 'f1';
    .
    .
    .
    An = load 'fn';
    B1 = foreach A1 generate flatten(GFCross('n','0')), flatten(*);
    B2 = foreach A2 generate flatten(GFCross('n','1')), flatten(*);
    .
    .
    .
    Bn = foreach An generate flatten(GFCross('n','n-1')), flatten(*);
    C = splgroup B1 by ($0,$1,..,$n-1) inner, B2 by ($0,$1,..,$n-1) inner, ..., Bn by ($0,$1,..,$n-1) inner;
    D = foreach C generate flatten($1), flatten($2), ..., flatten($n);
    {noformat}

    GFCross outputs a bag of n-tuples; the foreach flattens the bag and attaches those tuples to the original tuples, thus replicating each tuple.

    The only difference from a normal Pig script is the splgroup, where the local-rearrange has a slight modification. When it is processing a cross, it removes the first n values from each value tuple, which were attached to it by the foreach, and passes the correct tuple as the value while retaining the first n values as the key.

    For example, the foreach might produce (2,1,R,4), where (R,4) is the actual tuple and (2,1) is one of the tuples in the GFCross output. The local-rearrange here arranges such tuples into keys and values by making (2,1) the key and (R,4) the value.

    So the patch has two changes: one to the translator and the other to the local-rearrange.
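
The key/value split described for the local-rearrange can be sketched as follows. This is an illustration under stated assumptions, not the code from the patch: the class name `CrossRearrangeSketch` and the method `splitKeyValue` are hypothetical, and tuples are modelled as plain Java lists rather than Pig tuples.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration; tuples are modelled as plain lists.
class CrossRearrangeSketch {
    // Splits a flattened tuple into [key, value]: the first n fields
    // (attached by the foreach) become the key, the rest is the original tuple.
    static List<List<Object>> splitKeyValue(List<Object> tuple, int n) {
        List<Object> key = new ArrayList<>(tuple.subList(0, n));
        List<Object> value = new ArrayList<>(tuple.subList(n, tuple.size()));
        return Arrays.asList(key, value);
    }

    public static void main(String[] args) {
        // The example from the text: (2,1,R,4) with n = 2.
        List<Object> flattened = Arrays.<Object>asList(2, 1, "R", 4);
        System.out.println(splitKeyValue(flattened, 2)); // prints [[2, 1], [R, 4]]
    }
}
```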
  • Shravan Matthur Narayanamurthy (JIRA) at Aug 13, 2008 at 12:49 pm
    [ https://issues.apache.org/jira/browse/PIG-369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Shravan Matthur Narayanamurthy updated PIG-369:
    -----------------------------------------------

    Status: Open (was: Patch Available)
  • Shravan Matthur Narayanamurthy (JIRA) at Aug 13, 2008 at 12:49 pm
    [ https://issues.apache.org/jira/browse/PIG-369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Shravan Matthur Narayanamurthy updated PIG-369:
    -----------------------------------------------

    Comment: was deleted
  • Shravan Matthur Narayanamurthy (JIRA) at Aug 13, 2008 at 12:55 pm
    [ https://issues.apache.org/jira/browse/PIG-369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Shravan Matthur Narayanamurthy updated PIG-369:
    -----------------------------------------------

    Attachment: 369.patch
  • Shravan Matthur Narayanamurthy (JIRA) at Aug 13, 2008 at 12:55 pm
    [ https://issues.apache.org/jira/browse/PIG-369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Shravan Matthur Narayanamurthy updated PIG-369:
    -----------------------------------------------

    Status: Patch Available (was: Open)

    Changes to POFilter: Changed the type of the leaf operator from ComparisonOperator to PhysicalOperator.
    Changes to TypeCheckingVisitor: It was failing in checkInnerPlan because LOUserFunc was not one of the supported roots. Added it, since it can occur in the inner plan of a filter, and used the visit(LOUserFunc) method to do the necessary type checking and schema propagation.
    Changes to Translator: Minor. The addition was causing an NPE. Fixed it by adding a null check.
    Added a new unit test for FilterUDF.
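
The idea behind the POFilter change, holding the leaf under a more general type so that a UDF can serve as the filter condition, can be sketched roughly as below. This is not the contents of 369.patch: every type here (`BooleanOperator`, `EqualToOperator`, `SetMembershipFunc`, `FilterSketch`) is a hypothetical stand-in, and values are plain strings rather than Pig tuples.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for the general operator type: anything that can
// produce a boolean result for an input value.
interface BooleanOperator {
    boolean evaluate(String value);
}

// Stand-in for a comparison leaf such as "x == 'b'".
class EqualToOperator implements BooleanOperator {
    private final String expected;
    EqualToOperator(String expected) { this.expected = expected; }
    public boolean evaluate(String value) { return expected.equals(value); }
}

// Stand-in for a filter UDF leaf that keeps values found in a fixed set.
class SetMembershipFunc implements BooleanOperator {
    private final Set<String> allowed;
    SetMembershipFunc(String... allowed) {
        this.allowed = new HashSet<>(Arrays.asList(allowed));
    }
    public boolean evaluate(String value) { return allowed.contains(value); }
}

class FilterSketch {
    private BooleanOperator leaf;

    // The leaf is stored under the general type, so no downcast is needed.
    void setPlan(List<BooleanOperator> leaves) {
        this.leaf = leaves.get(0);
    }

    // Keeps only the values for which the leaf evaluates to true.
    List<String> apply(List<String> input) {
        List<String> kept = new ArrayList<>();
        for (String value : input) {
            if (leaf.evaluate(value)) {
                kept.add(value);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        FilterSketch filter = new FilterSketch();
        filter.setPlan(Arrays.<BooleanOperator>asList(new SetMembershipFunc("a", "c")));
        System.out.println(filter.apply(Arrays.asList("a", "b", "c"))); // prints [a, c]
    }
}
```

Because the filter depends only on the general type, the same `setPlan` accepts either a comparison leaf or a UDF leaf.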
  • Shravan Matthur Narayanamurthy (JIRA) at Aug 14, 2008 at 1:06 pm
    [ https://issues.apache.org/jira/browse/PIG-369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Shravan Matthur Narayanamurthy updated PIG-369:
    -----------------------------------------------

    Status: Open (was: Patch Available)

    I am including this patch in the patch for Pig-375.
  • Olga Natkovich (JIRA) at Aug 14, 2008 at 9:32 pm
    [ https://issues.apache.org/jira/browse/PIG-369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Olga Natkovich resolved PIG-369.
    --------------------------------

    Resolution: Fixed

    Patch committed; thanks, Shravan!
    Filter does not allow udf as the filter operator and only allows ComparisonOperators
    ------------------------------------------------------------------------------------

    Key: PIG-369
    URL: https://issues.apache.org/jira/browse/PIG-369
    Project: Pig
    Issue Type: Bug
    Affects Versions: types_branch
    Reporter: Pradeep Kamath
    Assignee: Shravan Matthur Narayanamurthy
    Fix For: types_branch

    Attachments: 369.patch



Discussion Overview
group: dev @
categories: pig, hadoop
posted: Aug 8, '08 at 1:09a
active: Aug 14, '08 at 9:32p
posts: 9
users: 1
website: pig.apache.org
