Schemas for bags should contain tuples all the time
---------------------------------------------------

Key: PIG-449
URL: https://issues.apache.org/jira/browse/PIG-449
Project: Pig
Issue Type: Bug
Affects Versions: types_branch
Reporter: Santhosh Srinivasan
Assignee: Santhosh Srinivasan
Fix For: types_branch


The front end treats relations as operators that return bags. When the schema of a load statement is specified, the bag is associated with the schema specified by the user. Ideally, the schema corresponds to the tuple contained in the bag.

With PIG-380, the schema for bag constants is computed by the front end. The schema for the bag contains the tuple, which in turn contains the schema of the columns. As a result, accessing the columns directly, as one would for a relation produced by a load statement, raises an error.

The front end should then treat access to the columns as a double dereference, i.e., access the tuple inside the bag and then the column inside the tuple.

{code}
grunt> a = load '/user/sms/data/student.data' using PigStorage(' ') as (name, age, gpa);
grunt> b = foreach a generate {(16, 4.0e-2, 'hello')} as b:{t:(i: int, d: double, c: chararray)};

grunt> describe b;
b: {b: {t: (i: integer,d: double,c: chararray)}}

grunt> c = foreach b generate b.i;
111064 [main] ERROR org.apache.pig.tools.grunt.GruntParser - java.io.IOException: Invalid alias: i in {t: (i: integer,d: double,c: chararray)}
at org.apache.pig.PigServer.parseQuery(PigServer.java:293)
at org.apache.pig.PigServer.registerQuery(PigServer.java:258)
at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:432)
at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:242)
at org.apache.pig.tools.grunt.GruntParser.parseContOnError(GruntParser.java:93)
at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:58)
at org.apache.pig.Main.main(Main.java:282)
Caused by: org.apache.pig.impl.logicalLayer.parser.ParseException: Invalid alias: i in {t: (i: integer,d: double,c: chararray)}
at org.apache.pig.impl.logicalLayer.parser.QueryParser.AliasFieldOrSpec(QueryParser.java:5851)
at org.apache.pig.impl.logicalLayer.parser.QueryParser.ColOrSpec(QueryParser.java:5709)
at org.apache.pig.impl.logicalLayer.parser.QueryParser.BracketedSimpleProj(QueryParser.java:5242)
at org.apache.pig.impl.logicalLayer.parser.QueryParser.BaseEvalSpec(QueryParser.java:4040)
at org.apache.pig.impl.logicalLayer.parser.QueryParser.UnaryExpr(QueryParser.java:3909)
at org.apache.pig.impl.logicalLayer.parser.QueryParser.CastExpr(QueryParser.java:3863)
at org.apache.pig.impl.logicalLayer.parser.QueryParser.MultiplicativeExpr(QueryParser.java:3772)
at org.apache.pig.impl.logicalLayer.parser.QueryParser.AdditiveExpr(QueryParser.java:3698)
at org.apache.pig.impl.logicalLayer.parser.QueryParser.InfixExpr(QueryParser.java:3664)
at org.apache.pig.impl.logicalLayer.parser.QueryParser.FlattenedGenerateItem(QueryParser.java:3590)
at org.apache.pig.impl.logicalLayer.parser.QueryParser.FlattenedGenerateItemList(QueryParser.java:3500)
at org.apache.pig.impl.logicalLayer.parser.QueryParser.GenerateStatement(QueryParser.java:3457)
at org.apache.pig.impl.logicalLayer.parser.QueryParser.NestedBlock(QueryParser.java:2933)
at org.apache.pig.impl.logicalLayer.parser.QueryParser.ForEachClause(QueryParser.java:2336)
at org.apache.pig.impl.logicalLayer.parser.QueryParser.BaseExpr(QueryParser.java:973)
at org.apache.pig.impl.logicalLayer.parser.QueryParser.Expr(QueryParser.java:748)
at org.apache.pig.impl.logicalLayer.parser.QueryParser.Parse(QueryParser.java:549)
at org.apache.pig.impl.logicalLayer.LogicalPlanBuilder.parse(LogicalPlanBuilder.java:60)
at org.apache.pig.PigServer.parseQuery(PigServer.java:290)
... 6 more

111064 [main] ERROR org.apache.pig.tools.grunt.GruntParser - Invalid alias: i in {t: (i: integer,d: double,c: chararray)}
111064 [main] ERROR org.apache.pig.tools.grunt.GruntParser - java.io.IOException: Invalid alias: i in {t: (i: integer,d: double,c: chararray)}
grunt> c = foreach b generate b.t;
grunt> describe c;
c: {t: {i: integer,d: double,c: chararray}}

{code}
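
The double dereference described above can be sketched outside of Pig. The following is a minimal Python model, not Pig's actual front-end code: the `Field` class and the `find_field` and `resolve_in_bag` helpers are hypothetical names introduced only for illustration. It assumes a bag schema always wraps exactly one tuple schema, and resolves `bag.column` by stepping through that tuple.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Field:
    """Hypothetical stand-in for one schema entry (alias, type, nested schema)."""
    alias: str
    type: str  # e.g. "int", "double", "chararray", "tuple", "bag"
    schema: List["Field"] = field(default_factory=list)


def find_field(fields: List[Field], alias: str) -> Optional[Field]:
    """Return the field with the given alias, or None if it is absent."""
    for f in fields:
        if f.alias == alias:
            return f
    return None


def resolve_in_bag(bag: Field, alias: str) -> Field:
    """Resolve `bag.alias` as bag -> tuple -> column (the double dereference)."""
    if bag.type != "bag":
        raise TypeError(f"{bag.alias} is not a bag")
    # Assumed invariant: the bag schema holds exactly one tuple schema.
    if len(bag.schema) != 1 or bag.schema[0].type != "tuple":
        raise ValueError("bag schema must contain exactly one tuple")
    inner = bag.schema[0]
    column = find_field(inner.schema, alias)
    if column is None:
        raise KeyError(f"Invalid alias: {alias} in tuple {inner.alias}")
    return column


# Schema from the report: b: {t: (i: int, d: double, c: chararray)}
b = Field("b", "bag", [
    Field("t", "tuple", [
        Field("i", "int"),
        Field("d", "double"),
        Field("c", "chararray"),
    ])
])
print(resolve_in_bag(b, "i").type)  # int
```

In this model `b.i` resolves through the tuple `t` to the `int` column, which is the lookup that fails with "Invalid alias" in the session above.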

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

  • Pradeep Kamath (JIRA) at Sep 26, 2008 at 10:48 pm
    [ https://issues.apache.org/jira/browse/PIG-449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12635050#action_12635050 ]

    Pradeep Kamath commented on PIG-449:
    ------------------------------------

    Another script snippet which shows the above inconsistency:
    {code}
    grunt> a = load 'bla' as (b:bag{t:(x,y,z)}, t:tuple(a,b,c), m:[]);
    grunt> describe a;
    a: {b: {t: (x: bytearray,y: bytearray,z: bytearray)},t: (a: bytearray,b: bytearray,c: bytearray),m: map[ ]}
    grunt> b = foreach a generate b.y;
    2008-09-26 15:35:22,673 [main] ERROR org.apache.pig.tools.grunt.GruntParser - java.io.IOException: Invalid alias: y in {t: (x: bytearray,y: bytearray,z: bytearray)}
    at org.apache.pig.PigServer.parseQuery(PigServer.java:293)
    at org.apache.pig.PigServer.registerQuery(PigServer.java:258)
    at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:434)
    at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:242)
    at org.apache.pig.tools.grunt.GruntParser.parseContOnError(GruntParser.java:94)
    at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:58)
    at org.apache.pig.Main.main(Main.java:282)
    Caused by: org.apache.pig.impl.logicalLayer.parser.ParseException: Invalid alias: y in {t: (x: bytearray,y: bytearray,z: bytearray)}
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.AliasFieldOrSpec(QueryParser.java:5851)
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.ColOrSpec(QueryParser.java:5709)
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.BracketedSimpleProj(QueryParser.java:5242)
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.BaseEvalSpec(QueryParser.java:4040)
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.UnaryExpr(QueryParser.java:3909)
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.CastExpr(QueryParser.java:3863)
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.MultiplicativeExpr(QueryParser.java:3772)
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.AdditiveExpr(QueryParser.java:3698)
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.InfixExpr(QueryParser.java:3664)
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.FlattenedGenerateItem(QueryParser.java:3590)
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.FlattenedGenerateItemList(QueryParser.java:3500)
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.GenerateStatement(QueryParser.java:3457)
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.NestedBlock(QueryParser.java:2933)
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.ForEachClause(QueryParser.java:2336)
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.BaseExpr(QueryParser.java:973)
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.Expr(QueryParser.java:748)
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.Parse(QueryParser.java:549)
    at org.apache.pig.impl.logicalLayer.LogicalPlanBuilder.parse(LogicalPlanBuilder.java:60)
    at org.apache.pig.PigServer.parseQuery(PigServer.java:290)
    ... 6 more

    2008-09-26 15:35:22,674 [main] ERROR org.apache.pig.tools.grunt.GruntParser - Invalid alias: y in {t: (x: bytearray,y: bytearray,z: bytearray)}
    2008-09-26 15:35:22,674 [main] ERROR org.apache.pig.tools.grunt.GruntParser - java.io.IOException: Invalid alias: y in {t: (x: bytearray,y: bytearray,z: bytearray)}
    grunt> b = foreach a generate b.t.y;
    grunt> describe b;
    b: {y: {y: bytearray}} --> THIS ALSO LOOKS WRONG

    {code}
  • Santhosh Srinivasan (JIRA) at Nov 11, 2008 at 10:52 pm
    [ https://issues.apache.org/jira/browse/PIG-449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12646712#action_12646712 ]

    Santhosh Srinivasan commented on PIG-449:
    -----------------------------------------

    Currently, bags in Pig are containers of tuples. Accessing elements inside a bag should translate to accessing elements inside the tuple contained in the bag. In addition, accessing tuples inside a bag should be restricted to the FLATTEN keyword in a FOREACH statement. A few examples shown below will demonstrate the point.

    {code}
    a = load '/user/pig/data/student.data' using PigStorage(' ') as (name, age, gpa);
    b = foreach a generate {(16, 4.0e-2, 'hello')} as b:{t:(i: int, d: double, c: chararray)};
    c = foreach b generate b.i; -- Here b.i should generate a bag of integers by accessing the column called 'i' inside each tuple
    d = foreach b generate b.t; -- This should be outlawed: the tuple inside the bag has no column called 't', although the tuple itself is named 't'
    {code}

    Summary:

    1. The frontend should translate access to columns in a bag to columns inside the tuple in the bag
    2. The frontend should prevent access to tuples inside the bag via projections and allow access only via the FLATTEN keyword

    Thoughts/suggestions/comments are welcome.
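
The two rules in the summary can be illustrated at the data level. This is a rough Python sketch under stated assumptions, not Pig code: a bag is modelled as a list of tuples, and `project_bag` with its `tuple_alias` parameter is a hypothetical helper. Projecting a column returns a bag of single-column tuples, while projecting the tuple's own name is rejected.

```python
def project_bag(bag, tuple_schema, column, tuple_alias="t"):
    """Project one column out of every tuple in a bag, returning a bag.

    bag          -- list of tuples (stand-in for a Pig bag)
    tuple_schema -- column names of the tuples held by the bag
    tuple_alias  -- name of the tuple itself, which is not a column
    """
    if column not in tuple_schema:
        hint = ""
        if column == tuple_alias:
            # Rule 2: the tuple is reachable only through FLATTEN, not projection.
            hint = " (names the tuple itself; use FLATTEN instead)"
        raise ValueError(f"Invalid alias: {column}{hint}")
    idx = tuple_schema.index(column)
    # Rule 1: the result is still a bag, one single-column tuple per input tuple.
    return [(row[idx],) for row in bag]


bag = [(16, 4.0e-2, "hello")]
schema = ["i", "d", "c"]
print(project_bag(bag, schema, "i"))  # [(16,)]
```

Calling `project_bag(bag, schema, "t")` raises `ValueError` in this sketch, mirroring the proposal that `b.t` be disallowed.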

Discussion Overview
group: dev @
categories: pig, hadoop
posted: Sep 23, '08 at 2:47a
active: Nov 11, '08 at 10:52p
posts: 3
users: 1
website: pig.apache.org