Grokbase Groups Pig user August 2011
Hello,

I'm trying to generate a tuple from a very wide data set, but I'm running into problems. I'm running Pig 0.9.0 r1148983 in local mode.

Because the data set is so wide, I'd prefer not to explicitly state each field, and instead use either * or the $0..$N syntax when generating the tuple. Additionally, the schema of each field is quite long and arbitrary, as the data has been generated using macros. (This is essentially why I'm putting the data into a tuple: so that further down in my script I can refer to the fields more easily.)

The code below illustrates the issue I'm having referencing fields in the tuple. Is it a bug, or am I missing something?

Thanks in advance,
Grahame


describe a;
a: {f1: int,f2: int,f3: int,f4: int,f5: int,f6: int,f7: int,f8: int,f9: int,f10: int}

aa = FOREACH a GENERATE $0, TOTUPLE($2,$3,$4,$5);
aaa = FOREACH aa GENERATE $0, $1.$0; -- OK
aaa = FOREACH aa GENERATE $0, $1.f3; -- OK
aaa = FOREACH aa GENERATE $0, $1.$1; -- OK
aaa = FOREACH aa GENERATE $0, $1.f4; -- OK

aa = FOREACH a GENERATE $0, TOTUPLE($2..$5); -- should be the same as above?
aaa = FOREACH aa GENERATE $0, $1.$0; -- OK
aaa = FOREACH aa GENERATE $0, $1.f3; -- ERROR
aaa = FOREACH aa GENERATE $0, $1.$1; -- ERROR
aaa = FOREACH aa GENERATE $0, $1.f4; -- ERROR

aa = FOREACH a GENERATE $0, TOTUPLE(*);
aaa = FOREACH aa GENERATE $0, $1.$0; -- OK
aaa = FOREACH aa GENERATE $0, $1.f3; -- ERROR
aaa = FOREACH aa GENERATE $0, $1.$1; -- ERROR
aaa = FOREACH aa GENERATE $0, $1.f4; -- ERROR
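
One way to narrow down a problem like this is to compare the schemas Pig infers for the explicit form and the range form. The sketch below is only an illustration: the input file `wide_data.tsv`, the LOAD statement, and the alias names `aa_explicit` and `aa_range` are assumptions and are not part of the original post. Only the two FOREACH expressions come from it.

```pig
-- Hypothetical input: a tab-separated file with ten integer columns.
-- The path and the LOAD schema are assumptions made for illustration.
a = LOAD 'wide_data.tsv' USING PigStorage('\t')
    AS (f1:int, f2:int, f3:int, f4:int, f5:int,
        f6:int, f7:int, f8:int, f9:int, f10:int);

-- Explicit field list: the form reported as working in the post.
aa_explicit = FOREACH a GENERATE $0, TOTUPLE($2,$3,$4,$5);

-- Project-range form: the form reported as failing in the post.
aa_range = FOREACH a GENERATE $0, TOTUPLE($2..$5);

-- Print both inferred schemas so the nested tuple fields can be compared.
DESCRIBE aa_explicit;
DESCRIBE aa_range;
```

If the two DESCRIBE outputs differ in the nested tuple's field names, that would point at how the schema is derived for the range expression rather than at the later field references.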

  • Thejas Nair at Aug 17, 2011 at 4:40 am
    It is a bug. I have created a jira and attached a patch -
    https://issues.apache.org/jira/browse/PIG-2223
    -Thejas

  • Ggrambo at Aug 17, 2011 at 1:27 pm
    Hi Thejas,

    Thanks for looking into this issue, confirming it's a bug, and patching it. I appreciate it.

    Cheers,
    Grahame

  • Ggrambo at Aug 18, 2011 at 6:02 pm
    Hi Thejas,

    I applied the patch and rebuilt. The initial bug is gone, but there looks to be another:

    describe a;
    a: {f1: int,f2: int,f3: int,f4: int,f5: int,f6: int,f7: int,f8: int,f9: int,f10: int}

    aa = FOREACH a GENERATE $0, TOTUPLE($2,$3,$4,$5);
    aaa = FOREACH aa GENERATE $0, $1.f4 as v; -- OK
    aaaa = FOREACH aaa GENERATE v; -- OK

    aa = FOREACH a GENERATE $0, TOTUPLE($2..$5);
    aaa = FOREACH aa GENERATE $0, $1.f4 as v; -- OK after patch
    aaaa = FOREACH aaa GENERATE v; -- ERROR
    2011-08-18 11:00:36,246 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1128: Cannot find field f4 in :tuple(f3:int,f4:int,f5:int,f6:int)

    Thanks,
    Grahame

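
Going by the results reported in this thread, the explicit TOTUPLE($2,$3,$4,$5) form is the only one that worked at every step, so it may serve as a stopgap while the range form is still affected. A minimal sketch, assuming a hypothetical input file `wide_data.tsv` and LOAD schema (neither appears in the thread):

```pig
-- Hypothetical input: the path and the LOAD schema are assumptions.
a = LOAD 'wide_data.tsv' USING PigStorage('\t')
    AS (f1:int, f2:int, f3:int, f4:int, f5:int,
        f6:int, f7:int, f8:int, f9:int, f10:int);

-- Explicit field list instead of $2..$5; reported as OK in the thread.
aa = FOREACH a GENERATE $0, TOTUPLE($2,$3,$4,$5);

-- Pull one nested field out by name and give it an alias.
aaa = FOREACH aa GENERATE $0, $1.f4 AS v;

-- Refer to the alias downstream; reported as OK for the explicit form.
aaaa = FOREACH aaa GENERATE v;

-- Check that v is resolved with the expected type.
DESCRIBE aaaa;
```

The cost is having to spell out each field, which is what the original question was trying to avoid on a very wide data set.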

Discussion Overview
group: user @
categories: pig, hadoop
posted: Aug 16, '11 at 10:17p
active: Aug 18, '11 at 6:02p
posts: 4
users: 2
website: pig.apache.org