Grokbase Groups Pig dev October 2010
Pig get confused if map value is not bytearray
----------------------------------------------

Key: PIG-1703
URL: https://issues.apache.org/jira/browse/PIG-1703
Project: Pig
Issue Type: Bug
Affects Versions: 0.8.0
Reporter: Daniel Dai
Assignee: Daniel Dai
Fix For: 0.9.0


This is the same kind of problem as [PIG-999|https://issues.apache.org/jira/browse/PIG-999]; this issue just adds another test case:

{code}
a = load ':INPATH:/singlefile/studenttab10k' as (name: chararray, age: int, gpa: float);
sds = load ':INPATH:/somefile' using SomeLoader() as (s:map[], m:map[], l:map[]);
views = FOREACH sds GENERATE s#'srcpvid' as srcpvid, flatten(l#'viewinfo') as viewinfo;
views1 = FILTER views BY srcpvid == '1234';
views2 = FILTER views1 BY (viewinfo#'it' EQ '25');
map_scalar = limit views2 1;
z = foreach a generate name, age+(double)map_scalar.viewinfo#'it' as some_sum;
store z into ':OUTPATH:.2';
{code}

Here l is a map whose values are bags of maps, so flatten(l#'viewinfo') is supposed to yield maps. Internally, however, Pig tracks every value looked up by a map key as bytearray. In the scalar case, ReadScalar reports bytearray as its output schema even though the value is actually a map. Pig then stringizes the map and tries to convert that string back into a map, which ends up producing nulls.
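
The failure mode described above can be sketched with a toy model. This is not Pig code: stringize, parse_map and cast_bytearray_to_map are hypothetical names, and the two text formats (a generic {key=value} rendering versus a parser that expects [key#value]) are assumptions chosen only to show how a string round trip can silently turn a real map into null when the declared type (bytearray) disagrees with the runtime type (map).

```python
# Toy model only: none of these functions are Pig APIs.


def stringize(value):
    """Render a value as text the way a generic to-string call might (assumed format)."""
    if isinstance(value, dict):
        inner = ", ".join(f"{k}={stringize(v)}" for k, v in value.items())
        return "{" + inner + "}"
    return str(value)


def parse_map(text):
    """Hypothetical strict parser for '[key#value,key#value]'; returns None on mismatch."""
    if not (text.startswith("[") and text.endswith("]")):
        return None
    result = {}
    body = text[1:-1]
    if not body:
        return result
    for pair in body.split(","):
        key, sep, val = pair.partition("#")
        if not sep:
            return None
        result[key] = val
    return result


def cast_bytearray_to_map(value):
    """Model a cast that trusts the declared type (bytearray) over the runtime type."""
    return parse_map(stringize(value))


viewinfo = {"it": "25"}          # the value really is a map
direct = viewinfo["it"]          # lookup on the real map works

round_tripped = cast_bytearray_to_map(viewinfo)  # stringize, then re-parse
lookup = None if round_tripped is None else round_tripped.get("it")

print(direct, round_tripped, lookup)
```

In this sketch the direct lookup succeeds, while the round-tripped value fails to parse, so the later lookup of 'it' has nothing to read. The same shape of problem would explain why the some_sum column in the test case comes out null.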

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


  • Daniel Dai (JIRA) at Oct 27, 2010 at 10:24 pm
    [ https://issues.apache.org/jira/browse/PIG-1703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Daniel Dai updated PIG-1703:
    ----------------------------

    Issue Type: Sub-task (was: Bug)
    Parent: PIG-998
  • Daniel Dai (JIRA) at Apr 12, 2011 at 12:45 am
    [ https://issues.apache.org/jira/browse/PIG-1703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Daniel Dai resolved PIG-1703.
    -----------------------------

    Resolution: Fixed

    Verified fixed in trunk.

Discussion Overview
group: dev @
categories: pig, hadoop
posted: Oct 27, '10 at 9:03p
active: Apr 12, '11 at 12:45a
posts: 3
users: 1
website: pig.apache.org
