use same logic for merging inner schemas in "default union" and "union onschema"
--------------------------------------------------------------------------------

Key: PIG-1536
URL: https://issues.apache.org/jira/browse/PIG-1536
Project: Pig
Issue Type: Task
Reporter: Thejas M Nair
Fix For: 0.9.0


We should consider using the same logic for merging inner schemas in the two different types of union.

In the case of 'default union', Pig merges the two inner schemas of bags/tuples by position if the number of fields is the same and the corresponding types are compatible.

In the case of 'union onschema', Pig considers tuples/bags with different inner schemas to be incompatible types.
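
The difference between the two behaviours described above can be sketched in a few lines of Python. This is only an illustrative model, not Pig's implementation: the schema representation (lists of `(name, type)` pairs), the function names, the numeric widening order, and the choice to keep the first input's field names are all assumptions made for the example.

```python
# Hypothetical model: a schema is a list of (name, type) pairs; a tuple/bag
# field carries its inner schema as a nested list instead of a type name.
NUMERIC_ORDER = ["int", "long", "float", "double"]  # assumed widening order


def merge_type(t1, t2):
    """Return the merged type of two field types, or None if incompatible."""
    if isinstance(t1, list) and isinstance(t2, list):
        return merge_by_position(t1, t2)  # both fields have inner schemas
    if isinstance(t1, list) or isinstance(t2, list):
        return None  # complex vs. simple type
    if t1 == t2:
        return t1
    if t1 in NUMERIC_ORDER and t2 in NUMERIC_ORDER:
        return max(t1, t2, key=NUMERIC_ORDER.index)  # pick the wider type
    return None


def merge_by_position(s1, s2):
    """'Default union' style: merge field by field, ignoring field names."""
    if len(s1) != len(s2):
        return None
    merged = []
    for (name1, t1), (_name2, t2) in zip(s1, s2):
        t = merge_type(t1, t2)
        if t is None:
            return None
        merged.append((name1, t))  # assumption: keep the first input's name
    return merged


def inner_schemas_compatible_onschema(s1, s2):
    """'Union onschema' style as described above: any difference is incompatible."""
    return s1 == s2


t1 = [("a", "int"), ("c", "long")]
t2 = [("a", "int"), ("b", "int")]
print(merge_by_position(t1, t2))                  # [('a', 'int'), ('c', 'long')]
print(inner_schemas_compatible_onschema(t1, t2))  # False
```

With the same pair of inner schemas, the positional merge succeeds while the strict comparison rejects them, which is the inconsistency this issue asks to resolve.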


--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


  • Thejas M Nair (JIRA) at Aug 4, 2010 at 8:44 pm
    [ https://issues.apache.org/jira/browse/PIG-1536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12895410#action_12895410 ]

    Thejas M Nair commented on PIG-1536:
    ------------------------------------


    The way 'default union' deals with columns of different but compatible types in the same position is not right. It creates a merged schema by choosing a merged type, but no cast is applied to convert the rows to this type.
    For example:

    {code}
    grunt> l1 = load '/tmp/f1' as (a : chararray, t (a : int, c : long) );
    grunt> l2 = load '/tmp/f1' as (a : chararray, t (a : int, b : int) );
    grunt> u = union l1, l2;
    grunt> describe u;
    u: {a: chararray,t: (a: int,c: long)}

    -- In the result of u, only the rows originating from l1 will correspond to the schema shown by describe.

    MapReduce node 1-206
    Map Plan
    u: Store(fakefile:org.apache.pig.builtin.PigStorage) - 1-203
    ---u: Union[bag] - 1-202

    ---l1: New For Each(false,false)[bag] - 1-195
    Cast[chararray] - 1-192
    ---Project[bytearray][0] - 1-191
    Cast[tuple:(int,long)] - 1-194
    ---Project[bytearray][1] - 1-193
    ---l1: Load(/tmp/f1:org.apache.pig.builtin.PigStorage) - 1-190
    ---l2: New For Each(false,false)[bag] - 1-201

    Cast[chararray] - 1-198
    ---Project[bytearray][0] - 1-197
    Cast[tuple:(int,int)] - 1-200
    ---Project[bytearray][1] - 1-199
    ---l2: Load(/tmp/f1:org.apache.pig.builtin.PigStorage) - 1-196
    --------
    Global sort: false
    ----------------

    {code}
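
The effect of the missing cast in the plan above can be illustrated with a small Python model. This is a sketch under stated assumptions, not Pig code: each field is represented as a hypothetical `(type, value)` pair, the row values are made up, and `cast_row` merely stands in for the cast to the merged type that the plan does not contain.

```python
MERGED = ["int", "long"]  # inner types of t as reported by `describe u`


def cast_row(row, target_types):
    """Re-tag each (type, value) field with the target type (stand-in for a cast)."""
    return [(target, value) for (_src, value), target in zip(row, target_types)]


def matches(row, types):
    """True if the row's field types equal the given schema types."""
    return [t for t, _ in row] == types


# Made-up rows: l1 was cast to (int, long), l2 to (int, int), as in the plan.
l1_rows = [[("int", 1), ("long", 10)]]
l2_rows = [[("int", 2), ("int", 20)]]

union_no_cast = l1_rows + l2_rows
union_with_cast = [cast_row(r, MERGED) for r in l1_rows + l2_rows]

print([matches(r, MERGED) for r in union_no_cast])    # [True, False]
print([matches(r, MERGED) for r in union_with_cast])  # [True, True]
```

Without the extra cast step, only the row from l1 agrees with the merged schema; applying a cast to the merged type on every input before the union makes all rows agree.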
  • Alan Gates (JIRA) at Sep 14, 2010 at 2:07 pm
    [ https://issues.apache.org/jira/browse/PIG-1536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Alan Gates reassigned PIG-1536:
    -------------------------------

    Assignee: Alan Gates

Discussion Overview
group: dev @
categories: pig, hadoop
posted: Aug 4, '10 at 8:42p
active: Sep 14, '10 at 2:07p
posts: 3
users: 1
website: pig.apache.org
