Grokbase Groups Pig user July 2010
I've got
A = FOREACH ...
B = FOREACH ...
C = FOREACH ...
...

X = UNION A, B, C,...

Each of A, B, C, ... is a single tuple. I want X ordered
by the order specified in the UNION. The data in A, B, C, ... is not
necessarily in an explicit sort order, so ORDER X BY a field does not work. I've also tried
unioning just two pieces, then unioning that result with another piece, and so on.
That does not work either.

Does anyone have any ideas on how to do this?


elein
elein@varlena.com


  • Thejas M Nair at Jul 29, 2010 at 1:06 am
    As you observed, UNION does not guarantee ordering. You will need to project an additional column indicating the order you want, so that you can do an ORDER BY on it.

    -Thejas
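A minimal Pig Latin sketch of this suggestion follows. The input paths, the relation names `in_a`, `in_b`, `in_c`, and the field `name` are hypothetical, since the FOREACH bodies are elided in the original post. Each relation is tagged with a constant giving its intended position, and the union is then sorted on that tag. Positional references (`$0`) are used so the sketch does not depend on how UNION carries the schema forward.

```pig
-- Hypothetical inputs: the thread does not show where A, B, C come from.
in_a = LOAD 'a.txt' AS (name:chararray);
in_b = LOAD 'b.txt' AS (name:chararray);
in_c = LOAD 'c.txt' AS (name:chararray);

-- Project an extra constant column recording the desired output position.
A = FOREACH in_a GENERATE 1 AS ord, name;
B = FOREACH in_b GENERATE 2 AS ord, name;
C = FOREACH in_c GENERATE 3 AS ord, name;

-- UNION itself gives no ordering guarantee.
X = UNION A, B, C;

-- Sort on the tag column (position 0) to recover the intended order.
X_sorted = ORDER X BY $0;
DUMP X_sorted;
```

The tag column is left in the output here. As far as I know, Pig only documents the sort order as guaranteed when the ordered relation is dumped or stored directly, so projecting the tag away in a later FOREACH may not preserve the order.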



  • Gang Luo at Jul 29, 2010 at 2:01 am
    Hi all,
    By default the parallelism (number of reducers) of a Pig query is 1. How do I change
    this value? If I set the value to 10, does that mean all the MR jobs for this
    query will run with 10 reducers?


    Thanks,
    -Gang
  • Elein at Jul 29, 2010 at 4:01 pm
    Yes, thank you. I was trying to avoid adding a sort column.


Discussion Overview
group: user @
categories: pig, hadoop
posted: Jul 28, '10 at 9:46p
active: Jul 29, '10 at 4:01p
posts: 4
users: 3
website: pig.apache.org
