Grokbase Groups Pig user August 2011
My apologies if this is in the docs somewhere; I was unable to find
anything, but I might be calling it by the wrong name.

I'm doing a full outer join in Pig, so one or the other join key may
be null. I'd like to look at the two columns and retrieve just the one
that is not null. Is that possible?

I tried an expression in generate with is null and the ternary operator, and
took a look at DECODE. That might do the trick, but I wasn't sure whether null
checking would work or whether other expressions could appear inside the
decode.

In my case the fields are integers, so I abused the MAX and TOBAG operators
like this: MAX(TOBAG(rx_keyed::u2,cx_keyed::u2)). That got the effect I was
after, but I would love to know if there's a better way.

Thanks for your time!

-James Kebinger

  • Dmitriy Ryaboy at Aug 29, 2011 at 6:29 pm
    Hi James,
    I use ternary expressions for this: foreach joined generate (rel1::x is null
    ? rel2::x : rel1::x) as x;
  • James Kebinger at Aug 29, 2011 at 7:18 pm
    Thanks, it must have been the lack of parentheses that did me in when I
    tried the ternary expression, or some other typo. I'll use that in the
    future.
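For anyone landing on this thread later, here is a minimal sketch of the ternary approach applied to the relations named in the question. The file names, the extra columns (rx_val, cx_val) and the schemas are assumptions made up for illustration; only rx_keyed, cx_keyed and u2 come from the original post. The parentheses around the whole ternary expression are required, which matches the problem James ran into.

```pig
-- Hypothetical inputs: the paths and the rx_val / cx_val columns are made up.
rx_keyed = LOAD 'rx.tsv' AS (u2:int, rx_val:chararray);
cx_keyed = LOAD 'cx.tsv' AS (u2:int, cx_val:chararray);

-- Full outer join: rows missing on one side get nulls for that side's fields.
joined = JOIN rx_keyed BY u2 FULL OUTER, cx_keyed BY u2;

-- Keep whichever key is present. After a join, same-named fields are
-- disambiguated with the relation::field prefix.
merged = FOREACH joined GENERATE
    ((rx_keyed::u2 IS NULL) ? cx_keyed::u2 : rx_keyed::u2) AS u2,
    rx_val,
    cx_val;
```

Unlike the MAX(TOBAG(...)) workaround, this does not depend on the key being numeric, so the same pattern should carry over to chararray keys.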

Discussion Overview
group: user @
categories: pig, hadoop
posted: Aug 29, '11 at 6:15p
active: Aug 29, '11 at 7:18p
posts: 3
users: 2
website: pig.apache.org