Grokbase Groups Pig user August 2010
All,

I have what should be a simple problem. I have 2 tuples that are chararrays, t1 and t2, and want to do a comparison. Using

x = FILTER y BY (t1 == t2);

results in zero (0) records.

x = FILTER y BY (t1 != t2);

also results in zero records. And

x = FILTER y BY (t1 matches t2);

results in an error. Ideal would be a StrComp(t1, t2) filter func.

Is there a UDF for that?

Cheers,
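
Separately from the error reported above, Pig's matches operator takes a Java-format regular expression, so it is a pattern match and not a string comparison. The following is a minimal sketch in plain Java (not Pig) of how the two differ. The class name, method names, and sample values are hypothetical and do not come from the thread.

```java
import java.util.regex.Pattern;

class MatchesVsEquals {
    // Treats the second argument as a regular expression.
    static boolean regexMatch(String value, String pattern) {
        return Pattern.matches(pattern, value);
    }

    // Plain string equality, which is what the question is after.
    static boolean plainEquals(String a, String b) {
        return a.equals(b);
    }

    public static void main(String[] args) {
        // "." is a regex wildcard, so the pattern "a.c" also matches "abc".
        System.out.println(regexMatch("abc", "a.c"));   // true
        System.out.println(plainEquals("abc", "a.c"));  // false
        // Quoting the pattern makes the regex match behave like equality.
        System.out.println(regexMatch("abc", Pattern.quote("a.c"))); // false
    }
}
```

If the second field can hold characters such as "." or "*", a regex match may therefore accept pairs that are not equal strings.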


  • Dmitriy Ryaboy at Aug 18, 2010 at 10:57 pm
    Dave,
    Can you provide some sample data? A tuple can't be a chararray (but it can
    contain one), so I want to make sure I understand what the data you are
    working with looks like.

    -Dmitriy
  • Dave Wellman at Aug 19, 2010 at 12:26 am
    Because I wasn't able to find one I tossed this UDF into the mix.

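    import java.io.IOException;
    import org.apache.pig.EvalFunc;
    import org.apache.pig.data.Tuple;

    public class StrComp extends EvalFunc<Integer> {

        @Override
        public Integer exec(Tuple arg0) throws IOException {
            // The input tuple should hold exactly 2 fields.
            if (arg0.size() != 2) {
                throw new IOException("Dude where's my tuples?");
            }
            // A null field has no string form; return null rather than fail.
            if (arg0.get(0) == null || arg0.get(1) == null) {
                return null;
            }
            // 0 when equal, otherwise negative or positive by string order.
            return arg0.get(0).toString().compareTo(arg0.get(1).toString());
        }
    }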

    And the Pig calls:

    x = FILTER y BY StrComp(a, b) == 0;

    or

    x = FILTER y BY StrComp(a, b) != 0;

    The tuples a and b are chararrays. My solution "works", but a nice standard Piggybank UDF would be better.
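
The UDF above returns the raw result of compareTo, so the filter only cares whether that value is zero, while the sign carries ordering information. Because it compares toString() output, it also does not depend on the declared Pig types of the two fields. The following is a minimal sketch of the same comparison in plain Java, without the Pig classes. StrCompSketch and sameString are hypothetical names used only for illustration.

```java
class StrCompSketch {
    // Same comparison the UDF performs: compare the string forms of two values.
    static int strComp(Object a, Object b) {
        return a.toString().compareTo(b.toString());
    }

    // Hypothetical helper: collapses the three-way result to an equality test.
    static boolean sameString(Object a, Object b) {
        return strComp(a, b) == 0;
    }

    public static void main(String[] args) {
        System.out.println(strComp("foo", "foo") == 0); // true
        System.out.println(strComp("bar", "foo") < 0);  // true: "bar" sorts before "foo"
        System.out.println(strComp("foo", "bar") > 0);  // true
        System.out.println(sameString("foo", "baz"));   // false
    }
}
```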

  • Dmitriy Ryaboy at Aug 20, 2010 at 12:49 am
    Something strange is happening with your data. Can you provide an example?

    I just tried this with both Pig 0.6 and Pig 0.8 (trunk):

    grunt> strings = load 'tmp/strings.txt' as (a:chararray, b:chararray);
    grunt> dump strings;
    (foo,bar)
    (foo,baz)
    (foo,foo)
    (bar,bar)
    grunt> x = filter strings by a == b;
    grunt> dump x;
    (foo,foo)
    (bar,bar)
    grunt> x = filter strings by a != b;
    grunt> dump x;
    (foo,bar)
    (foo,baz)
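
The session above can be mirrored in plain Java to show what the two filters are expected to keep. The sketch also models Pig's documented null handling, where a comparison with a null operand yields null and FILTER drops the row. That is one way both == and != could return zero records, but the thread does not confirm it as the cause. FilterSketch, its methods, and the extra row with a null field are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class FilterSketch {
    // Models the null rule: if either operand is null, the result is null.
    static Boolean eq(String a, String b) {
        if (a == null || b == null) return null;
        return a.equals(b);
    }

    static Boolean ne(String a, String b) {
        Boolean e = eq(a, b);
        return e == null ? null : !e;
    }

    // Keeps rows whose condition is true; false and null are both dropped.
    static List<String[]> filter(List<String[]> rows, boolean wantEqual) {
        List<String[]> kept = new ArrayList<>();
        for (String[] r : rows) {
            Boolean cond = wantEqual ? eq(r[0], r[1]) : ne(r[0], r[1]);
            if (Boolean.TRUE.equals(cond)) kept.add(r);
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String[]> rows = new ArrayList<>();
        rows.add(new String[] {"foo", "bar"});
        rows.add(new String[] {"foo", "baz"});
        rows.add(new String[] {"foo", "foo"});
        rows.add(new String[] {"bar", "bar"});
        rows.add(new String[] {"foo", null}); // hypothetical row, not in the thread's data
        System.out.println(filter(rows, true).size());  // 2
        System.out.println(filter(rows, false).size()); // 2
    }
}
```

The row with the null field appears in neither result, so a relation whose fields all loaded as null would come back empty from both filters.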


