I just tried this, with both Pig 6 and Pig 8 (trunk):
grunt> strings = load 'tmp/strings.txt' as (a:chararray, b:chararray);
grunt> dump strings;
(foo,bar)
(foo,baz)
(foo,foo)
(bar,bar)
grunt> x = filter strings by a == b;
grunt> dump x;
(foo,foo)
(bar,bar)
grunt> x = filter strings by a != b;
grunt> dump x;
(foo,bar)
(foo,baz)
On Wed, Aug 18, 2010 at 4:22 PM, Dave Wellman wrote:
Because I wasn't able to find one I tossed this UDF into the mix.
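import java.io.IOException;
import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;

public class StrComp extends EvalFunc<Integer> {
    @Override
    public Integer exec(Tuple arg0) throws IOException {
        // The input tuple should have exactly 2 fields.
        if (arg0.size() != 2) {
            throw new IOException("Dude where's my tuples?");
        }
        // Compare the string forms of the two fields; 0 means they are equal.
        return arg0.get(0).toString().compareTo(arg0.get(1).toString());
    }
}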
And the pig calls:
x = FILTER y BY StrComp(a, b) == 0;
or
x = FILTER y BY StrComp(a, b) != 0;
The tuples a and b are chararray. My solution "works" but a nice standard
piggybank UDF would be better.
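One caveat with the UDF above: if either field is null, the toString() call will throw a NullPointerException. A minimal sketch of a null-safe version of the comparison follows, written as plain Java with no Pig dependency so it can be tried on its own. The class name StrCompSketch and the choice to return null when either input is null are assumptions for illustration, not something taken from piggybank.

```java
// Hypothetical helper, not part of Pig or piggybank.
class StrCompSketch {
    // Returns null if either argument is null (an assumed convention),
    // otherwise the result of String.compareTo on the string forms.
    static Integer compare(Object a, Object b) {
        if (a == null || b == null) {
            return null;
        }
        return a.toString().compareTo(b.toString());
    }
}
```

Inside exec, the same check could be applied to arg0.get(0) and arg0.get(1) before comparing them.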
On Aug 18, 2010, at 4:56 PM, Dmitriy Ryaboy wrote:
Dave,
Can you provide some sample data? A tuple can't be a chararray (but it can
contain one), so I want to make sure I understand what the data you are
working with looks like.
-Dmitriy
On Wed, Aug 18, 2010 at 3:29 PM, Dave Wellman wrote:
All,
I have what should be a simple problem. I have 2 tuples that are
chararrays, t1 and t2, and want to do a comparison. Using
x = FILTER y BY (t1 == t2);
results in zero (0) records.
x = FILTER y BY (t1 != t2);
also results in zero records. And
x = FILTER y BY (t1 matches t2);
is an error. Ideal would be a StrComp(t1, t2) filter func.
Is there a UDF for that?
Cheers,