Hey guys,



I'm trying to determine whether a value in one data set falls within a
range defined by two values in another data set. Here is my thought
process so far:



StartData1 = LOAD 'data1' AS (number1:long, field2:int, field3:int);

StartData2 = LOAD 'data2' AS (field1:chararray, field2:long, field3:long);



DUMP StartData1;
DUMP StartData2;



StartData1

(100,30,30)

(200,20,10)

(300,30,40)



StartData2

(foo,100,200)

(bar,50,150)

(bar,250,325)



GrpedData2 = Group StartData2 BY $0 as ranges;



FOREACH StartData1 {

Filtered1 = FILTER GrpedData2 BY ranges matches 'bar';

Evaluated = FILTER Filtered1 BY StartData1.$0 >= StartData2.$1 AND StartData1.$0 <= StartData2.$2;

GENERATE Evaluated;

}



I would like the output to be the following, in any order:



(bar,100,30,30)

(bar,300,30,40)



Thoughts?



Thanks,

Matt
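
For reference, one way this kind of range lookup is often written in Pig Latin is a CROSS followed by a FILTER, since a plain JOIN only matches on equality. The sketch below is a minimal illustration, not a tested solution. It assumes the inputs live at the placeholder paths 'data1' and 'data2' in the default tab-delimited PigStorage format, and the aliases Bars, Pairs, InRange and Result, as well as the field names name, low and high, are hypothetical names chosen here for readability.

```pig
-- Load both inputs (paths and field names are placeholders).
StartData1 = LOAD 'data1' AS (number1:long, field2:int, field3:int);
StartData2 = LOAD 'data2' AS (name:chararray, low:long, high:long);

-- Keep only the 'bar' ranges before pairing, so the cross product stays small.
Bars = FILTER StartData2 BY name == 'bar';

-- Pair every StartData1 row with every remaining range.
Pairs = CROSS StartData1, Bars;

-- Keep the pairs where number1 falls inside [low, high].
InRange = FILTER Pairs BY StartData1::number1 >= Bars::low
                      AND StartData1::number1 <= Bars::high;

-- Project the range name followed by the original StartData1 fields.
Result = FOREACH InRange GENERATE Bars::name, StartData1::number1,
                                  StartData1::field2, StartData1::field3;

DUMP Result;
```

With the sample rows above, 100 falls inside (bar,50,150) and 300 falls inside (bar,250,325), so this should produce the two desired tuples in no guaranteed order. CROSS is expensive on large inputs, which is why the sketch filters the ranges first.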

Posted Jul 26, '10 at 11:34p by Matthew Smith (Pig user list, pig.apache.org)