No matter what I try, I end up losing the tuples after the initial flatten. I'm using some auto-generated test data with first and last columns and a concatenation of the two as the key. The script and its output:
rows = LOAD 'cassandra://Keyspace2/Standard1' USING CassandraStorage() as (key:chararray, cols:bag{T:tuple(name:chararray, value:chararray) } );
dump rows;
(faaaaaaaaazzzzzzeaaa,{(first,faaaaaaaaa),(last,zzzzzzeaaa)})
(jaaaaaaaaazzzlaaaaaa,{(first,jaaaaaaaaa),(last,zzzlaaaaaa)})
(naaaaaaaaazzzzzpaaaa,{(first,naaaaaaaaa),(last,zzzzzpaaaa)})
(uaaaaaaaaazzzzzsaaaa,{(first,uaaaaaaaaa),(last,zzzzzsaaaa)})
(vaaaaaaaaafaaaaaaaaa,{(first,vaaaaaaaaa),(last,faaaaaaaaa)})
(zuaaaaaaaazpaaaaaaaa,{(first,zuaaaaaaaa),(last,zpaaaaaaaa)})
(zuaaaaaaaazzzzhaaaaa,{(first,zuaaaaaaaa),(last,zzzzhaaaaa)})
(zwaaaaaaaaznaaaaaaaa,{(first,zwaaaaaaaa),(last,znaaaaaaaa)})
(zziaaaaaaazfaaaaaaaa,{(first,zziaaaaaaa),(last,zfaaaaaaaa)})
(zzkaaaaaaazzzdaaaaaa,{(first,zzkaaaaaaa),(last,zzzdaaaaaa)})
So far, so good.
columns = foreach rows generate flatten(cols) as (name, value);
dump columns;
()
()
()
()
()
()
()
()
()
()
Not so good.
I've tried multiple combinations with no success. If I leave the bag schema empty in the initial load, i.e. cols:bag{}, and then leave off the as clause in the flatten, I get something that looks like a list of tuples. But trying to access $1 in that result gives me an Error 1000, index out of range. So that's not the ticket either.
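For reference, the empty-bag variant looks roughly like the sketch below. It is reconstructed from the description above and reuses the same keyspace and column family; the alias vals is just a placeholder name, and the exact statements I ran may have differed slightly.

```pig
-- Load with an untyped bag instead of declaring the inner tuple schema
rows = LOAD 'cassandra://Keyspace2/Standard1' USING CassandraStorage() as (key:chararray, cols:bag{});

-- Flatten without an as clause; the dump looks like a list of tuples
columns = foreach rows generate flatten(cols);
dump columns;

-- Referencing the second field positionally is what triggers Error 1000 (index out of range)
vals = foreach columns generate $1;
dump vals;
```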
What I'd really like is to flatten the bag into a map, but I've had about as much success with that.
Any help is most welcome. (Cassandra 0.7.4 and Pig 0.8.0)