Grokbase Groups Pig user May 2013
I have a dataset with three columns: group_id, position, and name. For
each group, I need to generate a concatenated string of all names ordered
by their position. I can do this by sorting all data by position (or by
group_id and position), then grouping by group_id, and finally
concatenating the names in each group. I have two questions:
1- Does this really work? In other words, does the GROUP BY operator retain
the order of its sorted input?
2- What is the most efficient way to do it? Is it better, if possible, to
group first and then sort? Say I order by the pair (group_id, position)
first; can this be hinted to Pig to make the GROUP BY faster?
Thanks for your help

Best regards,
Ahmed Eldawy
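
The "group first, then sort" alternative raised in question 2 can be sketched outside Pig. The snippet below is plain Python, not Pig Latin, and only models the data flow: the function name `concat_names_by_group`, the comma separator, and the sample rows are all hypothetical. It shows that sorting inside each group makes the result independent of the order in which rows arrive, so nothing depends on whether a grouping step preserves an earlier sort.

```python
from collections import defaultdict


def concat_names_by_group(rows, sep=","):
    """Concatenate names per group, ordered by position.

    rows: iterable of (group_id, position, name) tuples, in any order.
    Hypothetical helper for illustration; not part of Pig.
    """
    groups = defaultdict(list)
    # Step 1: group rows by group_id, ignoring arrival order.
    for group_id, position, name in rows:
        groups[group_id].append((position, name))

    # Step 2: sort inside each group by position, then join the names.
    return {
        group_id: sep.join(name for _, name in sorted(members, key=lambda m: m[0]))
        for group_id, members in groups.items()
    }


# Sample data (made up): rows are deliberately not sorted by position.
rows = [(1, 2, "b"), (2, 1, "x"), (1, 1, "a"), (2, 2, "y"), (1, 3, "c")]
print(concat_names_by_group(rows))  # {1: 'a,b,c', 2: 'x,y'}
```

Whether the equivalent per-group sort is cheaper in Pig than a global ORDER BY followed by GROUP BY is the open question of this thread; the sketch does not answer it.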

Discussion Overview
group: user @
categories: pig, hadoop
posted: May 13, '13 at 5:26p
active: May 13, '13 at 7:49p

2 users in discussion

Ahmed Eldawy: 2 posts, Cheolsoo Park: 1 post