Grokbase Groups Pig user May 2013
Hi,
I have a dataset with three columns: group_id, position, and name. For each
group, I need to generate a concatenated string of all names, ordered by
position. I can do this by sorting all the data by position (or by group_id
and position), then grouping by group_id, and finally concatenating the
names in each group. I have two questions:
1- Does this really work? In other words, does the GROUP BY operator retain
order?
2- What is the most efficient way to do it? Is it better, if possible, to
group first and then sort? If I order by the pair (group_id, position)
first, can I hint this to Pig to make the GROUP BY faster?
Thanks for your help.


Best regards,
Ahmed Eldawy
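
For illustration only, here is a minimal sketch of the "group first, then sort within each group" alternative raised in question 2. It is plain Python rather than Pig Latin, so it shows the intended result and not how Pig executes it. The function name concat_names_by_group, the comma separator, and the sample rows are all assumptions made for this example. Because each group is sorted after it has been collected, this variant does not depend on whether grouping retains the input order.

```python
from collections import defaultdict


def concat_names_by_group(rows, sep=","):
    """Hypothetical helper: rows is an iterable of (group_id, position, name)."""
    # Step 1: group the rows by group_id, in whatever order they arrive.
    groups = defaultdict(list)
    for group_id, position, name in rows:
        groups[group_id].append((position, name))

    # Step 2: sort each group by position, then concatenate the names.
    return {
        group_id: sep.join(name for _, name in sorted(members, key=lambda m: m[0]))
        for group_id, members in groups.items()
    }


# Sample input, deliberately unsorted (made-up data).
rows = [(1, 2, "b"), (2, 1, "x"), (1, 1, "a"), (2, 2, "y"), (1, 3, "c")]
result = concat_names_by_group(rows)
```

With the sample rows above, group 1 yields its names in position order 1, 2, 3 and group 2 in order 1, 2, regardless of the order of the input.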

Discussion Overview
group: user @
categories: pig, hadoop
posted: May 13, '13 at 5:26p
active: May 13, '13 at 7:49p
posts: 3
users: 2
website: pig.apache.org

2 users in discussion

Ahmed Eldawy: 2 posts
Cheolsoo Park: 1 post
