Hello,
I am writing a MapReduce (MR) job in which the distribution of the keys emitted by the map phase is not known beforehand, so I can't create the partitions for the TotalOrderPartitioner in advance. I would like to sample those keys to create the partitions, and then run the job that processes the whole input.

Is the InputSampler the tool I need?
I tried to use it, but I think it doesn't run the mapper class on the samples before creating the partitions; it just creates the partitions directly from the input. Am I wrong?

Thank you in advance!
Pan
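
The approach described in the question (sample the input, run the map logic on the sample, then derive partition boundaries from the resulting keys) can be sketched in plain Java without any Hadoop classes. This is only an illustration under assumptions: the class name MapOutputSampler, the methods splitPoints and partitionFor, and the comma-separated record format are all hypothetical and are not part of the Hadoop API. In a real job the computed split points would still have to be written to the partition file that TotalOrderPartitioner reads, which is not shown here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.Function;

// Hypothetical helper, not a Hadoop class: derives partition boundaries
// from map-output keys computed on a sample of the input.
public class MapOutputSampler {

    // Applies the map-side key extraction to each sampled input record,
    // sorts the resulting keys, and picks numPartitions - 1 evenly spaced
    // split points from the sorted sample.
    static <I, K extends Comparable<K>> List<K> splitPoints(
            List<I> sampledInputs, Function<I, K> mapKey, int numPartitions) {
        if (sampledInputs.isEmpty() || numPartitions < 1) {
            throw new IllegalArgumentException("need a non-empty sample and at least 1 partition");
        }
        List<K> keys = new ArrayList<>();
        for (I record : sampledInputs) {
            keys.add(mapKey.apply(record));
        }
        Collections.sort(keys);

        List<K> splits = new ArrayList<>();
        for (int i = 1; i < numPartitions; i++) {
            int idx = (int) ((long) i * keys.size() / numPartitions);
            splits.add(keys.get(idx));
        }
        return splits;
    }

    // Returns the index of the partition a key falls into: keys below the
    // first split point go to partition 0, keys equal to or above a split
    // point go to the partition on its upper side.
    static <K extends Comparable<K>> int partitionFor(K key, List<K> splits) {
        int pos = Collections.binarySearch(splits, key);
        return pos >= 0 ? pos + 1 : -(pos + 1);
    }

    public static void main(String[] args) {
        // Assumed record format: "<numeric key>,<payload>".
        List<String> sample = Arrays.asList("9,a", "3,b", "7,c", "1,d", "5,e", "8,f");
        List<Integer> splits = splitPoints(
                sample, (String line) -> Integer.parseInt(line.split(",")[0]), 3);
        System.out.println("split points: " + splits);
        System.out.println("partition for key 6: " + partitionFor(6, splits));
    }
}
```

The point of the sketch is that the split points come from the keys the map logic produces, not from the raw input keys, which is the distinction the question raises about InputSampler.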

Discussion Overview
group: common-user @
categories: hadoop
posted: May 12, '11 at 12:24p
active: May 12, '11 at 12:24p
posts: 1
users: 1
website: hadoop.apache.org...
irc: #hadoop

1 user in discussion

Panayotis Antonopoulos: 1 post
