I am writing a MapReduce job in which the distribution of the keys emitted by the map phase is not known beforehand, so I can't create the partitions for the TotalOrderPartitioner in advance. I would like to sample those keys to create the partitions, and then run the job that processes the whole input.
Is the InputSampler the tool I need?
I tried to use it, but as far as I can tell it does not run the mapper class over the samples before creating the partitions;
it just creates the partitions directly from the input keys. Am I wrong?
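To make clearer what I am after, here is a minimal sketch in plain Java of the behaviour I would like. It does not use any Hadoop classes. The class name MapOutputSampler, the method splitPoints and the comma-separated record format are all hypothetical, made up only for illustration. The idea is to apply the map logic to a sample of the input records, sort the emitted keys, and pick evenly spaced keys as split points. The way the split points are picked is a simplification of my own, not necessarily what InputSampler does.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.Function;

// Hypothetical helper, not part of Hadoop: derives split points from the
// keys the map logic would emit for a sample of the input records.
public class MapOutputSampler {

    // mapKey stands in for the mapper: it turns one input record into the
    // key that the map phase would emit for it (an assumption of this sketch:
    // one output key per record).
    public static <K extends Comparable<K>> List<K> splitPoints(
            List<String> sampleRecords, Function<String, K> mapKey, int numPartitions) {
        if (numPartitions < 1) {
            throw new IllegalArgumentException("numPartitions must be >= 1");
        }

        // Run the map logic over the sample and collect the emitted keys.
        List<K> keys = new ArrayList<>();
        for (String record : sampleRecords) {
            keys.add(mapKey.apply(record));
        }
        Collections.sort(keys);

        // Pick numPartitions - 1 evenly spaced keys as partition boundaries.
        List<K> splits = new ArrayList<>();
        if (keys.isEmpty()) {
            return splits;
        }
        for (int i = 1; i < numPartitions; i++) {
            int index = (int) ((long) i * keys.size() / numPartitions);
            splits.add(keys.get(index));
        }
        return splits;
    }

    public static void main(String[] args) {
        // Made-up sample records of the form "<id>,<word>"; the map logic
        // emits the word, so the output key differs from the input key.
        List<String> sample = List.of(
                "1,pear", "2,apple", "3,fig", "4,kiwi", "5,plum", "6,date");
        System.out.println(splitPoints(sample, r -> r.split(",")[1], 3));
    }
}
```

In the real job, the resulting split points would have to be written to the partition file that the TotalOrderPartitioner reads, with the same key type and ordering as the map output.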
Thank you in advance!