The current Hadoop implementation shuffles directly to disk, and those
disk files are eventually requested by the target nodes which are
responsible for running reduce() on the intermediate data.
However, this requires 2x more IO than strictly necessary.
If the data were instead shuffled DIRECTLY to the target host, this IO
overhead would be removed.
I believe that any benefits from writing locally (compressing, combining)
and then doing a transfer can be had by simply allocating a buffer (say
250-500MB per map task) and then transferring data directly. I don't think
that the savings will be 100% on par with first writing locally, but
remember it's already 2x faster by not having to write to disk... so any
advantages to first shuffling to the local disk would have to outweigh
that 2x savings to win out.
That said, writing data to the local disk first could in theory have some
advantages under certain loads. I just don't think they matter in
practice, and that direct shuffling is superior.
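To make the idea concrete, here's a minimal sketch of what I mean by buffering and shuffling directly: the map task accumulates output in a fixed-size in-memory buffer, optionally combines it locally, and flushes straight to the target host when the buffer fills. This is illustrative pseudocode, not Hadoop's actual API; ShuffleBuffer, send_fn, and the size accounting are all hypothetical names for the sketch.

```python
class ShuffleBuffer:
    """Hypothetical per-map-task buffer that ships records directly
    to the reduce host instead of spilling them to local disk."""

    def __init__(self, capacity_bytes, send_fn, combiner=None):
        self.capacity = capacity_bytes  # e.g. 250-500MB per map task
        self.send_fn = send_fn          # e.g. writes the batch over a socket
        self.combiner = combiner        # optional local combine before transfer
        self.records = []               # buffered (key, value) pairs
        self.size = 0                   # approximate buffered bytes

    def add(self, key, value):
        self.records.append((key, value))
        self.size += len(key) + len(value)
        if self.size >= self.capacity:
            self.flush()

    def flush(self):
        # Transfer directly over the network; no intermediate disk file.
        if not self.records:
            return
        batch = self.combiner(self.records) if self.combiner else self.records
        self.send_fn(batch)
        self.records, self.size = [], 0
```

The point of the sketch is that compression and combining still happen locally (in memory, before send_fn runs); the only thing eliminated is the disk round trip.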
Anyone have any thoughts here?