This is puzzling me ...
With a mapper producing ~400 MB of output, which of the following is
supposed to be faster?
1) Using the output collector, which will write to a local file and then
copy it to HDFS, since I don't have reducers.
2) Opening a unique local file inside "mapred.local.dir" for each mapper.
I expected (2) to be faster, but (1) was actually faster ... can someone explain?