On Jun 17, 2008 at 4:48 am, Daniel Leffel wrote:
Why not just combine them? How do I do that?
Consider a cluster of n nodes configured to run just one task per node,
and let there be (n-1) reducers. Assume the map phase is complete and the
reducers are shuffling, so (n-1) nodes are running reducers. Now suppose
the only node without a reducer is lost. The cluster needs slots to re-run
the maps that were lost, since the reducers are waiting for those map
outputs to finish, but every slot is occupied by a waiting reducer. In
such a case the job gets stuck. To avoid such deadlocks, there are
separate map and reduce task slots.
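The separate slot pools described above are configured per TaskTracker. A
minimal sketch of the relevant Hadoop 0.x properties (the values here are
illustrative, not recommendations):

```xml
<!-- Sketch: per-TaskTracker slot limits in hadoop-site.xml.
     Map and reduce slots are capped independently, so one pool
     cannot starve the other -- the property the deadlock example
     above relies on. Values are illustrative. -->
<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>2</value>
  <description>Max map tasks run simultaneously by a TaskTracker.</description>
</property>
<property>
  <name>mapred.tasktracker.reduce.tasks.maximum</name>
  <value>2</value>
  <description>Max reduce tasks run simultaneously by a TaskTracker.</description>
</property>
```

Because these limits are independent, there is no single knob here that
caps the combined total of map plus reduce tasks on a node.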
The rationale is that our tasks are very balanced in load but unbalanced
in timing. I've found that limiting the total number of threads is the
safest way to avoid overloading the DFS daemon. To date I've done that
through intelligent scheduling of jobs to stagger maps and reduces, but
have I missed a setting that simply limits the total number of tasks?