Given a single cluster running the default job scheduler: does only
one job execute on the cluster at a time, regardless of how many
map/reduce task slots that job can keep busy?
In other words, if a job does not use all of the task slots, will the
default scheduler consider scheduling map/reduce tasks from other jobs
that have already been submitted to the system?
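To make the question concrete, here is a toy model in Python of the behavior I think I am seeing: a strictly serial scheduler that only hands out tasks from the job at the head of the queue, versus one that backfills idle slots from later jobs. This is purely an illustration of the two possible behaviors, not Hadoop's actual scheduler code; the job names and slot counts are made up.

```python
# Toy model: compare head-of-queue-only scheduling vs. backfilling
# idle slots from later jobs. Not Hadoop code, just an illustration.
from collections import deque

def assign_slots(jobs, total_slots, multi_job=False):
    """Return a list of (job_name, tasks_assigned).

    jobs is a sequence of (name, pending_tasks). If multi_job is
    False, only the head job's tasks are considered, mimicking
    one-job-at-a-time behavior even when slots remain free.
    """
    assignments = []
    free = total_slots
    for name, pending in jobs:
        take = min(pending, free)
        if take:
            assignments.append((name, take))
            free -= take
        if not multi_job:
            break  # serial behavior: never look past the head job
        if free == 0:
            break
    return assignments

queue = deque([("small-job", 3), ("medium-job", 10)])
print(assign_slots(queue, 16))                  # [('small-job', 3)] -> 13 slots idle
print(assign_slots(queue, 16, multi_job=True))  # [('small-job', 3), ('medium-job', 10)]
```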

I am using an 8-node cluster to run some test jobs based on gridmix
(the synthetic benchmark found in the Hadoop distribution under
src/benchmarks/gridmix). The gridmix workload submits many different
jobs in parallel: 5 kinds of jobs, with varying sizes for each
kind (small, medium, large). While running, I notice that at any
given time only one job is making progress, at least according to the
JobTracker web UI. This seems to happen even for small jobs, which
don't take up all the slots of the cluster's tasktrackers/nodes.

If the default scheduler is not capable of scheduling tasks from
multiple jobs concurrently, would I have to use the capacity
scheduler, or something else?
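For reference, if switching schedulers does turn out to be necessary, my understanding is that it is a mapred-site.xml change along these lines (property and class names are from the 0.20-era docs and assume the scheduler jar is on the JobTracker classpath; please correct me if this is wrong):

```xml
<!-- mapred-site.xml: select an alternative task scheduler.
     Assumes the capacity-scheduler jar is on the JobTracker classpath. -->
<property>
  <name>mapred.jobtracker.taskScheduler</name>
  <value>org.apache.hadoop.mapred.CapacityTaskScheduler</value>
  <!-- or org.apache.hadoop.mapred.FairScheduler for the fair scheduler -->
</property>
```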

thanks for any help,

- Vasilis

Discussion Overview
group: common-user
posted: Sep 27, '09 at 1:38a
1 user in discussion: Vasilis Liaskovitis (1 post)


