If you remove userMaxJobsDefault, the default value is
Integer.MAX_VALUE - that is, it's unconstrained by this limit. This
means that the other limits and fair sharing would kick in if multiple
jobs are submitted. So, if you haven't set any of the min-slots, and
the jobs are all at the same priority, they'll share the number of
slots equally. Please check out the fair scheduler documentation in
docs/fair_scheduler.pdf in your distro.
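For concreteness, a minimal allocations.xml with no
userMaxJobsDefault element might look like the sketch below
(the pool name is just an illustration); with the element
omitted, the per-user cap stays at Integer.MAX_VALUE and fair
sharing alone divides the slots:

    <?xml version="1.0"?>
    <allocations>
      <!-- No userMaxJobsDefault element: each user may run an
           unlimited number of jobs, and jobs at equal priority
           split the map/reduce slots via fair sharing. -->
      <pool name="default">
        <!-- No minMaps/minReduces: no guaranteed minimum share,
             so fair sharing alone decides the split. -->
      </pool>
    </allocations>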
On Fri, Jan 15, 2010 at 1:15 AM, Pallavi Palleti wrote:
Thanks for the reply. I figured out that *userMaxJobsDefault* was
set to 1. I have another query regarding the same. What will
happen if I remove the *userMaxJobsDefault* property? What is
the default value? Would setting a value higher than 1 for a
particular user lead other users' jobs to stall until those jobs
finish? If so, is there a way to specify that a user can take at
most some percentage of the total idle mappers available at that
time, and, if that threshold is exceeded, let users run only
some default number of jobs at a time? This way, we can avoid
stalling other users' jobs and also utilize the cluster
efficiently. Kindly clarify.
Todd Lipcon wrote:
This doesn't sound right. Can you visit
http://jobtracker:50030/scheduler?advanced and maybe send a
screenshot? And also upload the allocations.xml file you're using?
It sounds like you've managed to set either userMaxJobsDefault or
maxRunningJobs for that user to 1.
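For reference, a configuration that would produce exactly this
one-job-per-user behavior looks roughly like the sketch below
(the user name is hypothetical); either element on its own is
enough to cap the user at one running job:

    <?xml version="1.0"?>
    <allocations>
      <!-- Caps every user at one running job by default. -->
      <userMaxJobsDefault>1</userMaxJobsDefault>
      <!-- Or the same cap for one (hypothetical) user. -->
      <user name="pallavi">
        <maxRunningJobs>1</maxRunningJobs>
      </user>
    </allocations>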
On Thu, Jan 14, 2010 at 9:05 PM, Pallavi Palleti wrote:
I am experimenting with the fair scheduler on a cluster of 10
machines. The users are given the default values ("0") for
minMaps and minReduces in the fair scheduler parameters. When I
tried to run two jobs under the same username, the fair
scheduler gave 100% of the fair share to the first job (needs
2 mappers), and the second job (needs 10 mappers) stayed in
waiting mode even though the cluster was totally idle. Allowing
these jobs to run simultaneously would take only 10% of the
total available mappers. However, the second job is not allowed
to run until the first job is over. It would be great if someone
could suggest some parameter tuning that allows efficient
utilization of the cluster. By efficient, I mean allowing jobs
to run when the cluster is idle rather than leaving them in
waiting mode. I am not sure whether setting minMaps and
minReduces for each user would resolve the issue. Kindly clarify.
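For what it's worth, a guaranteed minimum share is configured
per pool in allocations.xml roughly as sketched below (the pool
name and slot counts are illustrative), though as the reply at
the top of this thread notes, fair sharing should split idle
slots among jobs even without any minimums set:

    <?xml version="1.0"?>
    <allocations>
      <!-- Illustrative pool: guarantees at least 10 map slots
           and 5 reduce slots whenever this pool has runnable
           jobs; beyond that, fair sharing applies. -->
      <pool name="experiments">
        <minMaps>10</minMaps>
        <minReduces>5</minReduces>
      </pool>
    </allocations>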