Hi,

We have many different jobs running on a 0.22.0 cluster, each with its own
memory consumption. Some jobs can easily be run with a large number of *.tasks
per job, while others require much more memory and can only be run with a small
number of tasks per node.

Is there any way to reconfigure a running cluster on a per-job basis so we can
set the heap size and the number of map and reduce tasks per node? If not, we
have to force all settings to a level that suits the most demanding jobs, which
has a negative impact on the simpler ones.

Thoughts?
Thanks
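
For concreteness, a minimal sketch of the job-level knob we mean, assuming the 0.20-era property name mapred.child.java.opts for the task JVM heap (0.22 may prefer different names); the class name is hypothetical, and the per-node map/reduce task maximums are TaskTracker settings rather than job settings:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.mapreduce.Job;

    public class LargeHeapJobSubmit {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Give this job's task JVMs a 2 GB heap; other jobs keep the cluster default.
        conf.set("mapred.child.java.opts", "-Xmx2048m");

        Job job = new Job(conf, "memory-hungry-job");
        // Mapper/Reducer classes and input/output paths would be set here as usual.
        System.exit(job.waitForCompletion(true) ? 0 : 1);
      }
    }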

  • Arun C Murthy at Dec 20, 2011 at 12:31 am
    Markus,

    The CapacityScheduler in 0.20.205 (in fact since 0.20.203) supports the notion of 'high memory jobs', with which you can specify, for each job, the number of 'slots' each map/reduce task occupies. For example, you can say that each map of job1 needs 2 slots, and so on (a sketch follows below).

    Unfortunately, I don't know how well this works in 0.22 - I might be wrong, but I strongly doubt it has been tested on 0.22. YMMV.
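
    A minimal sketch of such a high-memory job request, assuming the 0.20.20x property names (mapred.job.map.memory.mb / mapred.job.reduce.memory.mb for the per-job request, mapred.cluster.map.memory.mb for the cluster-wide slot size in mapred-site.xml); the class name is hypothetical and the property names may differ in 0.22:

        import org.apache.hadoop.conf.Configuration;
        import org.apache.hadoop.mapreduce.Job;

        public class HighMemoryJobSubmit {
          public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Ask for 4 GB per map and 6 GB per reduce for this job only. With a
            // cluster slot size of 2 GB (mapred.cluster.map.memory.mb = 2048), the
            // CapacityScheduler accounts each map of this job as 2 slots.
            conf.setLong("mapred.job.map.memory.mb", 4096);
            conf.setLong("mapred.job.reduce.memory.mb", 6144);

            Job job = new Job(conf, "high-memory-job");
            // Mapper/Reducer classes and input/output paths would be set here as usual.
            System.exit(job.waitForCompletion(true) ? 0 : 1);
          }
        }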

    Hope that helps.

    Arun
  • Markus Jelsma at Dec 20, 2011 at 2:19 pm
    Thanks! I'll look into it.
