specify number of mappers/reducers per node per job?
Is there a way to control the maximum number of mappers or reducers per node for a given job? I.e., say I have a cluster and I want to run a job such that no more than 2 mappers run at the same time on each node (while the maximum numbers of map/reduce slots per node are larger values specified by "mapred.tasktracker.map.tasks.maximum" and "mapred.tasktracker.reduce.tasks.maximum").

Thanks,

Michael
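For reference, the two properties named above are cluster-wide TaskTracker settings in mapred-site.xml, not per-job knobs; a minimal sketch (the values 8 and 4 are illustrative):

```xml
<!-- mapred-site.xml: per-TaskTracker slot limits (Hadoop 1.x era).
     These cap concurrent tasks per node across ALL jobs, not per job. -->
<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>8</value>
</property>
<property>
  <name>mapred.tasktracker.reduce.tasks.maximum</name>
  <value>4</value>
</property>
```

Changing them requires restarting the TaskTracker, which is why they cannot serve as a per-job limit.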


  • Harsh J at Oct 21, 2010 at 6:04 am
    AFAIK there is no way to control this from a job submission perspective.
    Maybe the scheduler concept in Hadoop MapReduce can help you.

    --
    Harsh J
    http://www.harshj.com

  • Jiang licht at Oct 22, 2010 at 12:57 am
    Thanks, Harsh.

That's my suspicion as well. It's part of what a scheduler is responsible for, such as the Capacity Scheduler. I want to make sure I didn't miss another node-wise, job-level property that limits the resources available to a job, in parallel with "mapred.tasktracker.*.tasks.maximum".

    Thanks,

    Michael
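As a sketch of the queue-level (not node-level) limits the Capacity Scheduler of that era did offer: something like the following in capacity-scheduler.xml caps a queue's share of total cluster slots. The queue name "limited" and the percentages are illustrative assumptions:

```xml
<!-- capacity-scheduler.xml: limits a queue's share of cluster slots.
     Note this is a cluster-wide cap, not a per-node cap. -->
<property>
  <name>mapred.capacity-scheduler.queue.limited.capacity</name>
  <value>20</value> <!-- guaranteed share: 20% of cluster slots -->
</property>
<property>
  <name>mapred.capacity-scheduler.queue.limited.maximum-capacity</name>
  <value>25</value> <!-- hard cap: at most 25% of cluster slots -->
</property>
```

Submitting the job to the "limited" queue bounds its total parallelism, but the scheduler still decides freely which nodes the tasks land on.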

  • Allen Wittenauer at Oct 22, 2010 at 3:50 pm
  • Jiang licht at Oct 21, 2010 at 11:10 pm
Actually, I believe this cannot be achieved by the current M/R schedulers. As I understand it, a scheduler can only control how much resource in total is assigned to a job; it does not allow external control over which node a mapper/reducer will run on. That is reasonable: the JobTracker has to base its task assignment on the current status of the cluster (e.g., if a node is blacklisted, no tasks can run on it).

    Thanks,

    Michael
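While there was no per-node cap per job, the total task counts could be influenced at the job level. A sketch of the standard job-configuration properties (values illustrative):

```xml
<!-- Job-level knobs that did exist: total task counts, not per-node caps.
     mapred.map.tasks is only a hint; the input splits ultimately
     determine the actual number of map tasks. -->
<property>
  <name>mapred.reduce.tasks</name>
  <value>10</value>
</property>
<property>
  <name>mapred.map.tasks</name>
  <value>40</value>
</property>
```

Combined with the cluster's slot settings, this bounds a job's total parallelism, but how those tasks spread across nodes remains up to the JobTracker.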


Discussion Overview
group: common-user
categories: hadoop
posted: Oct 21, '10 at 1:02a
active: Oct 22, '10 at 3:50p
posts: 5
users: 3
website: hadoop.apache.org...
irc: #hadoop

site design / logo © 2022 Grokbase