Force single map task execution per node for a job
I need a particular job to run at most one map task on each cluster
node at a time.
I've tried setting mapred.tasktracker.map.tasks.maximum=1 in the job
configuration, but it doesn't seem to work.

Could anyone help?
Thanks

Massimo



  • Jim Falgout at Apr 15, 2011 at 3:13 pm
    I'm not sure that is possible. You can use NLineInputFormat with a control file that has one line per node in the cluster; I've used that technique for a data generation program and it works well. This runs a pre-determined number of mappers, but it's up to the scheduler to decide when and where they run, so if other jobs are running concurrently I don't believe you can be guaranteed a distinct mapper per node. (A driver sketch follows below, after the quoted message.)

    Running my data generator job on a quiet cluster did put one mapper on each node, as I wanted. But unless you have more control over your cluster, I believe the behavior is not deterministic.

    -----Original Message-----
    From: Massimo Schiavon
    Sent: Friday, April 15, 2011 10:04 AM
    To: common-user@hadoop.apache.org
    Subject: Force single map task execution per node for a job

    I need a particular job to run at most one map task on each cluster node at a time.
    I've tried setting mapred.tasktracker.map.tasks.maximum=1 in the job configuration, but it doesn't seem to work.

    Could anyone help?
    Thanks

    Massimo

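
    Below is a minimal, hedged sketch of the driver setup Jim describes, using the old "mapred" API of that era. The class names (ControlFileDriver, LinePerNodeMapper) and the control-file/output paths are illustrative assumptions, not anything from the thread; the key pieces are NLineInputFormat and a control file with one line per node, which yields one mapper per line (placement still being up to the scheduler).

        import java.io.IOException;

        import org.apache.hadoop.fs.Path;
        import org.apache.hadoop.io.LongWritable;
        import org.apache.hadoop.io.NullWritable;
        import org.apache.hadoop.io.Text;
        import org.apache.hadoop.mapred.FileInputFormat;
        import org.apache.hadoop.mapred.FileOutputFormat;
        import org.apache.hadoop.mapred.JobClient;
        import org.apache.hadoop.mapred.JobConf;
        import org.apache.hadoop.mapred.MapReduceBase;
        import org.apache.hadoop.mapred.Mapper;
        import org.apache.hadoop.mapred.OutputCollector;
        import org.apache.hadoop.mapred.Reporter;
        import org.apache.hadoop.mapred.lib.NLineInputFormat;

        public class ControlFileDriver {

          // Each mapper receives exactly one control-file line (e.g. a node
          // name) and would do its per-node work there; this sketch just
          // echoes the line back out.
          public static class LinePerNodeMapper extends MapReduceBase
              implements Mapper<LongWritable, Text, Text, NullWritable> {
            public void map(LongWritable offset, Text line,
                            OutputCollector<Text, NullWritable> out,
                            Reporter reporter) throws IOException {
              out.collect(line, NullWritable.get());
            }
          }

          public static void main(String[] args) throws IOException {
            JobConf conf = new JobConf(ControlFileDriver.class);
            conf.setJobName("one-mapper-per-control-line");

            // The control file (args[0]) has one line per cluster node; with
            // one line per split, the framework launches one mapper per line.
            conf.setInputFormat(NLineInputFormat.class);
            conf.setInt("mapred.line.input.format.linespermap", 1);

            conf.setMapperClass(LinePerNodeMapper.class);
            conf.setNumReduceTasks(0);               // map-only job
            conf.setOutputKeyClass(Text.class);
            conf.setOutputValueClass(NullWritable.class);

            FileInputFormat.setInputPaths(conf, new Path(args[0]));
            FileOutputFormat.setOutputPath(conf, new Path(args[1]));

            JobClient.runJob(conf);
          }
        }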
  • Baran cakici at Apr 15, 2011 at 3:16 pm
    mapred.map.tasks=1

    Did you try that? (See the note after the quoted message below.)

    2011/4/15 Massimo Schiavon <mschiavon@volunia.com>
    I need a particular job to run at most one map task on each cluster
    node at a time.
    I've tried setting mapred.tasktracker.map.tasks.maximum=1 in the job
    configuration, but it doesn't seem to work.

    Could anyone help?
    Thanks

    Massimo

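    A note on this suggestion: mapred.map.tasks is only a hint for the total number of map tasks in the whole job (the actual count is driven by the InputFormat's splits), and it does not limit how many maps run concurrently on one node. A minimal sketch of how it might be set, with a hypothetical driver class name:

        import org.apache.hadoop.mapred.JobConf;

        public class MapTasksHint {
          public static void main(String[] args) {
            JobConf conf = new JobConf();
            // Hint at one map task for the whole job; equivalent to passing
            // -D mapred.map.tasks=1 on the command line. At best this caps
            // the job's total maps, not the number of maps per node.
            conf.setNumMapTasks(1);
            System.out.println(conf.get("mapred.map.tasks"));
          }
        }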
  • Juwei Shi at Apr 15, 2011 at 3:57 pm
    You should set mapred.tasktracker.map.tasks.maximum=1 on each node; it is a TaskTracker setting, not a per-job one. (See the snippet after the quoted message below.)

    2011/4/15 baran cakici <barancakici@gmail.com>
    mapred.map.tasks=1

    did you try that??

    2011/4/15 Massimo Schiavon <mschiavon@volunia.com>
    I need a particular job to run at most one map task on each cluster
    node at a time.
    I've tried setting mapred.tasktracker.map.tasks.maximum=1 in the job
    configuration, but it doesn't seem to work.

    Could anyone help?
    Thanks

    Massimo



    --
    - Juwei
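
    For reference, a minimal sketch of what that could look like in mapred-site.xml on every TaskTracker node (this is a daemon-side setting: the TaskTrackers must be restarted to pick it up, and it caps concurrent map slots on that node for all jobs, not just this one):

        <!-- mapred-site.xml on each TaskTracker node -->
        <property>
          <name>mapred.tasktracker.map.tasks.maximum</name>
          <value>1</value>
        </property>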
  • Harsh J at Apr 15, 2011 at 5:00 pm
    Hello Massimo,

    This is sort of possible with a custom InputFormat (the splits returned
    by getSplits carry the real scheduling information, if you notice). But
    the host-to-task mapping is not strongly guaranteed: if a node's slots
    are full at launch time, the task can get scheduled elsewhere. (A sketch
    of such an InputFormat follows at the end of this message.)

    On Fri, Apr 15, 2011 at 8:33 PM, Massimo Schiavon wrote:
    I need a particular job to run at most one map task on each cluster
    node at a time.
    I've tried setting mapred.tasktracker.map.tasks.maximum=1 in the job
    configuration, but it doesn't seem to work.
    --
    Harsh J
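
    A minimal sketch of the idea Harsh describes, using the old "mapred" API. Everything here is illustrative: the class names and the "one.map.per.host.hosts" property (a comma-separated list of TaskTracker hostnames supplied by the job driver) are made up, and the location hint returned by each split is only advisory; the scheduler may still run the task elsewhere if the preferred node's slots are busy.

        import java.io.DataInput;
        import java.io.DataOutput;
        import java.io.IOException;

        import org.apache.hadoop.io.NullWritable;
        import org.apache.hadoop.io.Text;
        import org.apache.hadoop.mapred.InputFormat;
        import org.apache.hadoop.mapred.InputSplit;
        import org.apache.hadoop.mapred.JobConf;
        import org.apache.hadoop.mapred.RecordReader;
        import org.apache.hadoop.mapred.Reporter;

        public class OneMapPerHostInputFormat implements InputFormat<Text, NullWritable> {

          // One split per host; getLocations() is the (advisory) scheduling hint.
          public static class HostSplit implements InputSplit {
            private String host = "";

            public HostSplit() { }                      // needed for deserialization
            public HostSplit(String host) { this.host = host; }

            public long getLength() { return 0; }
            public String[] getLocations() { return new String[] { host }; }
            public void write(DataOutput out) throws IOException { Text.writeString(out, host); }
            public void readFields(DataInput in) throws IOException { host = Text.readString(in); }
          }

          public InputSplit[] getSplits(JobConf conf, int numSplits) {
            // "one.map.per.host.hosts" is a made-up property: a comma-separated
            // list of TaskTracker hostnames set by the job driver.
            String[] hosts = conf.get("one.map.per.host.hosts", "").split(",");
            InputSplit[] splits = new InputSplit[hosts.length];
            for (int i = 0; i < hosts.length; i++) {
              splits[i] = new HostSplit(hosts[i].trim());
            }
            return splits;
          }

          public RecordReader<Text, NullWritable> getRecordReader(
              final InputSplit split, JobConf conf, Reporter reporter) {
            // A single-record reader that hands the mapper its target hostname.
            return new RecordReader<Text, NullWritable>() {
              private boolean done = false;
              public boolean next(Text key, NullWritable value) {
                if (done) return false;
                key.set(((HostSplit) split).getLocations()[0]);
                done = true;
                return true;
              }
              public Text createKey() { return new Text(); }
              public NullWritable createValue() { return NullWritable.get(); }
              public long getPos() { return done ? 1 : 0; }
              public float getProgress() { return done ? 1.0f : 0.0f; }
              public void close() { }
            };
          }
        }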
