FAQ

  • Gopal Gandhi at Jul 31, 2008 at 11:46 pm
    The motivation is to control the maximum number of mappers of a job. For example, the input data is 246 MB; divided by the 64 MB block size and rounded up, that is 4 blocks, so by default 4 mappers are launched, one per block.
    What I want is to set the job's maximum number of mappers to 2, so that 2 mappers are launched first and, when they complete the first 2 blocks, another 2 mappers start on the remaining 2 blocks. Does Hadoop provide a way to do this?
  • Goel, Ankur at Jul 31, 2008 at 5:18 am
    How big is your cluster? Assuming you are running a single-node cluster:

    hadoop-default.xml has a parameter 'mapred.map.tasks' that is set to 2.
    So by default, no matter how many map tasks the framework calculates,
    only 2 map tasks will execute on a single-node cluster.

    -----Original Message-----
    From: Gopal Gandhi
    Sent: Thursday, July 31, 2008 4:38 AM
    To: core-user@hadoop.apache.org
    Cc: core-dev@hadoop.apache.org
    Subject: How can I control Number of Mappers of a job?

    The motivation is to control the max # of mappers of a job. For example,
    the input data is 246MB, divided by 64M is 4. If by default there will
    be 4 mappers launched on the 4 blocks.
    What I want is to set its max # of mappers as 2, so that 2 mappers are
    launched first and when they completes on the first 2 blocks, another 2
    mappers start on the rest 2 blocks. Does Hadoop provide a way?
  • Andreas Kostyrka at Aug 4, 2008 at 5:39 am
    Well, the only way I've found to reliably fix the number of map tasks is
    to use compressed input files, which forces Hadoop to assign exactly one
    file to each map task ;)

    Andreas
    On Thursday 31 July 2008 21:30:33 Gopal Gandhi wrote:
    Thank you, finally someone is interested in my question =)
    My cluster contains more than one machine. Please don't get me wrong :-). I
    don't want to limit the total mappers on one node (by mapred.map.tasks).
    What I want is to limit the total mappers for one job. The motivation is
    that I have 2 jobs to run at the same time, and they have "the same input
    data in Hadoop". I found that one job has to wait until the other finishes
    its mapping. Because the 2 jobs are submitted by 2 different people, I
    don't want one job to starve. So I want to limit the first job's total
    mappers so that the 2 jobs can run simultaneously.



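
The arithmetic in the original question (246 MB of input, 64 MB blocks, a desired cap of 2 concurrent mappers) can be sketched as below. This is only an illustration of the numbers discussed in the thread: the class and method names are hypothetical, nothing here calls a Hadoop API, and whether Hadoop can enforce such a per-job cap is exactly what the thread leaves open.

```java
// Illustrative arithmetic only; this does not use or configure Hadoop.
final class MapperWaves {

    // Number of input splits when the input is cut into fixed-size blocks.
    // Ceiling division: a partial last block still gets its own split.
    static long splitCount(long inputMb, long blockMb) {
        return (inputMb + blockMb - 1) / blockMb;
    }

    // Number of sequential "waves" of mappers if at most maxConcurrent
    // mappers of the job run at once (the hypothetical per-job cap).
    static long waveCount(long splits, long maxConcurrent) {
        return (splits + maxConcurrent - 1) / maxConcurrent;
    }

    public static void main(String[] args) {
        long splits = splitCount(246, 64); // 246 MB input, 64 MB blocks
        long waves = waveCount(splits, 2); // cap of 2 concurrent mappers
        System.out.println(splits + " splits, " + waves + " waves");
    }
}
```

With the figures from the question this prints `4 splits, 2 waves`, matching the behaviour the original poster describes: two mappers on the first two blocks, then two more on the remaining two.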
