Hi,

I was trying to create a test based on a MapReduce job in local mode
to test various partitioning issues.

But curiously, whenever I switch MapReduce into local mode, I can't
seem to configure multiple reduce tasks.

Indeed, upon some investigation I found that the following fragment in
LocalJobRunner forces the number of reduce tasks to 1 whenever it is
greater than 1 or negative:

    int numReduceTasks = this.job.getNumReduceTasks();
    if ((numReduceTasks > 1) || (numReduceTasks < 0))
    {
      numReduceTasks = 1;
      this.job.setNumReduceTasks(1);
    }
    outputCommitter.setupJob(jContext);
    this.status.setSetupProgress(1.0F);

    Map mapOutputFiles = new HashMap();
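
Read on its own, the check in this fragment can be sketched as a small standalone function. This is only an illustration of the condition, not Hadoop code; the names `LocalReducerClamp` and `effectiveReduceTasks` are made up for the example.

```java
public class LocalReducerClamp {
    /**
     * Mirrors the condition in the quoted fragment: a requested count
     * that is negative or greater than 1 is replaced by 1.
     */
    public static int effectiveReduceTasks(int requested) {
        if (requested > 1 || requested < 0) {
            return 1;
        }
        return requested;
    }

    public static void main(String[] args) {
        // Print the effective count for a few requested values.
        for (int requested : new int[] {-1, 0, 1, 4}) {
            System.out.println(requested + " -> " + effectiveReduceTasks(requested));
        }
    }
}
```

A request for 0 reducers passes through unchanged, while any negative value or any value above 1 comes back as 1.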


Is this a fundamental limitation of the local MapReduce mode? What if
I need to write a unit test that checks various partitioning
functions? Is there a workaround?

Also, I don't remember these problems when writing tests based on
local MapReduce in previous versions (this is cdh3b4), although I
cannot be sure I ran into exactly the same situation before.

Thanks.
-Dmitriy
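
One possible way to check a partitioning function without depending on how many reducers the local runner allows is to call the function directly with the partition count of interest. The sketch below assumes a stand-in interface, `KeyPartitioner`, which is hypothetical and is not Hadoop's `Partitioner` type; the sample implementation uses the same non-negative-hash-modulo formula as Hadoop's `HashPartitioner`. In a real test the same loop could be pointed at the actual partitioner class.

```java
import java.util.Arrays;
import java.util.List;

public class PartitionerCheck {
    /** Hypothetical stand-in for a partitioning function; not a Hadoop type. */
    interface KeyPartitioner<K> {
        int getPartition(K key, int numPartitions);
    }

    /** Hash partitioning: non-negative hash code modulo the partition count. */
    static final class HashKeyPartitioner<K> implements KeyPartitioner<K> {
        @Override
        public int getPartition(K key, int numPartitions) {
            return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
        }
    }

    /** Counts keys per partition and fails if any partition index is out of range. */
    static <K> int[] distribute(KeyPartitioner<K> partitioner, List<K> keys, int numPartitions) {
        int[] counts = new int[numPartitions];
        for (K key : keys) {
            int partition = partitioner.getPartition(key, numPartitions);
            if (partition < 0 || partition >= numPartitions) {
                throw new IllegalStateException("partition out of range: " + partition);
            }
            counts[partition]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("a", "b", "c", "d", "e", "f");
        int[] counts = distribute(new HashKeyPartitioner<String>(), keys, 3);
        System.out.println(Arrays.toString(counts)); // prints [2, 2, 2]
    }
}
```

This only exercises the partitioning logic itself; it does not cover how a running job routes map output to reducers.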


  • Jason at May 2, 2011 at 11:47 pm
    Dmitriy,

    I remember I had the same problem with local jobs when I tried to
    debug my multi-reducer use cases, so I had to create this small patch
    that resolves the issue.
    You can put these classes into the org.apache.hadoop.mapred package in
    your local project and make sure they precede Hadoop's jars on the
    classpath.

    My patch is based on the Cloudera 0.20.2+320 release.

    Hope this helps.

  • Dmitriy Lyubimov at May 3, 2011 at 12:03 am
    Thanks a bunch!

    (Is there any chance you could post just a diff?)

    -d
  • Jason at May 3, 2011 at 12:14 am
    I am attaching the originals so you can work out the diffs on your own :)
  • Tom White at May 3, 2011 at 12:28 am
    See also https://issues.apache.org/jira/browse/MAPREDUCE-434 which has
    a patch for this issue.

    Cheers,
    Tom
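
Jason's suggestion relies on the patched classes being found before Hadoop's own copies on the classpath. A rough way to confirm which copy is actually loaded is to print where a class came from. The following is a minimal sketch using only standard JDK calls; the class name `ClassOrigin` and the helper `originOf` are made up for illustration.

```java
import java.net.URL;
import java.security.CodeSource;

public class ClassOrigin {
    /** Returns the location a class was loaded from, or null if none is recorded. */
    public static URL originOf(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        return source == null ? null : source.getLocation();
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Pass a fully qualified class name as the first argument, for example
        // org.apache.hadoop.mapred.LocalJobRunner; defaults to this class.
        String name = args.length > 0 ? args[0] : ClassOrigin.class.getName();
        Class<?> type = Class.forName(name);
        System.out.println(name + " loaded from " + originOf(type));
    }
}
```

Run with the same classpath as the test. If the printed location is the Hadoop jar rather than the local project's output directory, the patched class is probably not the one taking effect.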

Discussion Overview
group: mapreduce-user @
categories: hadoop
posted: May 2, '11 at 10:53p
active: May 3, '11 at 12:28a
posts: 5
users: 3
website: hadoop.apache.org...
irc: #hadoop
