Dear All,

When using Hadoop, I noticed that the reduce step starts while the mappers
are still running. My project requires that the reduce step not start until
all the mappers have finished. Does anybody know how to achieve this with
the Hadoop API, so that the reducer starts only after every mapper has
finished?

Thanks
--
Guangfeng Jin

  • Deyaa Adranale at Jul 28, 2008 at 10:31 am
    As far as I know, a reducer has three tasks: fetching the mappers'
    output, sorting it, and calling the reduce function.
    As soon as some mappers finish, the reducer starts fetching their
    output to save time.
    Neither sorting nor the reduce function can start before all the
    mappers have finished and all their output is available locally.
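
To make that ordering concrete, here is a toy sketch in plain Java. It is not Hadoop code and uses no Hadoop API; the class `ReducePhaseSketch`, its `run` method, and the word-count style reduce are made up for illustration. It only models the sequence described above: copying overlaps with the maps, while sorting and reducing wait for the last map.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy model of a single reduce task. Hypothetical names, not Hadoop classes. */
class ReducePhaseSketch {

    /** Returns the ordered event log for a job whose maps emit the given keys. */
    static List<String> run(List<List<String>> mapOutputs) {
        List<String> events = new ArrayList<>();
        List<String> fetched = new ArrayList<>();

        // Copy phase: overlaps with the map phase. The output of each map is
        // fetched as soon as that map finishes, while later maps are pending.
        for (int m = 0; m < mapOutputs.size(); m++) {
            events.add("map " + m + " finished");
            fetched.addAll(mapOutputs.get(m));
            events.add("copy output of map " + m);
        }

        // Sort phase: needs the output of every map, so it can only start here.
        // A TreeMap groups equal keys and keeps them in sorted order.
        events.add("sort");
        Map<String, Integer> counts = new TreeMap<>();
        for (String key : fetched) {
            counts.merge(key, 1, Integer::sum);
        }

        // Reduce phase: the (assumed) reduce function runs once per key.
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            events.add("reduce " + e.getKey() + "=" + e.getValue());
        }
        return events;
    }

    public static void main(String[] args) {
        List<List<String>> outputs = Arrays.asList(
                Arrays.asList("b", "a"),
                Arrays.asList("a"),
                Arrays.asList("c", "b"));
        for (String event : run(outputs)) {
            System.out.println(event);
        }
    }
}
```

Running `main` prints the copy events interleaved with the map completions, followed by `sort` and the `reduce` lines at the very end. This is why a reducer can show progress while mappers are still running without any reduce call having happened yet.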

    I don't know whether you can prevent the copying of mapper output before
    all mappers finish. In any case, there would be no point in doing so.

    Hope that helps.

    Deyaa

  • Shengkai Zhu at Jul 28, 2008 at 10:32 am
    The actual reduce logic starts only after all map tasks have finished.

    Is the behavior still unexpected?

    --
    Jash Zhu (朱盛凯)
    Software School, Fudan University
  • 晋光峰 at Jul 29, 2008 at 5:09 am
    I got it. Thanks!

    --
    Guangfeng Jin
  • Rae l at Jul 29, 2008 at 8:26 am

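    Based on my experience using Hadoop and reading its source code, the Reducer does not run before the Mapper. Sometimes you can observe that the mapper starts first, but this has no effect on the program's results.

    BTW: I did not realize so many people in China were working with Hadoop. I also started studying and deploying Hadoop a few months ago for a project at my company. Given that, how about setting up a Chinese-language Hadoop discussion forum? Or does anyone know of one that already exists? According to the PoweredBy page, Koubei in China is already using Hadoop:
    http://wiki.apache.org/hadoop/PoweredBy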

    --
    程任全
  • Xuebing Yan at Jul 29, 2008 at 11:07 am
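    Alibaba's Search Technology R&D Center is already discussing a Chinese Hadoop community with the Hadoop PMC,
    and Chinese documentation for Hadoop 0.17 may be released soon.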

    -闫雪冰
  • Rae l at Jul 29, 2008 at 2:59 pm
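    Good.

    http://www.hadoop.org.cn/
    This appears to be a blog set up by one individual. A lookup of the address returns:

    www.hadoop.org.cn >> 218.240.14.21

    * Primary site data: Beijing, Zhongguancun Information Engineering Co., Ltd.
    * Lookup result 2: Beijing, Zhongguancun Information Engineering Co., Ltd.
    * Lookup result 3: Beijing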

    --
    程任全
  • Daniel Yu at Jul 29, 2008 at 4:22 pm
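    I am currently studying abroad, and my graduation project happens to use Hadoop and HBase. A Chinese community would be a great thing to have.
    I hope the related documentation can be kept up to date.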
  • Lohit at Jul 29, 2008 at 4:33 pm
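    The wiki and the documentation should help. If they do not, please open a JIRA asking for better documentation, which would help everyone :)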
  • Gopal Gandhi at Jul 30, 2008 at 10:51 pm
    Yes, the reducer starts early in order to sort, but it does not actually reduce yet.

Discussion Overview
group: common-user @
categories: hadoop
posted: Jul 28, '08 at 10:12a
active: Jul 30, '08 at 10:51p
posts: 11
users: 8
website: hadoop.apache.org...
irc: #hadoop
