Hello,
I was wondering if someone could give me some information on how Hadoop does the
sorting. From what I have read, there does not seem to be a map class or a reduce
class. Where and how is the sorting parallelized?


Best Regards from Buffalo

Abhishek Agrawal

SUNY- Buffalo
(716-435-7122)


  • Gang Luo at Feb 19, 2010 at 10:06 pm
    Hi,
    The sorting is done by the MapReduce framework itself. On the map side, each output record first goes into a sort buffer, where partitioning, sorting, and combining (if a combiner is configured) happen. If the buffer fills up and spills more than once, the sorted spills are merged in multiple passes so that each map task ends up with a single sorted output. On the reduce side, the data fetched from the map tasks is merged; because each map output is already sorted, only a merge (the merge step of merge sort) is needed here. The merge runs in multiple rounds if necessary.
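
The flow described above can be imitated in a few lines of plain Python. This is only a toy sketch and not Hadoop code: the names `partition`, `map_side`, and `reduce_side`, the tiny `buffer_size`, and the byte-sum partitioner are assumptions made for illustration, and the combiner step is left out. Each simulated map task sorts its buffer-sized spills and merges them into one sorted run per partition; each simulated reducer then only merges the already sorted runs it receives.

```python
import heapq
from itertools import groupby
from operator import itemgetter


def partition(key, num_reducers):
    # Hypothetical stand-in for a hash partitioner: deterministic byte sum.
    return sum(key.encode("utf-8")) % num_reducers


def map_side(records, num_reducers, buffer_size):
    """Simulate one map task's output handling.

    Records are taken in buffer-sized chunks ("spills"); each spill is sorted
    by (partition, key), then all spills are merged into one sorted stream.
    """
    def sort_key(kv):
        return (partition(kv[0], num_reducers), kv[0])

    spills = [
        sorted(records[start:start + buffer_size], key=sort_key)
        for start in range(0, len(records), buffer_size)
    ]
    merged = heapq.merge(*spills, key=sort_key)

    # Split the merged stream into one sorted run per reducer partition.
    runs = {p: [] for p in range(num_reducers)}
    for key, value in merged:
        runs[partition(key, num_reducers)].append((key, value))
    return runs


def reduce_side(map_outputs, p):
    """Simulate reducer p: merge the sorted runs from every map task.

    No full sort is needed because each incoming run is already sorted.
    """
    runs = [output[p] for output in map_outputs]
    merged = heapq.merge(*runs, key=itemgetter(0))
    return [
        (key, [value for _, value in group])
        for key, group in groupby(merged, key=itemgetter(0))
    ]


if __name__ == "__main__":
    map1 = [("pear", 1), ("apple", 1), ("fig", 1), ("apple", 1)]
    map2 = [("fig", 1), ("kiwi", 1), ("apple", 1)]
    outputs = [map_side(m, num_reducers=2, buffer_size=2) for m in (map1, map2)]
    for p in range(2):
        print(p, reduce_side(outputs, p))
```

In this sketch the parallelism comes from the fact that every call to `map_side` and every call to `reduce_side` is independent of the others, so each could run on a different machine.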

    -Gang



    ----- Original Message ----
    From: "aa225@buffalo.edu" <aa225@buffalo.edu>
    To: common-user@hadoop.apache.org
    Sent: 2010/2/19 (Fri) 2:25:50 PM
    Subject: Some information on Hadoop Sort

    Hello,
    I was wondering if someone could give me some information on how Hadoop does the
    sorting. From what I have read, there does not seem to be a map class or a reduce
    class. Where and how is the sorting parallelized?


    Best Regards from Buffalo

    Abhishek Agrawal

    SUNY- Buffalo
    (716-435-7122)



