Hi,

I see in the code that while we assign multiple map tasks, we assign only
one reduce task per tasktracker during each heartbeat.

Is there a write-up somewhere on why this design decision was made?

Thanks
Sudhan S


  • Harsh J at Aug 24, 2011 at 1:50 pm
    A reducer's primary work begins with pulling in data files from all
    the other tasktrackers. Assigning multiple reduce tasks in one go
    would therefore tax the node in terms of the number of network
    connections, since all of them would start connecting and pulling at
    about the same time. For this reason it was chosen to assign only one
    reduce task per heartbeat, which gives each reduce task some breathing
    room to finish a round of connections before another one comes in to
    do the same.


    --
    Harsh J
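
The assignment rule described in the reply above can be sketched roughly as follows. This is only an illustration under stated assumptions, not Hadoop's actual scheduler code: the class HeartbeatSketch, the method assign, and the constant MAX_REDUCES_PER_HEARTBEAT are hypothetical names, and the cap of one reduce task per heartbeat is taken from the discussion rather than from the source tree.

```java
/** Hypothetical sketch of per-heartbeat task assignment; not Hadoop's real code. */
class HeartbeatSketch {

    /** Number of tasks handed to one tasktracker in reply to one heartbeat. */
    static final class Assignment {
        final int maps;
        final int reduces;

        Assignment(int maps, int reduces) {
            this.maps = maps;
            this.reduces = reduces;
        }
    }

    // Assumption from the thread: at most one reduce task per heartbeat,
    // so that reducers do not all start their shuffle connections at once.
    static final int MAX_REDUCES_PER_HEARTBEAT = 1;

    static Assignment assign(int freeMapSlots, int freeReduceSlots,
                             int pendingMaps, int pendingReduces) {
        // Map tasks: fill every free slot that has pending work.
        int maps = Math.max(0, Math.min(freeMapSlots, pendingMaps));
        // Reduce tasks: same limit, then capped at one per heartbeat.
        int reduces = Math.max(0, Math.min(MAX_REDUCES_PER_HEARTBEAT,
                Math.min(freeReduceSlots, pendingReduces)));
        return new Assignment(maps, reduces);
    }

    public static void main(String[] args) {
        int freeMapSlots = 4, freeReduceSlots = 2;
        int pendingMaps = 6, pendingReduces = 3;
        // Simulate three heartbeats; assigned tasks keep their slots busy.
        for (int heartbeat = 1; heartbeat <= 3; heartbeat++) {
            Assignment a = assign(freeMapSlots, freeReduceSlots,
                                  pendingMaps, pendingReduces);
            freeMapSlots -= a.maps;
            pendingMaps -= a.maps;
            freeReduceSlots -= a.reduces;
            pendingReduces -= a.reduces;
            System.out.println("heartbeat " + heartbeat + ": maps=" + a.maps
                    + ", reduces=" + a.reduces);
        }
    }
}
```

In this sketch a tasktracker with several free reduce slots still receives its reduce tasks one heartbeat apart, which staggers the start of each reducer's fetch phase, while map slots are filled all at once.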

Discussion Overview
group: mapreduce-user @
categories: hadoop
posted: Aug 24, '11 at 10:49a
active: Aug 24, '11 at 1:50p
posts: 2
users: 2
website: hadoop.apache.org...
irc: #hadoop
