Amogh Vasekar
at Aug 3, 2009 at 4:54 am
And the combiner runs while fetching the outputs, right?
-----Original Message-----
From: Arun C Murthy
Sent: Monday, August 03, 2009 9:27 AM
To: [email protected]
Cc: [email protected]
Subject: Re: why reduce task can be scheduled before map tasks are 100% completed?
That check ensures sufficient #maps are completed before any of the
reduces for the job are started.
The reduces 'shuffle' map outputs from completed maps, but don't get
into the 'reduce' phase until all map-outputs are copied over.
Arun
PS: Moving this to mapreduce-dev@
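
The distinction described above (reduces are scheduled once enough maps have finished, but only enter the 'reduce' phase after every map output has been copied) can be illustrated with a minimal sketch. This is not the Hadoop source: the class SlowstartSketch, the helper canEnterReducePhase(), and the 0.25 slowstart fraction are hypothetical and chosen only for illustration; only the comparison "finishedMapTasks >= completedMapsForReduceSlowstart" comes from the thread.

```java
// Simplified sketch of the two separate conditions discussed in this thread.
// All names except finishedMapTasks / completedMapsForReduceSlowstart are
// hypothetical; this is not the JobInProgress implementation.
class SlowstartSketch {

    // Assumed: the threshold is some fraction of the total number of maps, rounded up.
    static int completedMapsForReduceSlowstart(float slowstartFraction, int numMapTasks) {
        return (int) Math.ceil(slowstartFraction * numMapTasks);
    }

    // The check quoted in the original question: may a reduce task be scheduled yet?
    static boolean scheduleReduces(int finishedMapTasks, int completedMapsForReduceSlowstart) {
        return finishedMapTasks >= completedMapsForReduceSlowstart;
    }

    // A scheduled reduce keeps shuffling; it enters the 'reduce' phase
    // only once the outputs of all maps have been copied over.
    static boolean canEnterReducePhase(int copiedMapOutputs, int numMapTasks) {
        return copiedMapOutputs == numMapTasks;
    }

    public static void main(String[] args) {
        int numMapTasks = 8;
        int threshold = completedMapsForReduceSlowstart(0.25f, numMapTasks);

        for (int finished = 0; finished <= numMapTasks; finished++) {
            boolean scheduled = scheduleReduces(finished, threshold);
            // A reduce can only have copied outputs of maps that have finished.
            boolean reducing = scheduled && canEnterReducePhase(finished, numMapTasks);
            System.out.println("finished maps=" + finished
                    + " reduce scheduled=" + scheduled
                    + " in reduce phase=" + reducing);
        }
    }
}
```

In this sketch a reduce task becomes schedulable well before the last map finishes, yet it stays in the shuffle until all map outputs are available, which is why both statements in the thread hold at the same time.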
On Aug 2, 2009, at 5:02 AM, 我的Gmail邮箱 wrote:
Hi, everyone.
In class org.apache.hadoop.mapred.JobInProgress, there is a public
method, scheduleReduces(), which returns true if "finishedMapTasks >=
completedMapsForReduceSlowstart",
and then the scheduler can schedule a new reduce task for a given
TaskTracker.
But as far as I know, a reduce cannot be started until the maps are 100%
completed. Can
anyone explain this? Thanks a lot.