Running Hadoop on EC2 using 32 bit small instances with 5 slaves. The job has
5 map tasks and 1 reduce task. Each map task is 5 minutes long.
After 5 minutes and 14 seconds, 4 map tasks completed but the last one is
now stalling the entire process.
Here are the logs for JobTracker:
----------------------------------------
[start of JobTracker Logs]
----------------------------------------
2009-09-30 04:31:24,468 INFO org.apache.hadoop.mapred.JobTracker: Adding
task 'attempt_200909300419_0001_m_000006_0' to tip
task_200909300419_0001_m_000006, for tracker
'tracker_domU-12-31-39-02-66-21.compute-1.internal:localhost.localdomain/127.0.0.1:58857'
2009-09-30 04:31:28,009 INFO org.apache.hadoop.mapred.JobInProgress: Task
'attempt_200909300419_0001_m_000006_0' has completed
task_200909300419_0001_m_000006 successfully.
2009-09-30 04:31:28,012 INFO org.apache.hadoop.mapred.JobTracker: Adding
task 'attempt_200909300419_0001_m_000001_0' to tip
task_200909300419_0001_m_000001, for tracker
'tracker_domU-12-31-39-02-66-21.compute-1.internal:localhost.localdomain/127.0.0.1:58857'
2009-09-30 04:31:28,015 INFO org.apache.hadoop.mapred.JobInProgress:
Choosing data-local task task_200909300419_0001_m_000001
2009-09-30 04:31:28,505 INFO org.apache.hadoop.mapred.JobTracker: Adding
task 'attempt_200909300419_0001_m_000002_0' to tip
task_200909300419_0001_m_000002, for tracker
'tracker_domU-12-31-39-02-F5-D8.compute-1.internal:localhost.localdomain/127.0.0.1:60031'
2009-09-30 04:31:28,505 INFO org.apache.hadoop.mapred.JobInProgress:
Choosing data-local task task_200909300419_0001_m_000002
2009-09-30 04:31:29,577 INFO org.apache.hadoop.mapred.JobTracker: Adding
task 'attempt_200909300419_0001_m_000000_0' to tip
task_200909300419_0001_m_000000, for tracker
'tracker_domU-12-31-39-03-BA-14.compute-1.internal:localhost.localdomain/127.0.0.1:54570'
2009-09-30 04:31:29,577 INFO org.apache.hadoop.mapred.JobInProgress:
Choosing data-local task task_200909300419_0001_m_000000
2009-09-30 04:31:30,851 INFO org.apache.hadoop.mapred.JobTracker: Adding
task 'attempt_200909300419_0001_m_000003_0' to tip
task_200909300419_0001_m_000003, for tracker
'tracker_domU-12-31-39-03-41-B6.compute-1.internal:localhost.localdomain/127.0.0.1:42359'
2009-09-30 04:31:30,851 INFO org.apache.hadoop.mapred.JobInProgress:
Choosing data-local task task_200909300419_0001_m_000003
2009-09-30 04:31:31,965 INFO org.apache.hadoop.mapred.JobTracker: Adding
task 'attempt_200909300419_0001_m_000004_0' to tip
task_200909300419_0001_m_000004, for tracker
'tracker_domU-12-31-39-02-25-A1.compute-1.internal:localhost.localdomain/127.0.0.1:56865'
2009-09-30 04:31:31,965 INFO org.apache.hadoop.mapred.JobInProgress:
Choosing data-local task task_200909300419_0001_m_000004
2009-09-30 04:31:33,095 INFO org.apache.hadoop.mapred.JobTracker: Adding
task 'attempt_200909300419_0001_r_000000_0' to tip
task_200909300419_0001_r_000000, for tracker
'tracker_domU-12-31-39-02-66-21.compute-1.internal:localhost.localdomain/127.0.0.1:58857'
2009-09-30 04:36:33,940 INFO org.apache.hadoop.mapred.JobInProgress: Task
'attempt_200909300419_0001_m_000000_0' has completed
task_200909300419_0001_m_000000 successfully.
2009-09-30 04:36:33,940 INFO org.apache.hadoop.mapred.ResourceEstimator:
measured blowup on task_200909300419_0001_m_000000 was 803/200 = 4.015
2009-09-30 04:36:33,941 INFO org.apache.hadoop.mapred.ResourceEstimator: new
estimate is blowup = 4.015
2009-09-30 04:36:33,945 INFO org.apache.hadoop.mapred.JobTracker: Adding
task 'attempt_200909300419_0001_m_000002_1' to tip
task_200909300419_0001_m_000002, for tracker
'tracker_domU-12-31-39-03-BA-14.compute-1.internal:localhost.localdomain/127.0.0.1:54570'
2009-09-30 04:36:33,945 INFO org.apache.hadoop.mapred.JobInProgress:
Choosing data-local task task_200909300419_0001_m_000002
2009-09-30 04:36:34,793 INFO org.apache.hadoop.mapred.JobInProgress: Task
'attempt_200909300419_0001_m_000002_0' has completed
task_200909300419_0001_m_000002 successfully.
2009-09-30 04:36:34,793 INFO org.apache.hadoop.mapred.ResourceEstimator:
measured blowup on task_200909300419_0001_m_000002 was 317/190 =
1.668421052631579
2009-09-30 04:36:34,794 INFO org.apache.hadoop.mapred.ResourceEstimator: new
estimate is blowup = 2.841710526315789
2009-09-30 04:36:34,796 INFO org.apache.hadoop.mapred.JobTracker: Adding
task 'attempt_200909300419_0001_m_000003_1' to tip
task_200909300419_0001_m_000003, for tracker
'tracker_domU-12-31-39-02-F5-D8.compute-1.internal:localhost.localdomain/127.0.0.1:60031'
2009-09-30 04:36:34,796 INFO org.apache.hadoop.mapred.JobInProgress:
Choosing data-local task task_200909300419_0001_m_000003
2009-09-30 04:36:35,170 INFO org.apache.hadoop.mapred.JobInProgress: Task
'attempt_200909300419_0001_m_000003_0' has completed
task_200909300419_0001_m_000003 successfully.
2009-09-30 04:36:35,170 INFO org.apache.hadoop.mapred.ResourceEstimator:
measured blowup on task_200909300419_0001_m_000003 was 73/189 =
0.3862433862433862
2009-09-30 04:36:35,170 INFO org.apache.hadoop.mapred.ResourceEstimator: new
estimate is blowup = 2.023221479624988
2009-09-30 04:36:35,171 INFO org.apache.hadoop.mapred.JobTracker: Adding
task 'attempt_200909300419_0001_m_000001_1' to tip
task_200909300419_0001_m_000001, for tracker
'tracker_domU-12-31-39-03-41-B6.compute-1.internal:localhost.localdomain/127.0.0.1:42359'
2009-09-30 04:36:35,171 INFO org.apache.hadoop.mapred.JobInProgress:
Choosing data-local task task_200909300419_0001_m_000001
2009-09-30 04:36:35,854 INFO org.apache.hadoop.mapred.JobTracker:
attempt_200909300419_0001_m_000002_1 is 1909 ms debug.
2009-09-30 04:36:36,038 INFO org.apache.hadoop.mapred.JobInProgress: Task
'attempt_200909300419_0001_m_000004_0' has completed
task_200909300419_0001_m_000004 successfully.
2009-09-30 04:36:36,039 INFO org.apache.hadoop.mapred.ResourceEstimator:
measured blowup on task_200909300419_0001_m_000004 was 918/187 =
4.909090909090909
2009-09-30 04:36:36,039 INFO org.apache.hadoop.mapred.ResourceEstimator: new
estimate is blowup = 2.744688836991468
2009-09-30 04:36:43,966 INFO org.apache.hadoop.mapred.JobTracker: Removed
completed task 'attempt_200909300419_0001_m_000002_1' from
'tracker_domU-12-31-39-03-BA-14.compute-1.internal:localhost.localdomain/127.0.0.1:54570'
2009-09-30 04:36:44,853 INFO org.apache.hadoop.mapred.JobTracker: Removed
completed task 'attempt_200909300419_0001_m_000003_1' from
'tracker_domU-12-31-39-02-F5-D8.compute-1.internal:localhost.localdomain/127.0.0.1:60031'
----------------------------------------
[End of JobTracker Logs]
----------------------------------------
Amogh Vasekar-2 wrote:
Hi,
Can you provide info on the input like compression etc? Also, are you
using cached files in your map tasks? It might be helpful if you paste the
logs here after blanking your system specific info., as then one can find
out where till the reduce it went or if the copy phase started at all.
Thanks,
Amogh
-----Original Message-----
From: achilles852
Sent: Wednesday, September 30, 2009 6:38 AM
To:
[email protected]Subject: Re: last map task taking too long
Basically, it finishes what it is supposed to do (I view the logs to find
out), but does not move onto the reduce stage.
Ted Dunning wrote:
Is that last map task actually running, or is it pending?
On Tue, Sep 29, 2009 at 5:57 PM, achilles852 <
[email protected]>
wrote:
Hey.. I am trying to write a small mapreduce program. I launch a few map
tasks, each of which should complete within a certain time (say 5
minutes)... all the tasks complete within 5 minutes except the last one
-
which takes around 10 times more the time taken by all other map
tasks.....any idea why this is happening?
I am using Hadoop version 0.19.2, tried running it locally as well as on
EC2.
--
View this message in context:
http://www.nabble.com/last-map-task-taking-too-long-tp25673359p25673359.htmlSent from the Hadoop core-dev mailing list archive at Nabble.com.
--
Ted Dunning, CTO
DeepDyve
--
View this message in context:
http://www.nabble.com/last-map-task-taking-too-long-tp25673359p25673431.htmlSent from the Hadoop core-dev mailing list archive at Nabble.com.
--
View this message in context:
http://www.nabble.com/last-map-task-taking-too-long-tp25673359p25677160.htmlSent from the Hadoop core-dev mailing list archive at Nabble.com.