keep timing out). The pattern is like this:
- tasktracker says reduce task is not responding:
2007-10-20 18:40:28,225 INFO org.apache.hadoop.mapred.TaskTracker:
task_0006_r_000000_38 0.0% reduce > copy >
2007-10-20 18:50:36,772 INFO org.apache.hadoop.mapred.TaskTracker:
task_0006_r_000000_38: Task failed to report status for 608 seconds.
Killing.
- but reduce task is chugging away:
2007-10-20 18:46:18,070 INFO org.apache.hadoop.mapred.ReduceTask:
task_0006_r_000000_38 Copying task_0006_m_000003_0 output from
hadoop037.sf2p.facebook.com.
2007-10-20 18:46:28,235 INFO org.apache.hadoop.mapred.ReduceTask:
task_0006_r_000000_38 done copying task_0006_m_000007_0 output from
hadoop021.sf2p.facebook.com.
From the timestamps, the reduce task seems to be working away happily when
the tasktracker times it out.
Is there a relevant patch I should apply? Help appreciated - this is
wreaking havoc ..
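
For anyone who wants to double-check the timeline, here is a minimal sketch (plain Python, not part of Hadoop) that only does arithmetic on the four timestamps quoted above. The variable names are made up for illustration. It confirms that the gap between the last status line and the kill is about 608 seconds, and that both copy log entries fall inside that gap:

```python
from datetime import datetime

# Timestamp layout used in the log excerpts above (comma before milliseconds).
FMT = "%Y-%m-%d %H:%M:%S,%f"

def parse(ts):
    """Parse one log timestamp into a datetime."""
    return datetime.strptime(ts, FMT)

# Values copied from the tasktracker log lines quoted above.
last_status = parse("2007-10-20 18:40:28,225")  # last "reduce > copy >" status line
killed = parse("2007-10-20 18:50:36,772")       # "failed to report status ... Killing."

# Values copied from the reduce task log lines quoted above.
copy_events = [
    parse("2007-10-20 18:46:18,070"),  # started copying a map output
    parse("2007-10-20 18:46:28,235"),  # finished copying a map output
]

# Whole seconds between the last status line and the kill.
gap_seconds = int((killed - last_status).total_seconds())

# True if every copy log entry falls strictly inside that window.
active_in_gap = all(last_status < t < killed for t in copy_events)

print(gap_seconds, active_in_gap)
```

So the reduce task was logging copy activity during the same window in which the tasktracker saw no status report from it.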
Thx,
Joydeep