Hi,
We sometimes see reducers fail just as the mappers are finishing. All the
mappers finish at roughly the same time. The only thing the reducers report
is the following exception:
java.lang.Throwable: Child Error
at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:271)
Caused by: java.io.IOException: Task process exit with nonzero status of 137.
at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:258)
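In case it helps with the diagnosis, here is a small sketch of how the 137 could be decoded. It assumes the status follows the usual Unix shell convention, where a value above 128 means the process was terminated by signal (status - 128). I have not confirmed that the TaskTracker reports the status this way, and the function name describe_exit_status is just made up for this illustration.

```python
import signal


def describe_exit_status(status: int) -> str:
    """Interpret an exit status using the common 128 + N shell convention.

    Assumption: the reported status follows that convention; this is not
    taken from the Hadoop source.
    """
    if status > 128:
        signum = status - 128
        try:
            # Look up the symbolic name of the signal on this platform.
            name = signal.Signals(signum).name
        except ValueError:
            name = "unknown signal"
        return f"killed by signal {signum} ({name})"
    return f"exited normally with code {status}"


print(describe_exit_status(137))
```

Under that assumption 137 is 128 + 9, and signal 9 is SIGKILL on Linux. A process cannot catch SIGKILL, which would at least be consistent with the reducer log ending without any error message.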
The reducer's own log output does not give any clue either. This is the
last part of the log:
2011-12-26 19:35:19,116 INFO org.apache.hadoop.io.compress.CodecPool: Got
brand-new decompressor
2011-12-26 19:35:19,117 INFO org.apache.hadoop.io.compress.CodecPool: Got
brand-new decompressor
2011-12-26 19:35:19,117 INFO org.apache.hadoop.io.compress.CodecPool: Got
brand-new decompressor
2011-12-26 19:35:19,120 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_201112261420_0006_r_000009_0 Thread started: Thread for merging on-
disk files
2011-12-26 19:35:19,120 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_201112261420_0006_r_000009_0 Thread waiting: Thread for merging on-
disk files
2011-12-26 19:35:19,121 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_201112261420_0006_r_000009_0 Thread started: Thread for merging in
memory files
2011-12-26 19:35:19,122 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_201112261420_0006_r_000009_0 Need another 50 map output(s) where 0 is
already in progress
2011-12-26 19:35:19,122 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_201112261420_0006_r_000009_0 Thread started: Thread for polling Map
Completion Events
2011-12-26 19:35:19,122 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_201112261420_0006_r_000009_0 Scheduled 0 outputs (0 slow hosts and0
dup hosts)
2011-12-26 19:35:24,124 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_201112261420_0006_r_000009_0 Scheduled 2 outputs (0 slow hosts and0
dup hosts)
2011-12-26 19:35:25,805 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_201112261420_0006_r_000009_0 Scheduled 1 outputs (0 slow hosts and0
dup hosts)
2011-12-26 19:36:21,578 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_201112261420_0006_r_000009_0 Need another 47 map output(s) where 0 is
already in progress
2011-12-26 19:36:21,593 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_201112261420_0006_r_000009_0 Scheduled 1 outputs (0 slow hosts and0
dup hosts)
2011-12-26 19:36:42,412 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_201112261420_0006_r_000009_0 Scheduled 1 outputs (0 slow hosts and0
dup hosts)
Does anyone have advice on what could be causing this?
Thanks