Could you tell us about the system configuration you are running Hadoop on?
The error you see occurs while the reducer is copying its map outputs into memory during the shuffle. You can tune the parameter mapred.job.shuffle.input.buffer.percent for this (several others will also help you optimize the sort later), but I would say 200M is far too little memory to allocate to Hadoop task JVMs :)
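To give a rough feel for why 200M is tight, here is a small Java sketch of the arithmetic. It assumes the defaults as I remember them for this generation of Hadoop: mapred.job.shuffle.input.buffer.percent = 0.70, and a single map output being shuffled into RAM only when it is smaller than 25% of that buffer. Please check both against your version. The class and method names are made up for this illustration and are not Hadoop APIs.

```java
// Hypothetical helper, not part of Hadoop: estimates reduce-side shuffle buffer sizes.
public class ShuffleBufferEstimate {
    // Assumed default for mapred.job.shuffle.input.buffer.percent.
    public static final double DEFAULT_INPUT_BUFFER_PERCENT = 0.70;
    // Assumed fraction of the shuffle buffer that one map output may occupy in RAM.
    public static final double SINGLE_SEGMENT_FRACTION = 0.25;

    // Heap bytes the reducer may devote to holding map outputs during the shuffle.
    public static long shuffleBufferBytes(long heapBytes, double inputBufferPercent) {
        return (long) (heapBytes * inputBufferPercent);
    }

    // Largest single map output that would still be copied into RAM rather than to disk.
    public static long singleSegmentLimitBytes(long heapBytes, double inputBufferPercent) {
        return (long) (shuffleBufferBytes(heapBytes, inputBufferPercent) * SINGLE_SEGMENT_FRACTION);
    }

    // True if a map output of this size would be shuffled into memory under the assumptions above.
    public static boolean shufflesIntoRam(long segmentBytes, long heapBytes, double inputBufferPercent) {
        return segmentBytes < singleSegmentLimitBytes(heapBytes, inputBufferPercent);
    }

    public static void main(String[] args) {
        long segmentBytes = 35338524L; // size of the map output in the log line quoted below
        for (long heapMb : new long[] {200L, 512L}) {
            long heapBytes = heapMb * 1024 * 1024;
            long buffer = shuffleBufferBytes(heapBytes, DEFAULT_INPUT_BUFFER_PERCENT);
            long limit = singleSegmentLimitBytes(heapBytes, DEFAULT_INPUT_BUFFER_PERCENT);
            System.out.println("heap=" + heapMb + "M"
                + " shuffleBuffer=" + buffer
                + " singleSegmentLimit=" + limit
                + " leftForEverythingElse=" + (heapBytes - buffer)
                + " intoRam=" + shufflesIntoRam(segmentBytes, heapBytes, DEFAULT_INPUT_BUFFER_PERCENT));
        }
    }
}
```

Under these assumptions, a 200M heap lets the shuffle claim roughly 140 MB and leaves only about 60 MB for everything else in the task, and your 35 MB map output sits just under the in-memory limit, so it is copied into RAM rather than to disk. The heap the JVM actually reports is usually somewhat below the -Xmx value, so the real margins are smaller still. Raising the heap in mapred.child.java.opts or lowering mapred.job.shuffle.input.buffer.percent should both give you more headroom.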
On 1/8/10 2:46 AM, "Mayuran Yogarajah" wrote:
I'm seeing this error when a job runs:
Shuffling 35338524 bytes (35338524 raw bytes) into RAM from attempt_201001051549_0036_m_000003_0
Map output copy failure: java.lang.OutOfMemoryError: Java heap space
I originally had mapred.child.java.opts set to 200M. If I boost this up
to 512M the error goes away.
I'm trying to understand what's going on, though. Can anyone explain?
Also are there any other parameters that
I should be tweaking to help with this?
thank you very much,