I'm seeing this error when a job runs:

Shuffling 35338524 bytes (35338524 raw bytes) into RAM from attempt_201001051549_0036_m_000003_0
Map output copy failure: java.lang.OutOfMemoryError: Java heap space
at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.shuffleInMemory(ReduceTask.java:1198)


I originally had mapred.child.java.opts set to 200M. If I boost this to 512M,
the error goes away.
I'm trying to understand what's going on, though. Can anyone explain?
Also, are there any other parameters I should be tweaking to help with this?
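
For reference, this is roughly what the setting looks like in hadoop-site.xml (trimmed to the one property; 512M is simply the value that made the error go away here, not necessarily the right number):

  <!-- JVM options passed to each map/reduce task (the default is -Xmx200m) -->
  <property>
    <name>mapred.child.java.opts</name>
    <value>-Xmx512m</value>
  </property>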

thank you very much,
M

  • Amogh Vasekar at Jan 8, 2010 at 10:25 am
    Hi,
    Can you please let us know your system configuration running hadoop?
    The error you see occurs while the reducer is copying its share of the map outputs into memory. The parameter mapred.job.shuffle.input.buffer.percent can be tuned for this (a rough sketch follows at the end of this message), and a number of other parameters will also help you optimize the sort later, but I would say 200M is far too little memory for the Hadoop task JVMs :)

    Amogh


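    A minimal hadoop-site.xml sketch of that shuffle setting (whether the parameter exists, and its default, depends on your Hadoop version; the value below is only illustrative):

        <!-- Fraction of the reduce task's heap used to buffer map outputs during the shuffle -->
        <property>
          <name>mapred.job.shuffle.input.buffer.percent</name>
          <value>0.70</value>
        </property>
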
  • Mayuran Yogarajah at Jan 8, 2010 at 6:59 pm

    Hi Amogh,

    We're using a 3-node cluster; all nodes are quad-core (Intel X3220) machines
    with 4 GB of RAM. They are running CentOS 5.3 and Hadoop 0.18.3.

    After looking at the source code, I (possibly mistakenly) thought that
    fs.inmemory.size.mb might have something to do with this. I had bumped it up
    to 200 (the default is 75), but the heap was left at 200M. I think when I
    configured the cluster initially I mistakenly assumed that 200M of heap was
    enough, but it wasn't.

    I was able to make the error go away by (roughly as sketched at the end of
    this message):
    1) increasing mapred.child.java.opts
    2) decreasing fs.inmemory.size.mb

    Do you know of any other parameters I should be tweaking?

    thanks,
    M
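
    Roughly what the working combination looks like in hadoop-site.xml (the values are just the ones mentioned in this thread, shown as an illustration rather than a recommendation):

        <!-- Task JVM heap. With the old -Xmx200m heap, a 200MB in-memory shuffle
             buffer could consume essentially the whole heap; 512M leaves headroom. -->
        <property>
          <name>mapred.child.java.opts</name>
          <value>-Xmx512m</value>
        </property>

        <!-- Size (in MB) of the in-memory filesystem the reducer uses to hold map
             outputs during the shuffle; 75 is the default mentioned above. -->
        <property>
          <name>fs.inmemory.size.mb</name>
          <value>75</value>
        </property>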

Discussion Overview
group: common-user
categories: hadoop
posted: Jan 7, '10 at 9:17p
active: Jan 8, '10 at 6:59p
posts: 3
users: 2
website: hadoop.apache.org...
irc: #hadoop
