Hey guys,

I'm running into issues when doing a moderate-size EMR job on 12 m1.large
nodes. Mappers and Reducers will randomly fail.

The EMR defaults are 2 mappers / 2 reducers per node. I've tried running
with mapred.child.opts set in the jobconf to -Xmx256m and -Xmx1024m. No
difference.
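
[A minimal sketch of passing that setting on the command line, assuming the job's driver uses ToolRunner/GenericOptionsParser so -D properties reach the JobConf; the jar name and paths are placeholders, and note that the stock Hadoop 0.20 property is spelled mapred.child.java.opts:]

```shell
# Sketch (jar, class, and paths are placeholders): set the child-task JVM
# heap via a -D property, which GenericOptionsParser copies into the JobConf.
hadoop jar myjob.jar com.example.MyJob \
  -Dmapred.child.java.opts=-Xmx512m \
  s3://mybucket/input s3://mybucket/output
```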

There are about 1,000 map tasks. Not very much data, maybe 50G at most?

My job fails to complete. Looking in syslog shows this:

java.io.IOException: Cannot run program "bash": java.io.IOException:
error=12, Cannot allocate memory
    at java.lang.ProcessBuilder.start(ProcessBuilder.java:460)
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:149)
    at org.apache.hadoop.util.Shell.run(Shell.java:134)
    at org.apache.hadoop.fs.DF.getAvailable(DF.java:73)
    at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:329)
    at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:124)
    at org.apache.hadoop.mapred.MapOutputFile.getSpillFileForWrite(MapOutputFile.java:107)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1238)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1146)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:365)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:312)
    at org.apache.hadoop.mapred.Child.main(Child.java:170)
Caused by: java.io.IOException: java.io.IOException: error=12, Cannot
allocate memory
    at java.lang.UNIXProcess.<init>(UNIXProcess.java:148)
    at java.lang.ProcessImpl.start(ProcessImpl.java:65)
    at java.lang.ProcessBuilder.start(ProcessBuilder.java:453)
    ... 11 more



I would ask the EMR forums, but I think I may get faster feedback here :)


--
Bradford Stephens,
Founder, Drawn to Scale
drawntoscalehq.com
727.697.7528

http://www.drawntoscalehq.com -- The intuitive, cloud-scale data solution.
Process, store, query, search, and serve all your data.

http://www.roadtofailure.com -- The Fringes of Scalability, Social Media,
and Computer Science


  • Edward Capriolo at Sep 18, 2010 at 2:22 pm
    This happens because when the JVM forks a child process (here, "bash"),
    the child initially needs as much virtual memory as the parent JVM
    already holds. One way to solve this is to enable memory overcommit on
    your Linux systems.
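
    [A minimal sketch of that change on a node; the overcommit value shown is the common choice for this problem, not something stated in the thread:]

    ```shell
    # Sketch: let the kernel overcommit virtual memory so fork() from a
    # large JVM no longer fails with errno 12 (ENOMEM). Run as root.
    sysctl -w vm.overcommit_memory=1

    # Persist the setting across reboots:
    echo "vm.overcommit_memory = 1" >> /etc/sysctl.conf
    ```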

    On Sat, Sep 18, 2010 at 4:47 AM, Bradford Stephens wrote:
  • Bradford Stephens at Sep 18, 2010 at 8:23 pm
    That could be a problem... this is Elastic MapReduce. Can I do that? Guess
    I'll have to experiment.

    Cheers,
    B
    On Sat, Sep 18, 2010 at 7:21 AM, Edward Capriolo wrote:

  • Chris K Wensel at Sep 18, 2010 at 10:48 pm
    see if this helps matters

    --bootstrap-action s3://elasticmapreduce/bootstrap-actions/create-swap-file.rb --args "-E,/mnt/swap,1000"
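
    [For context, a full job-flow launch with that bootstrap action might look like the following; the --create/--num-instances/--jar flags are assumed from the elastic-mapreduce Ruby CLI of that era, and the jar path is a placeholder:]

    ```shell
    # Sketch (CLI flags assumed, jar path is a placeholder): launch the
    # 12-node m1.large job flow with a 1000 MB swap file on each node's /mnt.
    elastic-mapreduce --create --name "my-job" \
      --num-instances 12 --instance-type m1.large \
      --bootstrap-action s3://elasticmapreduce/bootstrap-actions/create-swap-file.rb \
      --args "-E,/mnt/swap,1000" \
      --jar s3://mybucket/myjob.jar
    ```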

    ckw
    On Sep 18, 2010, at 1:22 PM, Bradford Stephens wrote:

    --
    Chris K Wensel
    chris@concurrentinc.com
    http://www.concurrentinc.com

    -- Concurrent, Inc. offers mentoring, support, and licensing for Cascading
  • Bradford Stephens at Sep 20, 2010 at 8:05 am
    Awesome! I think that may have fixed it. Will have to see the final results
    tomorrow...
    On Sat, Sep 18, 2010 at 3:47 PM, Chris K Wensel wrote:


Discussion Overview
group: common-user
category: hadoop
posted: Sep 18, '10 at 8:48a
active: Sep 20, '10 at 8:05a
posts: 5
users: 3
website: hadoop.apache.org
irc: #hadoop
