Hello,

I have a question regarding MapReduce jobs.

I have 24 nodes; each node has 4 disks (mnt – mnt3), 500 GB per mount.

All of them are balanced (I ran the balancer), except mnt, which is 97% full.

My question is: I got the following error, and I suspect it is related to disk space (maybe I'm wrong).

Is there a configuration I can add or change to get a few more retries on a separate disk?

10/10/27 21:59:01 INFO mapred.JobClient: map 100% reduce 26%
10/10/27 21:59:02 INFO mapred.JobClient: Task Id : attempt_201010201240_4059_r_000023_0, Status : FAILED
java.io.IOException: Task process exit with nonzero status of 134.
        at org.apache.hadoop.mapred.TaskRunner.runChild(TaskRunner.java:462)
        at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:403)

attempt_201010201240_4059_r_000023_0: #
attempt_201010201240_4059_r_000023_0: # A fatal error has been detected by the Java Runtime Environment:
attempt_201010201240_4059_r_000023_0: #
attempt_201010201240_4059_r_000023_0: # java.lang.OutOfMemoryError: requested 32744 bytes for ChunkPool::allocate. Out of swap space?
attempt_201010201240_4059_r_000023_0: #
attempt_201010201240_4059_r_000023_0: # Internal Error (allocation.cpp:117), pid=15974, tid=1089702224
attempt_201010201240_4059_r_000023_0: # Error: ChunkPool::allocate
attempt_201010201240_4059_r_000023_0: #
attempt_201010201240_4059_r_000023_0: # JRE version: 6.0_14-b08
attempt_201010201240_4059_r_000023_0: # Java VM: Java HotSpot(TM) 64-Bit Server VM (14.0-b16 mixed mode linux-amd64 )
attempt_201010201240_4059_r_000023_0: # An error report file with more information is saved as:
attempt_201010201240_4059_r_000023_0: # /mnt2/hadoop/mapred/local/taskTracker/jobcache/job_201010201240_4059/attempt_201010201240_4059_r_000023_0/work/hs_err_pid15974.log
attempt_201010201240_4059_r_000023_0: #
attempt_201010201240_4059_r_000023_0: # If you would like to submit a bug report, please visit:
attempt_201010201240_4059_r_000023_0: # http://java.sun.com/webapps/bugreport/crash.jsp
attempt_201010201240_4059_r_000023_0: #

Regards,
Shavit
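
Exit status 134 is 128 + 6, i.e. the child JVM was killed by SIGABRT: HotSpot calls abort() after a fatal native allocation failure, so the crash points at memory rather than a full disk. Hadoop does re-run failed attempts, and the attempt count is configurable per job. Below is a minimal sketch against the old mapred API; the property names assume a 0.20-era cluster, and MyJob and the heap size are illustrative placeholders, not values from this thread:

    import org.apache.hadoop.mapred.JobConf;

    JobConf conf = new JobConf(MyJob.class);   // MyJob: hypothetical job class
    // Allow more attempts per task before the job as a whole fails (default: 4).
    conf.setInt("mapred.map.max.attempts", 8);
    conf.setInt("mapred.reduce.max.attempts", 8);
    // Cap the child JVM heap so concurrent tasks fit in physical RAM.
    conf.set("mapred.child.java.opts", "-Xmx512m");

As for "retries on a separate disk": when mapred.local.dir (a cluster-side TaskTracker setting) lists all four mounts, the TaskTracker spreads attempt directories across those paths, so a retried task typically lands on a different disk, though that is not guaranteed.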


  • Hari Sreekumar at Nov 6, 2010 at 8:59 am
    What's the RAM on each node?
    On Sat, Nov 6, 2010 at 11:03 AM, Shavit Netzer wrote:

    [snip]

    Regards,
    Shavit
  • Shavit Netzer at Nov 6, 2010 at 9:06 am
    7GB

    Sent from my mobile

    On 06/11/2010, at 11:00, "Hari Sreekumar" wrote:

    What's the RAM on each node?

    [snip]
  • Hari Sreekumar at Nov 6, 2010 at 9:16 am
    It's an out-of-memory error, so I suspect it has to do with RAM rather than
    disk space. Did you check whether the nodes are swapping (top/htop)? Is your
    reduce phase very memory-intensive? There may be a memory leak somewhere.
    What does htop say? What processes are you running on each node? And what
    does the error report file it points to say?
    On Sat, Nov 6, 2010 at 2:36 PM, Shavit Netzer wrote:

    [snip]
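
    Hari's diagnosis can be sanity-checked with quick arithmetic: if the number
    of concurrent task slots times the child JVM heap, plus the DataNode and
    TaskTracker heaps, approaches the 7 GB of physical RAM, the node starts
    swapping and HotSpot's native allocations (ChunkPool is the VM's internal
    arena allocator) fail exactly as in the crash report above. A back-of-envelope
    sketch in which every number is a hypothetical, to be replaced with the
    cluster's real settings:

        // All values are assumptions; read the real ones from the cluster config.
        int mapSlots = 8;             // mapred.tasktracker.map.tasks.maximum
        int reduceSlots = 4;          // mapred.tasktracker.reduce.tasks.maximum
        int childHeapMb = 512;        // -Xmx from mapred.child.java.opts
        int daemonHeapMb = 2000;      // DataNode + TaskTracker, ~1 GB each
        int worstCaseMb = (mapSlots + reduceSlots) * childHeapMb + daemonHeapMb;
        // (8 + 4) * 512 + 2000 = 8144 MB, already over 7 GB before counting
        // thread stacks and other native memory that live outside -Xmx.
        System.out.println("Worst-case memory footprint: " + worstCaseMb + " MB");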
  • Shavit Netzer at Nov 6, 2010 at 9:34 am
    Hi,

    I will gather the information and get back to you.

    Sent from my mobile
    On 06/11/2010, at 11:16, "Hari Sreekumar" wrote:

    [snip]
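
    The hs_err_pid log named in the crash output is the primary evidence to
    gather here: it records the failing native allocation and the process memory
    map. Since the TaskTracker normally deletes an attempt's working directory
    once the task fails, one option (a sketch against the old mapred API) is to
    keep failed task files on the next run so the crash log survives:

        import org.apache.hadoop.mapred.JobConf;

        JobConf conf = new JobConf();
        // Preserve failed attempts' work directories, including any
        // hs_err_pid*.log, instead of letting the TaskTracker clean them up.
        conf.setKeepFailedTaskFiles(true);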
