FAQ
For one long-running job we are noticing that the mapper JVMs do not exit
even after the mapper is done. Any suggestions on why this could be
happening?
The java processes get cleaned up if I do a hadoop job -kill <job_id>. They
also get cleaned up if I run the job on a smaller batch and it finishes
fairly quickly (say, in half an hour). For larger inputs the nodes
eventually run out of memory because of these java processes, which the
cluster thinks are gone but which haven't been cleaned up yet. I
suspect the TaskTrackers are failing to kill the JVMs on their own for some
reason.
The following exceptions can be seen in the hadoop logs.

2011-05-12 13:52:04,147 WARN org.apache.hadoop.mapreduce.util.ProcessTree:
Error executing shell command
org.apache.hadoop.util.Shell$ExitCodeException: kill -12545: No such process
2011-05-12 13:52:08,071 WARN org.apache.hadoop.mapreduce.util.ProcessTree:
Error executing shell command
org.apache.hadoop.util.Shell$ExitCodeException: kill -11061: No such process
2011-05-12 13:52:09,009 WARN org.apache.hadoop.mapreduce.util.ProcessTree:
Error executing shell command
org.apache.hadoop.util.Shell$ExitCodeException: kill -11151: No such process
2011-05-12 13:52:12,009 WARN org.apache.hadoop.mapreduce.util.ProcessTree:
Error executing shell command
org.apache.hadoop.util.Shell$ExitCodeException: kill -25057: No such process
2011-05-12 13:52:13,306 WARN org.apache.hadoop.mapreduce.util.ProcessTree:
Error executing shell command
org.apache.hadoop.util.Shell$ExitCodeException: kill -19805: No such process
2011-05-12 13:52:14,996 WARN org.apache.hadoop.mapreduce.util.ProcessTree:
Error executing shell command
org.apache.hadoop.util.Shell$ExitCodeException: kill -11103: No such process

2011-05-12 15:51:41,105 WARN org.apache.hadoop.mapreduce.util.ProcessTree:
Error executing shell command
org.apache.hadoop.util.Shell$ExitCodeException: kill -17202: No such process
2011-05-12 15:51:43,481 WARN org.apache.hadoop.mapreduce.util.ProcessTree:
Error executing shell command
org.apache.hadoop.util.Shell$ExitCodeException: kill -15981: No such process
2011-05-12 15:51:45,916 WARN org.apache.hadoop.mapreduce.util.ProcessTree:
Error executing shell command
org.apache.hadoop.util.Shell$ExitCodeException: kill -17931: No such process
2011-05-12 15:52:06,328 WARN org.apache.hadoop.mapreduce.util.ProcessTree:
Error executing shell command
org.apache.hadoop.util.Shell$ExitCodeException: kill -14867: No such process
2011-05-12 15:52:34,503 WARN org.apache.hadoop.mapreduce.util.ProcessTree:
Error executing shell command
org.apache.hadoop.util.Shell$ExitCodeException: kill -29376: No such process
2011-05-12 15:52:38,607 WARN org.apache.hadoop.mapreduce.util.ProcessTree:
Error executing shell command
org.apache.hadoop.util.Shell$ExitCodeException: kill -32491: No such process
2011-05-12 15:52:39,292 WARN org.apache.hadoop.mapreduce.util.ProcessTree:
Error executing shell command
org.apache.hadoop.util.Shell$ExitCodeException: kill -31529: No such process
2011-05-12 15:52:46,547 WARN org.apache.hadoop.mapreduce.util.ProcessTree:
Error executing shell command
org.apache.hadoop.util.Shell$ExitCodeException: kill -15140: No such process

Some other exceptions also appear in the logs; they may or may not be
related to the above problem.
2011-05-12 16:01:20,534 INFO org.apache.hadoop.ipc.Server: IPC Server
handler 6 on 33465 caught: java.nio.channels.ClosedChannelException
2011-05-12 16:01:48,869 INFO org.apache.hadoop.ipc.Server: IPC Server
handler 80 on 33465 caught: java.nio.channels.ClosedChannelException
2011-05-12 16:01:53,922 INFO org.apache.hadoop.ipc.Server: IPC Server
handler 59 on 33465 caught: java.nio.channels.ClosedChannelException
2011-05-12 16:01:58,977 INFO org.apache.hadoop.ipc.Server: IPC Server
handler 28 on 33465 caught: java.nio.channels.ClosedChannelException
2011-05-12 16:02:04,040 INFO org.apache.hadoop.ipc.Server: IPC Server
handler 37 on 33465 caught: java.nio.channels.ClosedChannelException
2011-05-12 16:02:09,095 INFO org.apache.hadoop.ipc.Server: IPC Server
handler 100 on 33465 caught: java.nio.channels.ClosedChannelException

Thanks.

-Adi

  • Joey Echeverria at May 12, 2011 at 11:03 pm
    Which version of hadoop are you running?

    Are you running on linux?

    -Joey

    --
    Joseph Echeverria
    Cloudera, Inc.
    443.305.9434
  • Adi at May 13, 2011 at 12:33 am
    > Which version of hadoop are you running?
    Hadoop 0.21.0 with some patches.

    > Are you running on linux?
    Yes
    Linux 2.6.18-238.9.1.el5 #1 SMP x86_64 x86_64 x86_64 GNU/Linux
    java version "1.6.0_21"
    Java(TM) SE Runtime Environment (build 1.6.0_21-b06)
    Java HotSpot(TM) 64-Bit Server VM (build 17.0-b16, mixed mode)

    I set up 0.21.0 on another Linux box and am not seeing this issue there, as
    Hadoop is reusing JVMs (as configured).
    On the production cluster it is not reusing JVMs, and the nodes run out of
    memory because mapper JVMs stay alive even after they have ended according
    to Hadoop.

    The production node is a 64-bit OS/JVM. Is there any known issue with, or
    workaround for, enabling JVM reuse in 64-bit environments?

    Test node is 32 bit:
    Linux 2.6.18-194.32.1.el5.centos.plus #1 SMP i686 i686 i386 GNU/Linux
    java version "1.6.0_17"
    OpenJDK Runtime Environment (IcedTea6 1.7.5) (rhel-1.16.b17.el5-i386)
    OpenJDK Server VM (build 14.0-b16, mixed mode)

    Even just getting it to reuse JVMs would be great.

    -Adi
  • Joey Echeverria at May 13, 2011 at 1:55 am
    > Hadoop 0.21.0 with some patches.
    Hadoop 0.21.0 doesn't get much use, so I'm not sure how much help I can be.

    > 2011-05-12 13:52:04,147 WARN org.apache.hadoop.mapreduce.util.ProcessTree:
    > Error executing shell command
    > org.apache.hadoop.util.Shell$ExitCodeException: kill -12545: No such process
    Your logs showed that Hadoop tried to kill processes but the kill
    command claimed they didn't exist. The next time you see this problem,
    can you check the logs and see if any of the PIDs that appear in the
    logs are in fact still running?

    A more likely scenario is that Hadoop's tracking of child VMs is
    getting out of sync, but I'm not sure what would cause that.

    -Joey
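
A minimal sketch of the check Joey suggests, in Python rather than anything Hadoop ships: pull the numbers out of the "kill -N: No such process" warnings and probe each one on the node. The function names (pids_from_log, is_running, report) are made up for illustration, and the sketch assumes the number after "kill -" is a PID; it may instead be a process group ID, depending on how the TaskTracker issues the kill.

```python
import os
import re

# Matches the number in lines such as
# "org.apache.hadoop.util.Shell$ExitCodeException: kill -12545: No such process".
# Whitespace is matched loosely because mail clients wrap these lines.
KILL_FAILURE = re.compile(r"kill -(\d+):\s+No\s+such\s+process")


def pids_from_log(log_text):
    """Return the distinct PIDs named in failed kill messages, in order of appearance."""
    seen = []
    for match in KILL_FAILURE.finditer(log_text):
        pid = int(match.group(1))
        if pid not in seen:
            seen.append(pid)
    return seen


def is_running(pid):
    """Probe a PID with signal 0, which checks existence without delivering a signal."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        # The process exists but belongs to another user.
        return True
    return True


def report(log_text):
    """Map each PID found in the log to whether it is running on this host right now."""
    return {pid: is_running(pid) for pid in pids_from_log(log_text)}
```

Feed report() the text of the TaskTracker log on the affected node. An entry that comes back True is a PID the kill command reported as missing but that exists now; since PIDs get reused, confirm with ps that it is actually a task JVM before drawing conclusions.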

  • Adi at May 13, 2011 at 2:12 am

    Yes, those java processes are in fact running. The error messages do not
    always show up, only sometimes, but the processes never get cleaned up.

    -Adi
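
To see which task JVMs are still alive on a node, one option is to filter ps output for the task main class. This is a rough sketch, assuming the child JVMs are started with org.apache.hadoop.mapred.Child on the command line, which should be verified on the cluster in question; the helper names are hypothetical. Treating a parent PID of 1 as a sign that a JVM has been detached from whatever launched it is also an assumption, not documented Hadoop behaviour.

```python
import subprocess

# Assumption: task JVMs carry this main class name on their command line.
CHILD_MAIN_CLASS = "org.apache.hadoop.mapred.Child"


def parse_ps(ps_output):
    """Parse `ps -eo pid,ppid,args` output into (pid, ppid, args) tuples."""
    rows = []
    for line in ps_output.splitlines()[1:]:  # skip the header row
        parts = line.split(None, 2)
        if len(parts) == 3 and parts[0].isdigit() and parts[1].isdigit():
            rows.append((int(parts[0]), int(parts[1]), parts[2]))
    return rows


def task_jvms(rows):
    """Keep only the rows whose command line mentions the task main class."""
    return [row for row in rows if CHILD_MAIN_CLASS in row[2]]


def orphaned(rows):
    """Task JVMs whose parent is PID 1, i.e. whose original parent has exited."""
    return [row for row in task_jvms(rows) if row[1] == 1]


def snapshot():
    """Run ps on the local node and return the task JVMs it finds."""
    out = subprocess.run(
        ["ps", "-eo", "pid,ppid,args"],
        capture_output=True, text=True, check=True,
    ).stdout
    return task_jvms(parse_ps(out))
```

Comparing snapshot() with the tasks the TaskTracker believes are running shows which JVMs it has lost track of.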
  • Highpointe at May 13, 2011 at 2:27 am
    Is there a reason for using OpenJDK and not Sun's JDK?

    Also, I believe there were known issues with the .17 JDK. I will look for a link and post it if I can find one.

    Otherwise, I have seen this behaviour before: Hadoop detaches from the JVM and stops seeing it.

    I think your problem lies in the JDK and not Hadoop.

  • Adi at May 13, 2011 at 2:06 pm
    > Is there a reason for using OpenJDK and not Sun's JDK?
    The cluster where we are seeing the problem uses Sun's JDK: java version
    "1.6.0_21", Java(TM) SE Runtime Environment (build 1.6.0_21-b06), Java
    HotSpot(TM) 64-Bit Server VM (build 17.0-b16, mixed mode).

    The standalone node where I tried to reproduce the issue uses OpenJDK, and
    it does not show the problem because it is able to reuse JVMs.

    -Adi

  • Highpointe at May 13, 2011 at 2:17 pm
    You posted system specifics earlier; would you mind posting them again? I can't find them in the thread.

Discussion Overview
group: common-user @
categories: hadoop
posted: May 12, '11 at 8:40p
active: May 13, '11 at 2:17p
posts: 8
users: 3
website: hadoop.apache.org...
irc: #hadoop
