Hi,

I'm having issues on CDH4.1 with a long-running job that has lots of map
tasks. The issue seems to be that the TaskTracker kills JVMs that are still
being used by other tasks. Here is the relevant output from the TaskTracker
logs (with comments inline):

https://gist.github.com/plaflamme/5030187

Why is the task tracker killing a JVM that is still being used by a task?
Our JVM reuse is set to 0:

<property>
  <name>mapred.job.reuse.jvm.num.tasks</name>
  <value>0</value>
</property>

But it should still wait for task completion before destroying the JVM
(hopefully!). Does anyone have an idea why this is happening?

Thanks,
Philippe


  • Philippe Laflamme at Feb 25, 2013 at 5:36 pm
    Some more information about the issue.

    The tasks are long-running (tens of minutes). When I remove the heavy
    processing from the mapper, the job completes successfully (tasks take
    between 30s and 1m to complete). When I add a Thread.sleep() with a random
    value of 0-1s to mimic the heavy processing, the tasks start taking tens of
    minutes again, the JVM kill issue comes back, and the job fails.

    So the problem seems to be related to how long tasks take to complete.
    Maybe the TaskTracker considers the JVM stale several minutes after it is
    spawned? Does anyone have an idea what might be causing this? I'm using the
    FairScheduler.
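
    For context, this is roughly how the FairScheduler is enabled in
    mapred-site.xml on MR1; the allocation file path below is only a
    placeholder, not our actual value:

    <!-- Enable the fair scheduler on the JobTracker (MR1). -->
    <property>
      <name>mapred.jobtracker.taskScheduler</name>
      <value>org.apache.hadoop.mapred.FairScheduler</value>
    </property>
    <!-- Pool definitions live in an external allocation file (placeholder path). -->
    <property>
      <name>mapred.fairscheduler.allocation.file</name>
      <value>/etc/hadoop/conf/fair-scheduler.xml</value>
    </property>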

    Thanks,
    Philippe

  • Alexey Babutin at Feb 27, 2013 at 10:49 am
    Hi, I'd like to join this thread: I have the same problem, but only with
    OpenJDK 7 (with JDK 6 everything is fine). I thought the problem was in
    Jetty, so I switched from Jetty 6 to Jetty 7, but that didn't help.

  • Philippe Laflamme at Feb 27, 2013 at 2:48 pm
    So it looks like our issue is caused by preemption. We've activated that
    feature and it does kill certain tasks to "make room" for other ones.

    The issue, though, is that normally preemption kills a task in a way that
    doesn't count as a failure. It looks as though the first time around, the
    TaskTracker kills the task as expected, but subsequent kills are not done
    the same way and count as task failures. There's probably a bug somewhere
    in the preemption code that, combined with our settings, causes this
    problem... I've reduced the number of mappers for this job so that there is
    always room on a node for tasks from other jobs.
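
    In case it helps anyone else: preemption is controlled by the
    mapred.fairscheduler.preemption flag in mapred-site.xml and by the
    preemption timeouts in the fair scheduler allocation file. Below is only a
    sketch of what that configuration looks like; the pool name and timeout
    values are made up, not our actual settings:

    <!-- mapred-site.xml: preemption is off by default. -->
    <property>
      <name>mapred.fairscheduler.preemption</name>
      <value>true</value>
    </property>

    <!-- fair-scheduler.xml (allocation file); pool name and timeouts are placeholders. -->
    <allocations>
      <pool name="etl">
        <minMaps>20</minMaps>
        <!-- Seconds a pool may sit below its minimum share before it preempts tasks from other jobs. -->
        <minSharePreemptionTimeout>300</minSharePreemptionTimeout>
      </pool>
      <!-- Seconds a job may sit below half its fair share before preemption kicks in. -->
      <fairSharePreemptionTimeout>600</fairSharePreemptionTimeout>
    </allocations>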

    If you haven't activated preemption and the fair scheduler, your problem is
    different.

    Cheers,
    Philippe



Discussion Overview
group: cdh-user
categories: hadoop
posted: Feb 25, 2013 at 2:47 pm
active: Feb 27, 2013 at 2:48 pm
posts: 4
users: 2
website: cloudera.com
irc: #hadoop
