Hi,

I have a problem: jobs are still running after I execute "hadoop
job -kill jobId". I rebooted the cluster, but the job still cannot be
killed.

The hadoop version is 0.20.2.

Any idea?

Thanks in advance!

--
- Juwei

  • Jeff Schmitz at Jul 5, 2011 at 2:06 pm
    Um kill -9 "pid" ?

    -----Original Message-----
    From: Juwei Shi
    Sent: Friday, July 01, 2011 10:53 AM
    To: [email protected]; [email protected]
    Subject: Jobs are still in running state after executing "hadoop job
    -kill jobId"

  • Edward Capriolo at Jul 5, 2011 at 2:50 pm

    This happens sometimes. A task gets orphaned from the TaskTracker and never
    goes away. It is a good idea to have a Nagios check for very old tasks,
    because the orphans slowly suck your memory away, especially if the task
    launches with a big Xmx. You really *should not* need to be nuking tasks
    like this, but occasionally it happens.

    Edward
  • Juwei Shi at Jul 5, 2011 at 3:45 pm
    We sometimes have hundreds of map or reduce tasks for a job. I think it is
    hard to find all of them and kill the corresponding JVM processes. If we do
    not want to restart Hadoop, are there any automatic methods?

  • Edward Capriolo at Jul 5, 2011 at 5:30 pm

    I do not think they pop up very often, but after days and months of running,
    some orphans can still be alive. The way I would handle it is to write a check
    that runs over Nagios (NRPE) and uses ps to look for Hadoop task processes that
    are older than a certain age, such as 1 day or 1 week. Then you can decide
    whether you want Nagios to terminate these orphans or do it by hand.

    Edward
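
The check Edward describes could look roughly like the sketch below. It is only an illustration, not something posted in the thread: the function names, the one-day threshold, and the assumption that task JVMs can be recognized by the class name org.apache.hadoop.mapred.Child in their command line are all mine and should be verified against your own cluster's ps output. It reads the elapsed-time column from ps and reports task processes older than the threshold, using the usual Nagios plugin return codes (0 = OK, 1 = WARNING).

```python
import subprocess

# Assumption: task JVMs show this class name in their command line.
TASK_MARKER = "org.apache.hadoop.mapred.Child"
# Hypothetical threshold: anything older than one day is suspicious.
MAX_AGE_SECONDS = 24 * 60 * 60


def parse_etime(etime):
    """Convert the ps etime format [[dd-]hh:]mm:ss to seconds."""
    days = 0
    if "-" in etime:
        day_part, etime = etime.split("-", 1)
        days = int(day_part)
    parts = [int(p) for p in etime.split(":")]
    while len(parts) < 3:  # pad missing hours with zero
        parts.insert(0, 0)
    hours, minutes, seconds = parts
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


def find_old_tasks(ps_output, marker=TASK_MARKER, max_age=MAX_AGE_SECONDS):
    """Return [(pid, age_seconds)] for matching processes older than max_age."""
    old = []
    for line in ps_output.splitlines():
        fields = line.split(None, 2)  # pid, etime, full command line
        if len(fields) < 3:
            continue
        pid, etime, args = fields
        if marker not in args:
            continue
        age = parse_etime(etime)
        if age > max_age:
            old.append((int(pid), age))
    return old


def main():
    # An empty header ("pid=") on every column suppresses the header line.
    result = subprocess.run(
        ["ps", "-e", "-o", "pid=", "-o", "etime=", "-o", "args="],
        capture_output=True, text=True, check=True,
    )
    old = find_old_tasks(result.stdout)
    if old:
        pids = ", ".join(str(pid) for pid, _ in old)
        print(f"WARNING: {len(old)} old Hadoop task process(es): {pids}")
        return 1  # Nagios: WARNING
    print("OK: no old Hadoop task processes")
    return 0  # Nagios: OK
```

A wrapper script would call sys.exit(main()) so that NRPE sees the return code. The reported PIDs can then be checked and killed by hand, or the check can be extended to terminate them if you trust the match.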

Discussion Overview
group: common-user @
category: hadoop
posted: Jul 1, '11 at 3:53p
active: Jul 5, '11 at 5:30p
posts: 5
users: 3
website: hadoop.apache.org...
irc: #hadoop
