Job killed when backup tasks fail
---------------------------------

Key: HADOOP-39
URL: http://issues.apache.org/jira/browse/HADOOP-39
Project: Hadoop
Type: Bug
Components: mapred
Reporter: Owen O'Malley


I had a map job with side effects that meant that any speculative tasks would fail.

Currently, the job tracker kills the job when the speculative task fails 4 times.

It would be better to stop scheduling speculative tasks for that fragment, but let the job continue as long as one of the instances of that fragment continues to run.
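
The proposed behavior can be illustrated with a small, self-contained sketch. This is not Hadoop code: `Fragment`, `on_attempt_failed`, and the returned strings are hypothetical names invented for illustration, and the only value taken from the report is the limit of 4 failures. The idea shown is that a failed speculative attempt turns off further speculation for its fragment, and the job is only killed when no instance of the fragment is left running and the failure limit has been reached.

```python
from dataclasses import dataclass

MAX_ATTEMPT_FAILURES = 4  # the failure limit quoted in the report


@dataclass
class Fragment:
    """Hypothetical bookkeeping for one input fragment (not a Hadoop class)."""
    name: str
    running: int = 1               # instances of this fragment currently running
    failures: int = 0              # failed attempts so far
    allow_speculation: bool = True


def on_attempt_failed(frag: Fragment, was_speculative: bool) -> str:
    """Return 'continue' or 'kill-job' under the proposed policy."""
    frag.running -= 1
    frag.failures += 1
    if was_speculative:
        # Stop scheduling speculative attempts for this fragment.
        frag.allow_speculation = False
    if frag.running > 0:
        # Another instance is still running, so the job keeps going.
        return "continue"
    if frag.failures >= MAX_ATTEMPT_FAILURES:
        return "kill-job"
    # Nothing is running but the limit is not reached; the caller
    # would schedule a fresh (non-speculative) attempt here.
    return "continue"
```

For example, with an original attempt and one speculative attempt running, a failure of the speculative attempt leaves the job alive and disables further speculation for that fragment.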

--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
http://www.atlassian.com/software/jira


  • Doug Cutting (JIRA) at Feb 16, 2006 at 8:35 pm
    [ http://issues.apache.org/jira/browse/HADOOP-39?page=comments#action_12366674 ]

    Doug Cutting commented on HADOOP-39:
    ------------------------------------

    The point is to try to get map tasks with side effects to sometimes succeed, even with speculative execution? That sounds like it could be a bad idea. Wouldn't it be better to have map tasks with side effects fail more frequently with speculative execution, so that you find such problems sooner, with smaller datasets on a smaller cluster, before you try a big run? Or am I misunderstanding you?
  • Owen O'Malley (JIRA) at Jul 27, 2006 at 3:20 am
    [ http://issues.apache.org/jira/browse/HADOOP-39?page=comments#action_12423760 ]

    Owen O'Malley commented on HADOOP-39:
    -------------------------------------

    My goal with this would be to do the equivalent of "make -k" or a "best effort" job. If the option were set, the job would continue after a given TIP had failed 4 times, but that TIP would be abandoned.
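
A minimal sketch of the "best effort" idea described in this comment, under stated assumptions: `run_job`, the `best_effort` flag, and the task dictionary are hypothetical and stand in for the job tracker's real scheduling, which is not shown in this thread. Only the limit of 4 attempts comes from the discussion. With the flag off, a task that fails 4 times kills the job; with it on, the task is abandoned and the remaining tasks still run, much like `make -k`.

```python
MAX_ATTEMPTS = 4  # attempts per task before it is given up on


def run_job(tasks, best_effort=False):
    """Run each task up to MAX_ATTEMPTS times.

    tasks: dict mapping a task name to a callable that raises on failure.
    Returns (completed, abandoned) lists of task names.
    Raises RuntimeError if a task exhausts its attempts and best_effort is off.
    """
    completed, abandoned = [], []
    for name, task in tasks.items():
        for _ in range(MAX_ATTEMPTS):
            try:
                task()
            except Exception:
                continue  # this attempt failed; try again
            completed.append(name)
            break
        else:
            # The loop ended without a successful attempt.
            if not best_effort:
                raise RuntimeError(f"job killed: {name} failed {MAX_ATTEMPTS} times")
            abandoned.append(name)  # give up on this task, keep the job alive
    return completed, abandoned
```

The trade-off is the one raised earlier in the thread: a best-effort job can finish with some input left unprocessed, so the caller needs to inspect the abandoned list rather than treat completion as success.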
  • Owen O'Malley (JIRA) at Apr 20, 2007 at 11:24 pm
    [ https://issues.apache.org/jira/browse/HADOOP-39?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Owen O'Malley updated HADOOP-39:
    --------------------------------

    Description: I propose having a job option that when a tip fails 4 times, stops trying to run that tip, but does not kill the job. (was: I had a map job with side effects that meant that any speculative tasks would fail.

    Currently, the job tracker kills the job when the speculative task fails 4 times.

    It would be better to stop scheduling speculative tasks for that fragment, but let the job continue as long as one of the instances of that fragment continues to run.)
    Summary: Create a job-configurable best effort for job execution (was: Job killed when backup tasks fail)
    Create a job-configurable best effort for job execution
    -------------------------------------------------------

    Key: HADOOP-39
    URL: https://issues.apache.org/jira/browse/HADOOP-39
    Project: Hadoop
    Issue Type: Bug
    Components: mapred
    Reporter: Owen O'Malley
    Assigned To: Owen O'Malley

    I propose having a job option that when a tip fails 4 times, stops trying to run that tip, but does not kill the job.
    --
    This message is automatically generated by JIRA.
    -
    You can reply to this email to add a comment to the issue online.
  • Owen O'Malley (JIRA) at Apr 30, 2007 at 6:45 am
    [ https://issues.apache.org/jira/browse/HADOOP-39?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Owen O'Malley reassigned HADOOP-39:
    -----------------------------------

    Assignee: Arun C Murthy (was: Owen O'Malley)
  • Arun C Murthy (JIRA) at May 8, 2007 at 4:22 am
    [ https://issues.apache.org/jira/browse/HADOOP-39?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Arun C Murthy resolved HADOOP-39.
    ---------------------------------

    Resolution: Duplicate

    Fixed as a part of HADOOP-1144

Discussion Overview
group: common-dev @
categories: hadoop
posted: Feb 16, '06 at 1:02a
active: May 8, '07 at 4:22a
posts: 6
users: 1
website: hadoop.apache.org...
irc: #hadoop
