JobConf option for minimum progress threshold before reducers are assigned
--------------------------------------------------------------------------

Key: HADOOP-5271
URL: https://issues.apache.org/jira/browse/HADOOP-5271
Project: Hadoop Core
Issue Type: Improvement
Reporter: Tim Williamson


A specific sub-case of the general priority-inversion problem noted in HADOOP-4557 occurs when many lower-priority jobs are submitted and are waiting for mappers to free up. Even though they haven't actually done any work, they will be assigned any free reducers. If a higher-priority job is then submitted, priority inversion results not just from the lower-priority tasks that are in the midst of completing, but also from the ones that haven't yet started but have claimed all the free reducers. A simple workaround is to require a job to complete some useful work before assigning it a reducer. This can be done in a tunable and backwards-compatible manner by adding a "minimum map progress percentage before assigning a reducer" option to the JobConf. Setting this to 0 would eliminate the common case above, and setting it to 100 would technically eliminate the inversion of HADOOP-4557, though likely at an unacceptably high cost.
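
To make the proposed option concrete, here is an illustrative sketch of the gate a scheduler could apply before handing a job a free reduce slot. The class and method names are hypothetical (not from any patch), and a strict comparison is assumed so that a threshold of 0 still requires nonzero map progress:

    // Illustrative sketch only: names and comparison semantics are assumptions,
    // not taken from the attached patch.
    public class ReduceAssignmentGate {
        /**
         * @param mapProgressPercent overall map progress of the job, 0-100
         * @param minPercent         configured minimum-map-progress threshold, 0-100
         * @return true if the job may be assigned a free reducer
         */
        public static boolean mayAssignReducer(double mapProgressPercent, double minPercent) {
            // Strict comparison (assumed): with minPercent == 0, a job must still
            // have made some map progress before claiming a reducer.
            return mapProgressPercent > minPercent;
        }
    }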



  • Tim Williamson (JIRA) at Feb 17, 2009 at 6:43 am
    [ https://issues.apache.org/jira/browse/HADOOP-5271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Tim Williamson updated HADOOP-5271:
    -----------------------------------

    Attachment: HADOOP-5271.patch

    Patch to add a mapred.min.reduce.assign.percent config option (as well as unit tests). The patch is made against trunk, but it could easily be ported to earlier versions by modifying the corresponding code in the JobTracker class.
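
    As a usage sketch, a job could opt in like this. Only the config key comes from the patch; whether the value is a whole-number percentage or a fraction is an assumption here:

        // Hypothetical usage sketch: only the key name is from the attached patch;
        // the value type (whole-number percentage) is an assumption.
        import org.apache.hadoop.mapred.JobConf;

        public class SubmitWithReduceThreshold {
            public static void main(String[] args) {
                JobConf job = new JobConf();
                job.setJobName("example");
                // Require 25% map progress before this job is handed any free reducers.
                job.setInt("mapred.min.reduce.assign.percent", 25);
                // ... set mapper/reducer classes and input/output paths, then submit ...
            }
        }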
  • Hemanth Yamijala (JIRA) at Feb 17, 2009 at 9:03 am
    [ https://issues.apache.org/jira/browse/HADOOP-5271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12674152#action_12674152 ]

    Hemanth Yamijala commented on HADOOP-5271:
    ------------------------------------------

    Tim, in HADOOP-3136, "mapred.reduce.slowstart.completed.maps" was introduced. The documentation reads thus:

        Fraction of the number of maps in the job which should be complete before reduces are scheduled for the job.

    Is this similar to what you are proposing? This option is not in JobConf though, so maybe the intention is to *not* let it be overridden per job.

    Arun, any thoughts?
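
    For reference, here is a minimal sketch of how that cluster-level fraction translates into a map-completion threshold. The property name is the one quoted above; the 0.05 default shown is from memory and may differ between versions:

        import org.apache.hadoop.conf.Configuration;

        public class SlowstartExample {
            public static void main(String[] args) {
                Configuration conf = new Configuration();
                // The 0.05 fallback is an assumption; the property itself is from HADOOP-3136.
                float slowstart = conf.getFloat("mapred.reduce.slowstart.completed.maps", 0.05f);
                int numMapTasks = 200;  // example job size
                int mapsBeforeReduces = (int) Math.ceil(slowstart * numMapTasks);
                System.out.println("Reduces scheduled after " + mapsBeforeReduces
                    + " of " + numMapTasks + " maps complete");
            }
        }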
  • Tim Williamson (JIRA) at Feb 17, 2009 at 7:19 pm
    [ https://issues.apache.org/jira/browse/HADOOP-5271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12674314#action_12674314 ]

    Tim Williamson commented on HADOOP-5271:
    ----------------------------------------

    Sweet: "mapred.reduce.slowstart.completed.maps" looks like essentially the same idea, and using the number of completed map tasks (as opposed to progress) is a better approach. Not sure whether there is any utility in being able to set this per job, but from my use-case perspective, this JIRA could be closed as a duplicate.
