fair share scheduler does not utilize all slots if the task trackers are configured heterogeneously
---------------------------------------------------------------------------------------------------

Key: HADOOP-4943
URL: https://issues.apache.org/jira/browse/HADOOP-4943
Project: Hadoop Core
Issue Type: Bug
Components: contrib/fair-share
Affects Versions: 0.19.0
Reporter: Zheng Shao
Assignee: Zheng Shao


There is some code in the fair share scheduler that tries to keep the load the same across the whole cluster.
That piece of code breaks if the task trackers are configured differently: we stop assigning more tasks to task trackers whose running task count is above the cluster average, even though we may still want to assign to them because other task trackers have fewer slots.

We should change the code to maintain a cluster-wide slot usage percentage (instead of an absolute number of used slots) to make sure the load is evenly distributed.
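Below is a minimal sketch of the two capping rules, using illustrative names rather than the actual FairScheduler code: the old rule caps a tracker at the cluster-average task count, while the proposed rule caps it at the cluster-wide slot-usage percentage applied to the tracker's own capacity.

public class SlotCapSketch {

  // Old rule (sketch): cap every tracker at the cluster-average *number* of
  // running tasks. With heterogeneous trackers this strands free slots on
  // the larger trackers.
  static int capByAverageCount(int totalRunningTasks, int numTrackers) {
    return (int) Math.ceil((double) totalRunningTasks / numTrackers);
  }

  // Proposed rule (sketch): cap every tracker at the cluster-wide slot-usage
  // percentage applied to its own capacity, so every slot can be used.
  static int capByPercentage(int runnableTasks, int clusterSlots, int trackerSlots) {
    double loadFactor = Math.min(1.0, (double) runnableTasks / clusterSlots);
    return (int) Math.ceil(loadFactor * trackerSlots);
  }

  public static void main(String[] args) {
    // Two trackers, one with 4 map slots and one with 2; 6 runnable maps.
    System.out.println(capByAverageCount(6, 2));  // 3: the 4-slot tracker is capped, one slot stays idle
    System.out.println(capByPercentage(6, 6, 4)); // 4: all of its slots are usable
    System.out.println(capByPercentage(6, 6, 2)); // 2: the small tracker is also full
  }
}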




  • Zheng Shao (JIRA) at Dec 24, 2008 at 10:27 pm
    [ https://issues.apache.org/jira/browse/HADOOP-4943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Zheng Shao updated HADOOP-4943:
    -------------------------------

    Attachment: HADOOP-4943-1.patch
  • Zheng Shao (JIRA) at Dec 24, 2008 at 10:31 pm
    [ https://issues.apache.org/jira/browse/HADOOP-4943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Zheng Shao updated HADOOP-4943:
    -------------------------------

    Affects Version/s: (was: 0.19.0)
    Status: Patch Available (was: Open)

    Tested on an 8-node cluster with (4 map, 2 reduce) slots on half of the cluster and (2 map, 4 reduce) slots on the other half.

    Tested with a streaming job of 24 zero-length inputs + 24 reducers (map/reduce = 'sleep 60'): we were able to schedule all of the maps at the same time, and likewise for the reducers.

    Tested with a setting of 12: we were able to spread the load uniformly.

    Tested with a setting of 48: we were able to throttle at 24 running maps (instead of going beyond the limit), and likewise for the reducers.

    The test was done on Hadoop 0.17; a sketch reproducing these numbers follows below.

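    The numbers above can be reproduced with a small, self-contained sketch of the percentage-based cap. This is illustrative code only, not the FairScheduler or its test classes, and the exact cap formula is an assumption based on the description in this issue:

    import java.util.Arrays;

    public class HeterogeneousSlotsSketch {

      // Tasks each tracker would run when capped at the cluster-wide
      // slot-usage percentage of its own map-slot capacity.
      static int[] assign(int[] slotsPerTracker, int runnableTasks) {
        int clusterSlots = Arrays.stream(slotsPerTracker).sum();
        double loadFactor = Math.min(1.0, (double) runnableTasks / clusterSlots);
        int[] running = new int[slotsPerTracker.length];
        for (int i = 0; i < slotsPerTracker.length; i++) {
          running[i] = (int) Math.ceil(loadFactor * slotsPerTracker[i]);
        }
        return running;
      }

      public static void main(String[] args) {
        // The cluster above: 4 trackers with 4 map slots and 4 with 2 (24 map slots in total).
        int[] slots = {4, 4, 4, 4, 2, 2, 2, 2};
        System.out.println(Arrays.toString(assign(slots, 24))); // [4, 4, 4, 4, 2, 2, 2, 2]
        System.out.println(Arrays.toString(assign(slots, 12))); // [2, 2, 2, 2, 1, 1, 1, 1]
        System.out.println(Arrays.stream(assign(slots, 48)).sum()); // 24, never beyond the slot limit
      }
    }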
  • dhruba borthakur (JIRA) at Dec 29, 2008 at 11:22 pm
    [ https://issues.apache.org/jira/browse/HADOOP-4943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    dhruba borthakur updated HADOOP-4943:
    -------------------------------------

    Affects Version/s: 0.19.0
    Fix Version/s: 0.19.1
  • Matei Zaharia (JIRA) at Dec 30, 2008 at 12:42 am
    [ https://issues.apache.org/jira/browse/HADOOP-4943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12659713#action_12659713 ]

    Matei Zaharia commented on HADOOP-4943:
    ---------------------------------------

    Looks good to me. This is an issue that actually affects other schedulers too, because that "max load" code was taken from the implementation of the default scheduler (unless it has been fixed there since then). Do you think it might be possible to create a unit test for this using the fake task trackers in the existing test class?
  • Matei Zaharia (JIRA) at Jan 11, 2009 at 11:51 pm
    [ https://issues.apache.org/jira/browse/HADOOP-4943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Matei Zaharia updated HADOOP-4943:
    ----------------------------------

    Attachment: hadoop-4943-2.patch

    Here's a patch that includes a unit test.
  • Zheng Shao (JIRA) at Jan 12, 2009 at 6:40 pm
    [ https://issues.apache.org/jira/browse/HADOOP-4943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12663037#action_12663037 ]

    Zheng Shao commented on HADOOP-4943:
    ------------------------------------

    +1. Looks good to me.
    Do you want to fix the default scheduler (as you mentioned above) as part of the same change?

  • Matei Zaharia (JIRA) at Jan 12, 2009 at 8:44 pm
    [ https://issues.apache.org/jira/browse/HADOOP-4943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12663088#action_12663088 ]

    Matei Zaharia commented on HADOOP-4943:
    ---------------------------------------

    I just committed this. Thanks Zheng!

    I also looked at the default and capacity schedulers, but the default scheduler already seems to have this logic as part of the patch for HADOOP-3136, and the capacity scheduler doesn't try to do this kind of load balancing when there are fewer tasks than slots, so I think that should be a separate JIRA.
  • Zheng Shao (JIRA) at Jan 13, 2009 at 6:09 am
    [ https://issues.apache.org/jira/browse/HADOOP-4943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Zheng Shao updated HADOOP-4943:
    -------------------------------

    Resolution: Fixed
    Fix Version/s: 0.20.0, 0.21.0
    Release Note: HADOOP-4943: Fixed fair share scheduler to utilize all slots when the task trackers are configured heterogeneously.
    Hadoop Flags: [Reviewed]
    Status: Resolved (was: Patch Available)
