To Mailing-List.

I am trying to implement the fair scheduler in my Hadoop cluster. Could
you please let me know the best place to find all the configuration
options for the fair scheduler, such as assigning certain jobs or certain
users to particular pools by default? I would really appreciate any help.

Kindly,
James George


  • Monroe, Mark at Apr 16, 2012 at 4:18 pm
    Here is a simple config for balancing based on usernames:

    (restart the environment after making the changes)

    mapred-site.xml:

    <property>
      <name>mapred.jobtracker.taskScheduler</name>
      <value>org.apache.hadoop.mapred.FairScheduler</value>
    </property>
    <property>
      <name>mapred.fairscheduler.allocation.file</name>
      <value>/etc/hadoop-0.20/conf/fair-scheduler.xml</value>
    </property>
    <property>
      <name>mapred.fairscheduler.assignmultiple</name>
      <value>true</value>
    </property>
    <property>
      <name>mapred.fairscheduler.sizebasedweight</name>
      <value>true</value>
    </property>
    <property>
      <name>mapred.fairscheduler.preemption</name>
      <value>true</value>
    </property>
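For context on why a config like this balances by username: as far as I understand the Hadoop 0.20/1.x fair scheduler documentation, the scheduler picks a job's pool from the jobconf property named by mapred.fairscheduler.poolnameproperty, and that setting defaults to user.name, which gives each user their own pool. The fragment below is only a sketch that spells out that assumed default; it is not part of the config posted above, and the behavior may differ on your Hadoop version.

```xml
<!-- Sketch only: makes the assumed default explicit in mapred-site.xml. -->
<property>
  <name>mapred.fairscheduler.poolnameproperty</name>
  <!-- user.name means one pool per submitting user -->
  <value>user.name</value>
</property>
```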



    fair-scheduler.xml:

    <allocations>
      <pool name="default">
        <minMaps>1</minMaps>
        <minReduces>1</minReduces>
        <maxRunningJobs>5</maxRunningJobs>
        <minSharePreemptionTimeout>300</minSharePreemptionTimeout>
        <weight>1.0</weight>
      </pool>

      <poolMaxJobsDefault>20</poolMaxJobsDefault>
      <userMaxJobsDefault>10</userMaxJobsDefault>
      <defaultMinSharePreemptionTimeout>600</defaultMinSharePreemptionTimeout>
      <fairSharePreemptionTimeout>300</fairSharePreemptionTimeout>
    </allocations>





    From: James George
    Sent: Monday, April 16, 2012 11:13 AM
    To: cdh-user@cloudera.org
    Subject: Re: Response to Recent Blog Comment

  • James George at Apr 16, 2012 at 6:53 pm
    Mark,

    Thank you so much for the prompt reply.
    But I am not sure I have the answer yet.
    For example, say there are two users, MarkM and JamesG, and two
    pools, high and low.
    How can I assign MarkM to the "high" pool and JamesG to the "low" pool
    by default?
    That is, all jobs submitted by MarkM should automatically go to the
    "high" pool.
    Please let me know. I really appreciate your help.

    Thank you,
    James.
  • Monroe, Mark at Apr 16, 2012 at 7:55 pm
    Minimums and limits are the way to set them up.
    Pools have these properties:
    Minimum map slots
    Minimum reduce slots
    Limit on the number of running jobs
    There is also a web page where you can change these on the fly:
    http://{jobtracker_server}:{port}/scheduler

    Example (for our environment):
    http://{jobtracker_server}:50030/jobtracker.jsp
    http://{jobtracker_server}:50030/scheduler
    (at this URL you can see each user and their stats, and interact with high/low...)

    Priority can also be set within the setJobPriority() method on JobClient, which
    takes one of the values VERY_HIGH, HIGH, NORMAL, LOW, VERY_LOW.
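As a rough sketch, the pool properties listed above (minimum map slots, minimum reduce slots, and a limit on running jobs) could be written in fair-scheduler.xml like this. The pool names high and low and all of the numbers are illustrative assumptions, not values from a real cluster.

```xml
<allocations>
  <!-- Hypothetical pool for more important work -->
  <pool name="high">
    <minMaps>10</minMaps>               <!-- minimum map slots -->
    <minReduces>5</minReduces>          <!-- minimum reduce slots -->
    <maxRunningJobs>10</maxRunningJobs> <!-- limit on running jobs -->
  </pool>

  <!-- Hypothetical pool for background work -->
  <pool name="low">
    <minMaps>2</minMaps>
    <minReduces>1</minReduces>
    <maxRunningJobs>3</maxRunningJobs>
  </pool>
</allocations>
```

Defining a pool here only sets its guarantees and limits; which pool a job lands in is decided separately by the scheduler's pool name property.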


    Here is a great PPT from a google.com search:

    Job Scheduling with the Fair and Capacity Schedulers
    http://www.cs.berkeley.edu/~matei/talks/2009/hadoop_summit_fair_scheduler.ppt
    Jun 10, 2009 - Fair Scheduling. Job Queue. Fair Scheduler Basics. Group jobs into "pools". Assign each pool a guaranteed minimum share. Divide excess ...



  • James George at Apr 16, 2012 at 9:24 pm
    Hello Mark,

    I did create four pools with their respective properties, like minimum map
    slots, minimum reduce slots, etc.
    I also went to the scheduler administration page and was able to change
    the pools manually.
    My question is: how can I set this up in the config file? That is, what
    parameters should I use to assign certain users to certain pools?
    Please let me know.

    Kindly,
    James George.
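One possible way to get this effect from the config file, sketched under the assumption that mapred.fairscheduler.poolnameproperty is left at its documented default of user.name: each user's jobs then go to a pool named after that user, so a pool entry in fair-scheduler.xml whose name matches the username applies to that user automatically. The usernames come from the example earlier in the thread; the weights and minimums are made-up illustrative values.

```xml
<allocations>
  <!-- Assumes pools are keyed by user.name, so this pool receives MarkM's jobs -->
  <pool name="MarkM">
    <minMaps>10</minMaps>
    <minReduces>5</minReduces>
    <weight>2.0</weight>   <!-- larger share of the cluster -->
  </pool>

  <!-- Likewise, this pool receives JamesG's jobs -->
  <pool name="JamesG">
    <minMaps>2</minMaps>
    <minReduces>1</minReduces>
    <weight>1.0</weight>
  </pool>
</allocations>
```

This does not create pools literally called high and low; it gives each user's own pool the "high" or "low" settings instead. The pool name most likely has to match the submitting username exactly.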

Discussion Overview
group: cdh-user
categories: hadoop
posted: Apr 16, '12 at 4:13p
active: Apr 16, '12 at 9:24p
posts: 5
users: 2
website: cloudera.com
irc: #hadoop

2 users in discussion

James George: 3 posts Monroe, Mark: 2 posts
