Hi Everyone,

Does anybody know of a way to limit the number of concurrent
map-reduce jobs?

Say I have a query that consists of 5 different map-reduce jobs. Based
on the dependency tree, the jobs are scheduled as 3-1-1. I have to
share resources on our cluster, so it would be helpful if I could tell
Cascalog to schedule only 1 or 2 at a time, so that I don't take up a
disproportionate share of the resources at any given time.

I realize that my problem is a bit more nuanced than that. I could
increase the number of mappers, or use the Hadoop fair scheduler. But
I was wondering if there was an easy way to set this up through
Cascalog.

Thanks,
Chun

  • Nathan Marz at Feb 17, 2012 at 9:14 am
    Cascalog doesn't provide anything for this. You'll need to set up a Hadoop
    scheduler instead.

    (In reply to Chun Kok's post of Thu, Feb 16, 2012 at 5:14 PM.)

    --
    Twitter: @nathanmarz
    http://nathanmarz.com
  • Paul Lam at Feb 20, 2012 at 9:49 am
    How about (with-job-conf {"mapred.job.priority" "VERY_LOW"} ...)?
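
    A minimal sketch of that suggestion, assuming Cascalog 1.x with
    cascalog.api on the classpath. The namespace, function name, and
    in-memory data are made up for illustration. Note that a low priority
    only influences how the scheduler ranks these jobs against others; it
    does not by itself cap how many of the query's jobs run at once.

    ```clojure
    (ns example.low-priority
      (:use cascalog.api))

    ;; Hypothetical in-memory generator standing in for a real tap.
    (def numbers [[1] [2] [3]])

    (defn run-low-priority-query
      "Runs a trivial query with the jobs it spawns tagged VERY_LOW."
      []
      ;; with-job-conf merges this map into the job conf used by any
      ;; query executed inside its body.
      (with-job-conf {"mapred.job.priority" "VERY_LOW"}
        (?<- (stdout) [?n]
             (numbers ?n))))
    ```

    Calling (run-low-priority-query) executes the query; any real query
    can replace the body of with-job-conf.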
  • Sam Ritchie at Feb 20, 2012 at 10:32 pm
    We've been using the fair scheduler on our jobs with great results. I
    believe EMR installs it by default, but you'll need to configure your
    project's job-conf using the settings in this article:

    http://hadoop.apache.org/common/docs/r0.20.2/fair_scheduler.html
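
    On the job side, this may come down to naming a pool. A rough sketch,
    assuming the cluster sets mapred.fairscheduler.poolnameproperty to
    pool.name (one of the setups the article describes) and that an
    administrator has defined a pool with a maxRunningJobs limit in the
    allocation file. The pool name "throttled" and the helper run-in-pool
    are hypothetical.

    ```clojure
    (ns example.fair-pool
      (:use cascalog.api))

    ;; Assumed cluster-side allocation file entry (not set from Cascalog):
    ;;   <pool name="throttled">
    ;;     <maxRunningJobs>2</maxRunningJobs>
    ;;   </pool>

    (defn run-in-pool
      "Calls query-fn (a no-arg function that executes Cascalog queries)
       with its jobs assigned to the given fair scheduler pool."
      [pool-name query-fn]
      ;; \"pool.name\" is only honored if the cluster's poolnameproperty
      ;; points at it; otherwise this setting has no effect.
      (with-job-conf {"pool.name" pool-name}
        (query-fn)))
    ```

    For example, (run-in-pool "throttled" #(?- (stdout) my-query)), where
    my-query is whatever query you already have. The concurrency limit
    itself lives in the scheduler configuration, not in Cascalog.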

    Cheers,
    Sam
    --
    Sam Ritchie, Twitter Inc
    703.662.1337
    @sritchie09

    (Too brief? Here's why! http://emailcharter.org)

Discussion Overview
group: cascalog-user @
categories: clojure, hadoop
posted: Feb 17, '12 at 1:19a
active: Feb 20, '12 at 10:32p
posts: 4
users: 4
website: clojure.org
irc: #clojure