Dear Hadoop users,



Is it possible, without using Java, to manage task assignment so as to
implement some simple rules? For example: do not launch more than one
instance of a crawling task on a machine, do not run data-intensive tasks on
remote machines, do not run computationally intensive tasks on single-core
machines, etc.



Right now this is handled by failing tasks that end up on the wrong machine,
but I hope to find a solution on the JobTracker side.



---

Dmitry


  • Devaraj Das at Sep 8, 2008 at 4:56 am
    No, that is not possible today. However, you might want to look at the
    TaskScheduler to see if you can implement a scheduler that provides this
    kind of task scheduling.
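
    A minimal sketch of what such a scheduler could look like, assuming the
    pluggable TaskScheduler interface that was going into Hadoop around this
    time (HADOOP-3412); class and method names are approximate, and the
    placement rule shown ("at most one task per node") is purely illustrative:

        // Sketch only: signatures follow the pluggable-scheduler API of this
        // era and may not match your Hadoop version exactly.
        package org.apache.hadoop.mapred;  // contrib schedulers live in this
                                           // package, since several of these
                                           // types are package-private

        import java.io.IOException;
        import java.util.Collection;
        import java.util.Collections;
        import java.util.List;

        public class OneTaskPerNodeScheduler extends TaskScheduler {

          // Called on each TaskTracker heartbeat; the returned tasks are
          // launched on that tracker.
          @Override
          public List<Task> assignTasks(TaskTrackerStatus tracker)
              throws IOException {
            // Simple placement rule: if the node is already running a map
            // task, give it nothing this heartbeat.
            if (tracker.countMapTasks() >= 1) {
              return Collections.<Task>emptyList();
            }
            // Actual task selection (walking the job queue through the
            // TaskTrackerManager, as JobQueueTaskScheduler does) is omitted
            // from this sketch.
            return Collections.<Task>emptyList();
          }

          @Override
          public Collection<JobInProgress> getJobs(String queueName) {
            return Collections.<JobInProgress>emptySet();
          }
        }

    The JobTracker would then be pointed at the class through the
    mapred.jobtracker.taskScheduler property.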

    In current Hadoop, one point regarding computationally intensive tasks:
    if a machine cannot keep up with the rest (and the task on it runs slower
    than on the others), speculative execution, if enabled, can help a lot.
    Also, implicitly, faster/better machines get more work than slower ones.
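
    For completeness, speculative execution is controlled per job; a small
    sketch using the usual configuration properties (names as in 0.18-era
    Hadoop; both default to true):

        import org.apache.hadoop.mapred.JobConf;

        public class SpeculationSettings {
          public static JobConf configure() {
            JobConf conf = new JobConf();
            // Per-job switches for speculative execution of maps and reduces.
            conf.setBoolean("mapred.map.tasks.speculative.execution", true);
            conf.setBoolean("mapred.reduce.tasks.speculative.execution", true);
            return conf;
          }
        }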

  • Alejandro Abdelnur at Sep 8, 2008 at 6:56 am
    We need something similar
    (https://issues.apache.org/jira/browse/HADOOP-3740); the problem with
    the TaskScheduler is that it does not have hooks into the lifecycle of a
    task.

    A
  • Dmitry Pushkarev at Sep 8, 2008 at 7:46 am
    How about just specifying the machines to run the task on? I haven't seen
    that anywhere.
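
    As noted above, there was no built-in per-job way to pin tasks to
    specific machines at the time; the closest fit is the custom-scheduler
    route. Purely as an illustration of what that check might look like, the
    helper below assumes the job carries a made-up, comma-separated
    "crawl.allowed.hosts" property that a custom scheduler consults before
    assigning one of its tasks to a tracker:

        import java.util.Arrays;
        import java.util.HashSet;
        import java.util.Set;

        // Illustrative helper: decides whether a TaskTracker host is allowed
        // for a job that set the hypothetical "crawl.allowed.hosts" property.
        public class HostRestriction {
          public static boolean allowed(String allowedHosts, String trackerHost) {
            if (allowedHosts == null || allowedHosts.length() == 0) {
              return true;  // no restriction requested by the job
            }
            Set<String> hosts = new HashSet<String>(
                Arrays.asList(allowedHosts.split("\\s*,\\s*")));
            return hosts.contains(trackerHost);
          }
        }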

