Thanks Sanel,

Assuming my driver class would always use a "custom" job ID like
"MyCustomJob" instead of "job_<YYYYMMDDHHMM>_<nnnn>" e.g. job_201006171232_0004,
which is the default, how would I then query for the jobID?

It seems it might be easier to have my driver class submit the job,
write the job ID to a lock file (hdfs://myapp/myjob.lock), and then
a. remove the lock file when the job finishes, or
b. if a new job is triggered before the first has finished, read the job ID
   from the lock file, kill the previous job, and start a new one.

Alan


----- original message --------

Subject: Re: how to query JobTracker
Sent: Thu, 17 Jun 2010
From: Sanel Zukan <sanelz@gmail.com>

AFAIK, there is no such method (to get a job name from the client side) :(
(at least I wasn't able to find one). The job name can be extracted from a
JobProfile for a given ID, but only the JobTracker can access it (if
you try to instantiate a JobTracker, you will start your own job tracker).

The only solution is to query directly by the job ID you received
when the job was started.

On Thu, Jun 17, 2010 at 2:53 PM, Some Body wrote:
Hi All,

What are the steps to query the cluster for running jobs with a particular JobName?
My driver class always submits my job with a preset name.
Job job = new Job(config, "My Job Name");
......
return job.waitForCompletion(true) ? 0 : 1;

I want to set up a cron job to trigger the job submission, and I want to ensure
that only one instance of my job is running.
I could do this via a shell wrapper, but I'd rather implement it in
my driver class,
i.e. call getAllJobs on the JobTracker, check for "My Job Name", and kill the
old job before submitting a new one.
I'm using Cloudera's Hadoop 0.20.2+228.

Thanks,
Alan
--- original message end ----


  • Sanel Zukan at Jun 17, 2010 at 4:07 pm
    JobClient can connect directly to the job tracker address (see the
    JobClient constructor with the InetSocketAddress parameter). After that,
    getAllJobs() will return the known jobs, and you will be able to find
    your job ID there.
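
The filtering step described here can be sketched in plain Java. This is only an illustration: `JobInfo` and `RunningJobFinder` are hypothetical names, not Hadoop classes. With the 0.20 `org.apache.hadoop.mapred` client, the ID and run state would presumably come from the `JobStatus` entries returned by `JobClient.getAllJobs()`, and the name from `JobClient.getJob(id).getJobName()`; that mapping is an assumption worth checking against your Hadoop version.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch: pick out running jobs by name from a job listing. */
final class RunningJobFinder {

    /** Hypothetical stand-in for one entry of the JobTracker's job list. */
    static final class JobInfo {
        final String id;
        final String name;
        final boolean running;

        JobInfo(String id, String name, boolean running) {
            this.id = id;
            this.name = name;
            this.running = running;
        }
    }

    /** Returns the IDs of all jobs that are running and carry the given name. */
    static List<String> runningIdsWithName(List<JobInfo> jobs, String name) {
        List<String> ids = new ArrayList<String>();
        for (JobInfo job : jobs) {
            if (job.running && name.equals(job.name)) {
                ids.add(job.id);
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        // Made-up sample data in the default job ID format from this thread.
        List<JobInfo> jobs = new ArrayList<JobInfo>();
        jobs.add(new JobInfo("job_201006171232_0003", "My Job Name", false));
        jobs.add(new JobInfo("job_201006171232_0004", "My Job Name", true));
        jobs.add(new JobInfo("job_201006171232_0005", "Other Job", true));
        System.out.println(runningIdsWithName(jobs, "My Job Name"));
    }
}
```

Each returned ID is a candidate to kill before submitting the new job. Completed jobs with the same name are skipped on purpose, since the job list also contains finished jobs.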

    I would go with a solution similar to the one you proposed: write a lock
    containing the job ID, and when a second job starts, fetch the currently
    running jobs, look for that ID, check whether it is still running, and
    decide what to do next.
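
A minimal sketch of the lock-file bookkeeping, under stated assumptions: it uses `java.nio.file` on a local path as a stand-in for the HDFS lock file, and `JobLock` is a hypothetical helper name. On HDFS the same three operations would presumably go through `org.apache.hadoop.fs.FileSystem` (`create`, `open`, `delete`) instead.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Sketch: remember the ID of the last submitted job in a lock file. */
final class JobLock {
    private final Path lockFile;

    JobLock(Path lockFile) {
        this.lockFile = lockFile;
    }

    /** Returns the job ID left by a previous run, or null if there is no lock. */
    String readPreviousJobId() throws IOException {
        if (!Files.exists(lockFile)) {
            return null;
        }
        String id = new String(Files.readAllBytes(lockFile), StandardCharsets.UTF_8).trim();
        return id.isEmpty() ? null : id;
    }

    /** Records the ID of the job that was just submitted, replacing any old one. */
    void record(String jobId) throws IOException {
        Files.write(lockFile, jobId.getBytes(StandardCharsets.UTF_8));
    }

    /** Removes the lock once the job has finished. */
    void release() throws IOException {
        Files.deleteIfExists(lockFile);
    }
}
```

A driver would call `readPreviousJobId()` before submitting, kill that job if it is still running, call `record(...)` with the new ID, and call `release()` after the job completes. The check and the write are not atomic, so two drivers started at nearly the same moment could both pass the check; with a cron-triggered job that window is small but not zero.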

    PS:
    I'm not sure you will be able to construct a custom job ID from the client side ;)


Discussion Overview
group: common-user @
categories: hadoop
posted: Jun 17, '10 at 3:12p
active: Jun 17, '10 at 4:07p
posts: 2
users: 2
website: hadoop.apache.org...
irc: #hadoop

2 users in discussion

Some Body: 1 post Sanel Zukan: 1 post
