I'm using Hadoop 1.0.3 on a small cluster (1 namenode, 1 jobtracker, 2
compute nodes). My input is a sequence file of around 280 MB.

Generally, my jobs run just fine and all finish in 2-5 minutes. However,
quite randomly the jobs refuse to run. They submit and show up in
'hadoop job -list' but don't appear on the jobtracker's webpage. If I
manually type the job ID into the webpage, I can see it is trying to run
the setup task; the map tasks haven't even started. I've left such jobs
running, and even after several minutes they are still in this state.

When I spot this, I kill the job and resubmit it and generally it works.

A couple of times I have seen similar problems with reduce tasks that get
stuck while 'initializing'.

Any ideas?

  • Bejoy KS at Jul 13, 2012 at 4:39 am
    Hi Robert

    It could be because there are no free slots available in your cluster during job submission time to launch those tasks. Some other tasks may have already occupied the map/reduce slots.

    When you experience this random issue, please verify whether there are free task slots available.

    Regards
    Bejoy KS

    Sent from handheld, please excuse typos.
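
A rough way to do the slot check suggested above is to compare cluster capacity with what is currently occupied. This is only a sketch: `free_slots` is a hypothetical helper, the occupied count has to be read off the jobtracker's web page by hand, and the figure of 2 map slots per TaskTracker assumes the Hadoop 1.x default for `mapred.tasktracker.map.tasks.maximum` has not been changed.

```python
def free_slots(trackers, slots_per_tracker, occupied):
    """Slots of one kind (map or reduce) still free across the cluster."""
    capacity = trackers * slots_per_tracker
    if not 0 <= occupied <= capacity:
        raise ValueError("occupied must be between 0 and %d" % capacity)
    return capacity - occupied


# Two compute nodes, assumed default of 2 map slots each, nothing running.
print(free_slots(trackers=2, slots_per_tracker=2, occupied=0))  # 4
```

If this comes out as zero at the moment a job stalls, slot starvation is a plausible cause; if slots are free, the delay is coming from somewhere else.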

    -----Original Message-----
    From: Robert Dyer <psybers@gmail.com>
    Date: Thu, 12 Jul 2012 23:03:02
    To: <mapreduce-user@hadoop.apache.org>
    Reply-To: mapreduce-user@hadoop.apache.org
    Subject: Jobs randomly not starting

  • Harsh J at Jul 13, 2012 at 6:04 am
    Hey Robert,

    Any chance you can pastebin the JT logs, grepped for the bad job ID,
    and send the link across? They shouldn't hang the way you describe.

    --
    Harsh J
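
One possible way to pull the relevant lines out of the JT log, as requested above, is sketched below. It assumes the usual Hadoop 1.x ID layout, where task and attempt IDs (`task_...`, `attempt_...`) reuse the numeric part of the job ID, so a plain search for `job_...` alone would miss them. The names `job_key` and `lines_for_job` are hypothetical, and the log path is whatever your installation uses.

```python
import re
import sys


def job_key(job_id):
    """Return the '<jobtracker start>_<sequence>' part of a job ID."""
    match = re.fullmatch(r"job_(\d+_\d+)", job_id)
    if match is None:
        raise ValueError("not a job ID: %r" % job_id)
    return match.group(1)


def lines_for_job(lines, job_id):
    """Keep lines mentioning the job or any of its tasks or attempts."""
    key = re.escape(job_key(job_id))
    # (?!\d) stops job ..._0001 from also matching ..._00012.
    pattern = re.compile(r"(?:job|task|attempt)_" + key + r"(?!\d)")
    return [line for line in lines if pattern.search(line)]


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python jobgrep.py <path to jobtracker log> <job ID>
    with open(sys.argv[1]) as log:
        sys.stdout.writelines(lines_for_job(log, sys.argv[2]))
```

Redirecting the output to a file gives something small enough to paste.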
  • Robert Dyer at Jul 17, 2012 at 8:28 pm
    Upon further inspection of that log, it appears the problem is that the
    setup task just takes a very long time.

    Typically it takes at most 6 seconds, but sometimes (the cases where I
    thought it was hanging) it actually runs and finishes but takes 3-5 minutes.

    Same problem with the cleanup (which is where I thought the reduce was
    getting stuck).

    I am currently the only user on this cluster and I never have more than 1
    job in the queue at a time.

    Ideas?
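
To see where those 3-5 minutes go, it may help to look for the largest gaps between consecutive log lines for the slow job. The sketch below assumes each log line starts with a log4j ISO8601 timestamp such as `2012-07-12 23:03:02,000`; lines without one (stack traces, for example) are skipped. `parse_ts` and `largest_gaps` are hypothetical names, not part of Hadoop.

```python
from datetime import datetime

TS_FORMAT = "%Y-%m-%d %H:%M:%S,%f"
TS_LENGTH = len("2012-07-12 23:03:02,000")


def parse_ts(line):
    """Parse the leading timestamp of a log line, or return None."""
    try:
        return datetime.strptime(line[:TS_LENGTH], TS_FORMAT)
    except ValueError:
        return None


def largest_gaps(lines, top=3):
    """Return up to `top` tuples (seconds, line_before, line_after),
    largest gap first."""
    stamped = []
    for line in lines:
        ts = parse_ts(line)
        if ts is not None:
            stamped.append((ts, line.rstrip("\n")))
    gaps = [
        ((t1 - t0).total_seconds(), before, after)
        for (t0, before), (t1, after) in zip(stamped, stamped[1:])
    ]
    gaps.sort(key=lambda gap: gap[0], reverse=True)
    return gaps[:top]
```

Feeding it only the lines for one job ID keeps unrelated activity from hiding the gap. The two lines on either side of the biggest gap show which step the setup or cleanup task was waiting on.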
