I have been experiencing some unusual behavior from Hadoop recently.
When trying to run a job, some of the tasks fail with:

java.io.IOException: Task process exit with nonzero status of 1.
at org.apache.hadoop.mapred.TaskRunner.runChild(TaskRunner.java:462)
at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:403)


Not all the tasks fail, but enough of them do that the job fails.
Unfortunately, there are no further logs for these tasks. Trying to
retrieve the logs produces:

HTTP ERROR: 410

Failed to retrieve stdout log for task:
attempt_200811101232_0218_m_000001_0

RequestURI=/tasklog


It seems like the tasktracker isn't able to even start the tasks on
those machines. Has anyone seen anything like this before?


--------------------------------------------------------
We're looking for Amazing Software Engineers (+ interns):
http://business.rapleaf.com/careers.html

The Rapleaf Bailout Plan - Send a qualified referral (resume) and we
will award you a $10,007 bailout package if we hire that person.


  • Sagar Naik at Dec 22, 2008 at 6:38 pm
    Check the logs on disk.
    On the TaskTracker node, check the TaskTracker's own log and .out files
    ({HADOOP_HOME}/logs/*tasktracker*.log and *tasktracker*.out), then check
    for the task's logs under
    {HADOOP_HOME}/logs/userlogs/attempt_200811101232_0218_m_000001_0/[stdout,
    stderr, syslog]
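
The per-attempt paths in the reply above can be checked with a short script. This is only a minimal sketch and not part of the original thread: `task_log_paths` and `report` are hypothetical helper names, the `/usr/local/hadoop` fallback is an assumed install location, and `userlogs` is the directory name stock Hadoop uses as far as I know (pass a different `userlog_dir` if your layout differs).

```python
import os
from pathlib import Path


def task_log_paths(hadoop_home, attempt_id, userlog_dir="userlogs"):
    """Return the expected stdout, stderr and syslog paths for one task attempt.

    userlog_dir is an assumption about the on-disk layout; adjust if needed.
    """
    attempt_dir = Path(hadoop_home) / "logs" / userlog_dir / attempt_id
    return [attempt_dir / name for name in ("stdout", "stderr", "syslog")]


def report(hadoop_home, attempt_id, userlog_dir="userlogs"):
    """Describe each expected log file as present (with its size) or missing."""
    lines = []
    for path in task_log_paths(hadoop_home, attempt_id, userlog_dir):
        if path.is_file():
            lines.append(f"{path}: {path.stat().st_size} bytes")
        else:
            lines.append(f"{path}: missing")
    return lines


if __name__ == "__main__":
    # Assumed fallback location; set HADOOP_HOME to match your installation.
    home = os.environ.get("HADOOP_HOME", "/usr/local/hadoop")
    for line in report(home, "attempt_200811101232_0218_m_000001_0"):
        print(line)
```

Run it on the TaskTracker node that hosted the failed attempt. If all three files are reported missing, that would be consistent with the child process never starting, in which case the TaskTracker's own log is the next place to look.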
  • Karl Anderson at Jan 7, 2009 at 4:08 am

    On 22-Dec-08, at 10:24 AM, Nathan Marz wrote:

    I have been experiencing some unusual behavior from Hadoop recently.
    When trying to run a job, some of the tasks fail with:

    java.io.IOException: Task process exit with nonzero status of 1.
    at org.apache.hadoop.mapred.TaskRunner.runChild(TaskRunner.java:462)
    at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:403)


    Not all the tasks fail, but enough tasks fail such that the job
    fails. Unfortunately, there are no further logs for these tasks.
    Trying to retrieve the logs produces:

    HTTP ERROR: 410

    Failed to retrieve stdout log for task:
    attempt_200811101232_0218_m_000001_0

    RequestURI=/tasklog


    It seems like the tasktracker isn't able to even start the tasks on
    those machines. Has anyone seen anything like this before?

    I see this on jobs that also get the "too many open files" task
    errors, or on subsequent jobs. I've always assumed that it's another
    manifestation of the same problem. Once I start getting these errors,
    I keep getting them until I shut down the cluster, although I don't
    always get enough to cause a job to fail. I haven't bothered
    restarting individual boxes or services.
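
If the "too many open files" connection is suspected, one way to gather evidence is to compare a process's open descriptor count against its limit. The following is a hedged sketch rather than anything from the thread: `fd_limits` and `open_fd_count` are hypothetical helper names, the `/proc` lookup works only on Linux, and the limits reported are those of the process running the script, so it would need to run as the same user (and with the same ulimit settings) as the TaskTracker to be representative.

```python
import os
import resource


def fd_limits():
    """Return the (soft, hard) open-file limits for the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def open_fd_count(pid="self"):
    """Count open descriptors via /proc/<pid>/fd (Linux only).

    Returns None when the directory cannot be read, for example on a
    non-Linux system, for a nonexistent pid, or without permission.
    """
    try:
        return len(os.listdir(f"/proc/{pid}/fd"))
    except OSError:
        return None


if __name__ == "__main__":
    soft, hard = fd_limits()
    used = open_fd_count()
    print(f"soft limit={soft} hard limit={hard} open now={used}")
```

Passing the TaskTracker's pid to `open_fd_count` (with sufficient permissions) gives a count to compare against the soft limit. A count that climbs toward the limit across jobs would support the idea that both errors share a cause, though this sketch cannot confirm it.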

    I haven't been able to reproduce it consistently, but it seems to
    happen when I have many small input files; a job with one large input
    file broke after I split the input up. I'm using Streaming.

    Karl Anderson
    http://monkey.org/~kra

Discussion Overview
group: common-user @
categories: hadoop
posted: Dec 22, '08 at 6:24p
active: Jan 7, '09 at 4:08a
posts: 3
users: 3
website: hadoop.apache.org...
irc: #hadoop