Marc,
-----Original Message-----
From: Edward Capriolo
Sent: Thursday, September 24, 2009 7:50 AM
To: [email protected]
Subject: Re: Task process exit with nonzero status of 1
On Wed, Sep 23, 2009 at 2:06 PM, Marc Limotte wrote:
I'm seeing this error when I try to run my job.
java.io.IOException: Task process exit with nonzero status of 1.
at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:418)
From what I can find with some Google searches, this means the mapred task JVM has crashed. There aren't many suggestions about what to do about it, other than increasing the max heap. I tried that, although I don't think that's the issue: it's not a particularly memory-intensive process, and I've even tried it with a very small input data set of only a few records. I still see the same issue.
I can't find anything else in the logs. I don't think my task even started, because no user logs are created at all. It seems to fail during Job Setup.
A little more background: this job was working fine for weeks, running hourly, then failed on Saturday morning and hasn't worked since. Obviously, I looked for something that changed at that point, but no one was working at that time... I can't find anything that changed. I tried the job with different input data sets; it doesn't seem to matter, unless I run it with no data at all. The job does run with no input data, but if I have even a few input records it fails, and it doesn't seem to matter which records. I suspected some corruption in HDFS, but I was able to extract the data from HDFS (hadoop dfs -get ...) and the data looks OK. I also copied this data set to our TEST cluster and ran the job there... and it WORKED!
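For a failure like this, where the child JVM dies before any user logs appear, the TaskTracker's own daemon log on the worker node is usually the next place to look. A rough sketch of where to check, assuming a default Hadoop 0.20 log layout (the `HADOOP_LOG_DIR` path here is an assumption; adjust it to your install):

```shell
#!/bin/sh
# Sketch: places to look on a worker node when a child JVM exits before
# writing any userlogs. Paths are assumed, not verified for this cluster.
HADOOP_LOG_DIR=${HADOOP_LOG_DIR:-/usr/local/hadoop/logs}

# The TaskTracker daemon log usually records the child's exit code and
# any JVM startup error, even when no per-attempt log dirs were created.
grep -i -e "exit code" -e "nonzero status" \
    "$HADOOP_LOG_DIR"/hadoop-*-tasktracker-*.log 2>/dev/null | tail -n 20

# If an attempt directory was created at all, its stderr may show why
# the JVM refused to start (bad child java opts, missing libs, etc.).
ls -t "$HADOOP_LOG_DIR/userlogs" 2>/dev/null | head -n 5
```

Since the TEST cluster runs the same job fine, comparing these logs between the two clusters may point at an environment difference rather than a job bug.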
I ran one of our other jobs and it failed as well, so it doesn't seem to be job-specific either; it looks like every job fails the same way.
A complete reboot of the cluster had no impact.
We're using Hadoop 0.20.0 and Java 1.6 update 16 on CentOS 5.2, 64-bit.
Any suggestions on what could be wrong or where to look for more information would be appreciated.
Marc Limotte
Feeva Technology
PRIVATE AND CONFIDENTIAL - NOTICE TO RECIPIENT: THIS E-MAIL IS MEANT FOR ONLY THE INTENDED RECIPIENT OF THE TRANSMISSION, AND MAY BE A COMMUNICATION PRIVILEGE BY LAW. IF YOU RECEIVED THIS E-MAIL IN ERROR, ANY REVIEW, USE, DISSEMINATION, DISTRIBUTION, OR COPYING OF THIS EMAIL IS STRICTLY PROHIBITED. PLEASE NOTIFY US IMMEDIATELY OF THE ERROR BY RETURN E-MAIL AND PLEASE DELETE THIS MESSAGE FROM YOUR SYSTEM.
Just a shot in the dark....
Did you update Java recently?
http://www.koopman.me/2009/04/hadoop-0183-could-not-create-the-java-virtual-machine/
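If a Java update on the worker nodes is the suspect (the linked post describes child JVMs failing to start after an upgrade), one quick check is whether every node reports the same Java version and `JAVA_HOME`. A rough sketch, assuming passwordless ssh to the workers and the standard `conf/slaves` file of a Hadoop 0.20 install (both assumptions):

```shell
#!/bin/sh
# Sketch: compare the Java setup across all worker nodes. Assumes
# passwordless ssh and that $HADOOP_HOME/conf/slaves lists one
# hostname per line, as in a standard Hadoop 0.20 layout.
HADOOP_HOME=${HADOOP_HOME:-/usr/local/hadoop}

while read -r host; do
  echo "== $host =="
  # Print the JVM version and the JAVA_HOME that hadoop-env.sh will see.
  ssh "$host" 'java -version 2>&1 | head -1; echo "JAVA_HOME=$JAVA_HOME"'
done < "$HADOOP_HOME/conf/slaves"
```

A node whose version differs from the rest, or whose `JAVA_HOME` points at a JVM that no longer exists after the update, would explain why every job fails the same way on this cluster but runs fine on TEST.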