FYI:
The issue was happening due to a bloated distributed cache. I was reusing
the same Configuration object between the jobs and adding the same number of jar
files to the distributed cache in every job. Once I reduced the number of cached
jar files, it ran more jobs before failing (earlier it was failing at the 3rd job).
I tried DistributedCache.purgeCache(conf) between jobs, but it did not fix the
problem; could be a bug somewhere? The workaround was to use a different
Configuration object per job.
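To illustrate the accumulation pattern (a stdlib-only sketch, not the actual Hadoop API — the `JobConfig` class and `lastJobCacheSize` method here are hypothetical stand-ins): reusing one mutable configuration means every job's "add jar to cache" call piles onto the same list, while taking a fresh configuration per job keeps each job's cache list to only its own entries.

```java
import java.util.ArrayList;
import java.util.List;

public class CacheBloat {
    // Stand-in for a mutable job configuration holding cached jar paths.
    static class JobConfig {
        final List<String> cacheFiles = new ArrayList<>();
    }

    // Runs `jobs` simulated jobs, each adding one jar to its config's cache
    // list, and returns the cache size seen by the last job.
    static int lastJobCacheSize(int jobs, boolean reuseConfig) {
        JobConfig shared = new JobConfig();
        int size = 0;
        for (int i = 0; i < jobs; i++) {
            // Bug pattern: the same shared config is reused for every job.
            JobConfig conf = reuseConfig ? shared : new JobConfig();
            conf.cacheFiles.add("job" + i + ".jar");
            size = conf.cacheFiles.size();
        }
        return size;
    }

    public static void main(String[] args) {
        System.out.println("reused config, 3 jobs: " + lastJobCacheSize(3, true));
        System.out.println("fresh config, 3 jobs: " + lastJobCacheSize(3, false));
    }
}
```

With a reused config the third job already sees three cache entries, and that per-job growth is the kind of bloat described above; a fresh config per job stays at one entry.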
Thanks,
Murali Krishna
________________________________
From: Murali Krishna. P <[email protected]>
To: [email protected]
Sent: Fri, 15 October, 2010 6:01:10 PM
Subject: Re: Hadoop starting extra map tasks and eventually failing
Thanks Amareshwari,
That explains the spurious extra tasks in the log. However, I am not getting
the userlogs for the failed setup task, because the JVM it tries to run in fails
immediately.
I get only tasktracker log entries like this:
2010-10-15 03:46:53,397 INFO org.apache.hadoop.mapred.JvmManager: In JvmRunner
constructed JVM ID: jvm_201010140533_0157_m_-1758278022
2010-10-15 03:46:53,398 INFO org.apache.hadoop.mapred.JvmManager: JVM Runner
jvm_201010140533_0157_m_-1758278022 spawned.
2010-10-15 03:46:53,918 INFO org.apache.hadoop.mapred.JvmManager: JVM :
jvm_201010140533_0157_m_-1758278022 exited. Number of tasks it ran: 0
2010-10-15 03:46:56,946 INFO org.apache.hadoop.mapred.TaskRunner:
attempt_201010140533_0157_m_000005_1 done; removing files.
2010-10-15 03:46:56,946 INFO org.apache.hadoop.mapred.TaskTracker: addFreeSlot :
current free slots : 2
2010-10-15 03:46:58,050 INFO org.apache.hadoop.mapred.TaskTracker:
LaunchTaskAction (registerTask): attempt_201010140533_0157_m_000005_2 task's
state:UNASSIGNED
It is tough to figure out what is going wrong in my setup task without
userlogs. It is a series of the same job run with different inputs. Usually the
first 2 jobs succeed and the 3rd job fails. What exactly gets run in the setup
task? I guess the split calculation etc. Since the JVM exits within a few
milliseconds according to the above log, I am not sure whether it is reaching
the application's code at all.
Thanks,
Murali Krishna
________________________________
From: Amareshwari Sri Ramadasu <[email protected]>
To: "[email protected]" <[email protected]>
Sent: Fri, 15 October, 2010 5:17:38 PM
Subject: Re: Hadoop starting extra map tasks and eventually failing
These extra tasks are job-setup and job-cleanup tasks which use map/reduce slots
to run.
Looks like the job-setup task failed for your second job even after retries, so
no maps were scheduled. But you should see tasklogs for the failed tasks.
Thanks
Amareshwari
On 10/15/10 5:11 PM, "Murali Krishna. P" wrote:
Hi,
I have attached the relevant part of the jobtracker log. Job 1 had 3 splits,
but it started 5 map tasks, m_00000 through m_00004 (I have speculative
execution turned off). The job somehow succeeds; the log files for the 4th and
5th tasks didn't get any records. However, the next job again has 3 splits, but
this time it schedules only m_00003 and m_00004, and both of them fail. There
are no userlogs created for these 2 tasks. The tasktracker log mentions that
the JVM spawned and exited immediately. And it does not schedule the first 3
map tasks, so the job fails since the 4th and 5th tasks fail even after retries.
Why are extra tasks getting scheduled?
How did those tasks pass in the first case?
Why are the right tasks not scheduled in the second job?
This is easily reproducible; please take a look at the JT log and advise.
Thanks,
Murali Krishna