Hi,
We are using Hadoop 0.19 on a small cluster of 7 machines (7 datanodes, 4
task trackers), and we typically have 3-4 jobs running at a time. We have
been hitting the following error on the JobTracker:
java.io.IOException: java.lang.OutOfMemoryError: GC overhead limit exceeded
It appears to be thrown from RunningJob.killJob(), and it usually shows up
a day or so after the cluster is started.
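For context, we issue the kill from the client side roughly like this (the
class name and the job-id argument handling below are only illustrative, not
our exact code):

import java.io.IOException;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.RunningJob;

public class KillJobTool {
    public static void main(String[] args) throws IOException {
        // Connect to the JobTracker named in the cluster configuration.
        JobClient client = new JobClient(new JobConf());

        // Look up the job by its id (e.g. "job_200901010000_0042") and kill
        // it if it is still running; killJob() is where we see the
        // IOException wrapping the OutOfMemoryError.
        RunningJob job = client.getJob(args[0]);
        if (job != null && !job.isComplete()) {
            job.killJob();
        }
    }
}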
In the JobTracker's output file:
Exception in thread "initJobs" java.lang.OutOfMemoryError: GC overhead limit exceeded
    at java.lang.String.substring(String.java:1939)
    at java.lang.String.substring(String.java:1904)
    at org.apache.hadoop.fs.Path.getName(Path.java:188)
    at org.apache.hadoop.fs.ChecksumFileSystem.isChecksumFile(ChecksumFileSystem.java:70)
    at org.apache.hadoop.fs.ChecksumFileSystem$1.accept(ChecksumFileSystem.java:442)
    at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:726)
    at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:748)
    at org.apache.hadoop.fs.ChecksumFileSystem.listStatus(ChecksumFileSystem.java:457)
    at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:723)
    at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:748)
    at org.apache.hadoop.mapred.JobHistory$JobInfo.getJobHistoryFileName(JobHistory.java:660)
    at org.apache.hadoop.mapred.JobHistory$JobInfo.finalizeRecovery(JobHistory.java:746)
    at org.apache.hadoop.mapred.JobTracker.finalizeJob(JobTracker.java:1532)
    at org.apache.hadoop.mapred.JobInProgress.garbageCollect(JobInProgress.java:2232)
    at org.apache.hadoop.mapred.JobInProgress.terminateJob(JobInProgress.java:1938)
    at org.apache.hadoop.mapred.JobInProgress.terminate(JobInProgress.java:1953)
    at org.apache.hadoop.mapred.JobInProgress.fail(JobInProgress.java:2012)
    at org.apache.hadoop.mapred.EagerTaskInitializationListener$JobInitThread.run(EagerTaskInitializationListener.java:62)
    at java.lang.Thread.run(Thread.java:619)
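Is the right fix simply to give the JobTracker daemon a bigger heap in
conf/hadoop-env.sh, something along these lines (values shown only for
illustration, not what we currently run with), or is there something else we
should be looking at?

# conf/hadoop-env.sh
export HADOOP_HEAPSIZE=2000
export HADOOP_JOBTRACKER_OPTS="-XX:+HeapDumpOnOutOfMemoryError $HADOOP_JOBTRACKER_OPTS"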
Please help!
Thanks,
Meghana