Hi,

We are using Hadoop 0.19 on a small cluster of 7 machines (7 datanodes, 4
task trackers), and we typically have 3-4 jobs running at a time. We have
been seeing the following error on the JobTracker:

java.io.IOException: java.lang.OutOfMemoryError: GC overhead limit exceeded

It seems to be thrown by RunningJob.killJob(), and it usually appears a day
or so after the cluster starts up.

In the JobTracker's output file:
Exception in thread "initJobs" java.lang.OutOfMemoryError: GC overhead limit exceeded
at java.lang.String.substring(String.java:1939)
at java.lang.String.substring(String.java:1904)
at org.apache.hadoop.fs.Path.getName(Path.java:188)
at org.apache.hadoop.fs.ChecksumFileSystem.isChecksumFile(ChecksumFileSystem.java:70)
at org.apache.hadoop.fs.ChecksumFileSystem$1.accept(ChecksumFileSystem.java:442)
at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:726)
at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:748)
at org.apache.hadoop.fs.ChecksumFileSystem.listStatus(ChecksumFileSystem.java:457)
at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:723)
at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:748)
at org.apache.hadoop.mapred.JobHistory$JobInfo.getJobHistoryFileName(JobHistory.java:660)
at org.apache.hadoop.mapred.JobHistory$JobInfo.finalizeRecovery(JobHistory.java:746)
at org.apache.hadoop.mapred.JobTracker.finalizeJob(JobTracker.java:1532)
at org.apache.hadoop.mapred.JobInProgress.garbageCollect(JobInProgress.java:2232)
at org.apache.hadoop.mapred.JobInProgress.terminateJob(JobInProgress.java:1938)
at org.apache.hadoop.mapred.JobInProgress.terminate(JobInProgress.java:1953)
at org.apache.hadoop.mapred.JobInProgress.fail(JobInProgress.java:2012)
at org.apache.hadoop.mapred.EagerTaskInitializationListener$JobInitThread.run(EagerTaskInitializationListener.java:62)
at java.lang.Thread.run(Thread.java:619)


Please help!

Thanks,

Meghana


  • Arun C Murthy at Jul 5, 2011 at 7:31 am
    Meghana,

    What is the heapsize for your JT? Try increasing that.

    Also, we've fixed a huge number of issues in the JT (and Hadoop overall) since 0.19. Can you upgrade to 0.20.203, the latest stable release?

    thanks,
    Arun

    Sent from my iPhone
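
    One way to act on this advice is to raise the daemon heap in conf/hadoop-env.sh on the JobTracker node. This is a sketch against the classic hadoop-env.sh layout of the 0.19/0.20 line; the values and the GC log path are illustrative, not from this thread:

    ```sh
    # conf/hadoop-env.sh on the JobTracker node.
    # HADOOP_HEAPSIZE sets the daemon heap (-Xmx) in MB for Hadoop daemons
    # started on this host; pick a value that fits the machine's RAM.
    export HADOOP_HEAPSIZE=1024

    # Optionally enable GC logging so heap pressure is visible before the
    # next OOM. The log path below is a hypothetical example.
    export HADOOP_OPTS="$HADOOP_OPTS -verbose:gc -XX:+PrintGCDetails -Xloggc:/var/log/hadoop/jt-gc.log"
    ```

    The daemon must be restarted for the new heap size to take effect.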
  • Meghana at Jul 5, 2011 at 8:39 am
    Hey Arun,

The JT heap size (-Xmx) is 512m. Will try increasing it, thanks!
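
    Before or while bumping the heap, it can help to confirm the JobTracker really is thrashing in GC. A small sketch (not from this thread; assumes HotSpot -verbose:gc style log lines, and the sample timings below are made up):

    ```sh
    # Sample of HotSpot -verbose:gc output (hypothetical values); in practice,
    # point grep at the real GC log enabled on the JobTracker.
    cat > /tmp/jt-gc-sample.log <<'EOF'
    12.345: [GC 12345K->2345K(51200K), 0.0123456 secs]
    98.765: [Full GC 45678K->43210K(51200K), 1.2345678 secs]
    99.999: [Full GC 45999K->45000K(51200K), 1.5000000 secs]
    EOF

    # A run of back-to-back Full GC lines that reclaim almost nothing is the
    # signature of "GC overhead limit exceeded": the JVM spends nearly all its
    # time collecting.
    grep -c 'Full GC' /tmp/jt-gc-sample.log
    ```

    On the sample above this prints 2; on a JT heading for this OOM, the count climbs rapidly just before the error.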

    Yes, migrating to 0.20 is definitely on my to-do list, but some urgent
    issues have taken priority for now :(

    Thanks,

    ..meghana


Discussion Overview
group: common-user
categories: hadoop
posted: Jul 5, '11 at 6:12a
active: Jul 5, '11 at 8:39a
posts: 3
users: 2 (Meghana: 2 posts, Arun C Murthy: 1 post)
website: hadoop.apache.org...
irc: #hadoop
