Hi All,

I am running my mapred program in local mode by setting mapred.job.tracker
to local so that I can debug my code.
The mapred program is a direct port of my original sequential code; there
is no reduce phase.
Basically, I have just put my program in the map class.
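
(For reference, a minimal sketch of the local-mode configuration being
described -- the property is mapred.job.tracker, and the same two settings
appear in the code later in this thread:)

    JobConf conf = new JobConf(NerTagger.class);
    conf.set("mapred.job.tracker", "local");  // run the whole job in one JVM via LocalJobRunner
    conf.set("fs.default.name", "file:///");  // read from the local filesystem instead of HDFS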

My program takes around 1-2 min. to instantiate the data objects in the
constructor of the Map class (it loads some data model files, which takes
some time). After the instantiation part in the constructor of the Map
class, the map function is supposed to process the input split.

The problem is that the data objects never finish instantiating: partway
through (while it is still in the constructor) the program stops, giving the
exceptions pasted at the bottom.
The program runs fine without mapreduce and does not require more than 2GB
of memory, but in mapreduce the program fails even after doing export
HADOOP_HEAPSIZE=2500 (I am working on machines with 16GB RAM). I have also
set HADOOP_OPTS="-server -XX:-UseGCOverheadLimit", as I was sometimes also
getting GC Overhead Limit Exceeded exceptions.
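
(An aside on where those settings take effect: HADOOP_HEAPSIZE and
HADOOP_OPTS size the client JVM, which under the local runner is also the
JVM that executes the map task; on a real cluster the task heap instead
comes from mapred.child.java.opts. A sketch:)

    // Sizes the child JVM that runs each task on a cluster; the local
    // runner spawns no child JVM, so this has no effect in local mode.
    conf.set("mapred.child.java.opts", "-Xmx2560m -XX:-UseGCOverheadLimit");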

Somebody, please help me with this problem: I have been trying to debug it
for the last 3 days, but without success. Thanks!

java.lang.OutOfMemoryError: Java heap space
at sun.misc.FloatingDecimal.toJavaFormatString(FloatingDecimal.java:889)
at java.lang.Double.toString(Double.java:179)
at java.text.DigitList.set(DigitList.java:272)
at java.text.DecimalFormat.format(DecimalFormat.java:584)
at java.text.DecimalFormat.format(DecimalFormat.java:507)
at java.text.NumberFormat.format(NumberFormat.java:269)
at org.apache.hadoop.util.StringUtils.formatPercent(StringUtils.java:110)
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1147)
at LbjTagger.NerTagger.main(NerTagger.java:109)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:165)
at org.apache.hadoop.mapred.JobShell.run(JobShell.java:54)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
at org.apache.hadoop.mapred.JobShell.main(JobShell.java:68)

09/06/16 12:34:41 WARN mapred.LocalJobRunner: job_local_0001
java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:81)
at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:34)
at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:58)
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:83)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:328)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:138)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:79)
... 5 more
Caused by: java.lang.ThreadDeath
at java.lang.Thread.stop(Thread.java:715)
at org.apache.hadoop.mapred.LocalJobRunner.killJob(LocalJobRunner.java:310)
at org.apache.hadoop.mapred.JobClient$NetworkedJob.killJob(JobClient.java:315)
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1224)
at LbjTagger.NerTagger.main(NerTagger.java:109)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:165)
at org.apache.hadoop.mapred.JobShell.run(JobShell.java:54)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
at org.apache.hadoop.mapred.JobShell.main(JobShell.java:68)

  • Jason hadoop at Jun 17, 2009 at 2:26 am
    Is it possible that your map class is an inner class and not static?
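
    (A minimal sketch of the distinction being asked about: the framework
    creates the mapper reflectively, so a nested Mapper must be public and
    static with a no-argument constructor; a non-static inner class fails
    inside ReflectionUtils.newInstance, much like the trace above. Class
    names here are illustrative:)

        import java.io.IOException;
        import org.apache.hadoop.io.LongWritable;
        import org.apache.hadoop.io.Text;
        import org.apache.hadoop.mapred.*;

        public class StaticMapperSketch {
            // OK: a public static nested class has an implicit no-arg constructor,
            // so ReflectionUtils.newInstance(Map.class, conf) can create it.
            public static class Map extends MapReduceBase
                    implements Mapper<LongWritable, Text, Text, Text> {
                public void map(LongWritable key, Text value,
                                OutputCollector<Text, Text> output, Reporter reporter)
                        throws IOException {
                    output.collect(new Text("line"), value);
                }
            }
            // Not OK: a non-static inner class takes a hidden constructor argument
            // (the enclosing instance), so reflective no-arg instantiation throws.
        }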

    --
    Pro Hadoop, a book to guide you from beginner to hadoop mastery,
    http://www.amazon.com/dp/1430219424?tag=jewlerymall
    www.prohadoopbook.com a community for Hadoop Professionals
  • Akhil1988 at Jun 17, 2009 at 3:28 am
    Thank you Jason for your reply.

    My Map class is a nested class, and it is static. Here is the
    structure of my code:

    public class NerTagger {

        public static class Map extends MapReduceBase
                implements Mapper<LongWritable, Text, Text, Text> {
            private Text word = new Text();
            private static NETaggerLevel1 tagger1 = new NETaggerLevel1();
            private static NETaggerLevel2 tagger2 = new NETaggerLevel2();

            Map() {
                System.out.println("HI2\n");
                Parameters.readConfigAndLoadExternalData("Config/allLayer1.config");
                System.out.println("HI3\n");
                Parameters.forceNewSentenceOnLineBreaks = Boolean.parseBoolean("true");

                System.out.println("loading the tagger");
                tagger1 = (NETaggerLevel1) Classifier.binaryRead(Parameters.pathToModelFile + ".level1");
                System.out.println("HI5\n");
                tagger2 = (NETaggerLevel2) Classifier.binaryRead(Parameters.pathToModelFile + ".level2");
                System.out.println("Done- loading the tagger");
            }

            public void map(LongWritable key, Text value,
                            OutputCollector<Text, Text> output, Reporter reporter)
                    throws IOException {
                String inputline = value.toString();

                /* Processing of the input pair is done here */
            }
        }

        public static void main(String[] args) throws Exception {
            JobConf conf = new JobConf(NerTagger.class);
            conf.setJobName("NerTagger");

            conf.setOutputKeyClass(Text.class);
            conf.setOutputValueClass(IntWritable.class);

            conf.setMapperClass(Map.class);
            conf.setNumReduceTasks(0);

            conf.setInputFormat(TextInputFormat.class);
            conf.setOutputFormat(TextOutputFormat.class);

            conf.set("mapred.job.tracker", "local");
            conf.set("fs.default.name", "file:///");

            DistributedCache.addCacheFile(new URI("/home/akhil1988/Ner/OriginalNer/Data/"), conf);
            DistributedCache.addCacheFile(new URI("/home/akhil1988/Ner/OriginalNer/Config/"), conf);
            DistributedCache.createSymlink(conf);

            conf.set("mapred.child.java.opts", "-Xmx4096m");

            FileInputFormat.setInputPaths(conf, new Path(args[0]));
            FileOutputFormat.setOutputPath(conf, new Path(args[1]));

            System.out.println("HI1\n");

            JobClient.runJob(conf);
        }
    }

    Jason, when the program executes, HI1 and HI2 are printed but it never
    reaches HI3. In the statement
    Parameters.readConfigAndLoadExternalData("Config/allLayer1.config"); it is
    able to access the Config/allLayer1.config file (while executing this
    statement it prints some messages about which data it is loading, etc.),
    but it gets stuck there (while loading some classifier) and never reaches
    HI3.
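
    (A hedged aside: in the old API used above, MapReduceBase already provides
    a configure(JobConf) hook, which is the conventional place for expensive
    one-time setup such as this model loading -- a sketch of the same work
    moved there:)

        @Override
        public void configure(JobConf job) {
            // Same initialization the Map() constructor performs above,
            // relocated to the standard per-task setup hook.
            Parameters.readConfigAndLoadExternalData("Config/allLayer1.config");
            Parameters.forceNewSentenceOnLineBreaks = Boolean.parseBoolean("true");
            tagger1 = (NETaggerLevel1) Classifier.binaryRead(Parameters.pathToModelFile + ".level1");
            tagger2 = (NETaggerLevel2) Classifier.binaryRead(Parameters.pathToModelFile + ".level2");
        }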

    This program runs fine when executed normally (without mapreduce).

    Thanks, Akhil
  • Akhil1988 at Jun 17, 2009 at 3:32 am
    One more thing: it finally terminates there (after some time) by giving
    this final exception:

    java.io.IOException: Job failed!
    at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1217)
    at LbjTagger.NerTagger.main(NerTagger.java:109)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:165)
    at org.apache.hadoop.mapred.JobShell.run(JobShell.java:54)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
    at org.apache.hadoop.mapred.JobShell.main(JobShell.java:68)
  • Jason hadoop at Jun 17, 2009 at 4:43 am
    Something is happening inside your
    Parameters.readConfigAndLoadExternalData("Config/allLayer1.config") call,
    and the framework is killing the job for not heartbeating for 600 seconds.
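
    (The 600 seconds is the old mapred.task.timeout, which defaults to
    600000 ms. A sketch of the two usual remedies, assuming the long
    initialization is legitimate:)

        // 1. Give slow-initializing tasks more headroom before they are killed:
        conf.setLong("mapred.task.timeout", 30 * 60 * 1000L); // 30 minutes, in milliseconds

        // 2. Or, wherever the code has a Reporter (e.g. inside map()), call
        //    reporter.progress() periodically so the framework sees a heartbeat.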
  • Akhil1988 at Jun 17, 2009 at 1:56 pm
    Thanks Jason.

    I went inside the code of that statement and found that it eventually
    makes a binaryRead function call to read a binary file, and that is where
    it gets stuck.

    Do you know whether there is any problem with giving a binary file to the
    distributed cache?
    In the statement DistributedCache.addCacheFile(new
    URI("/home/akhil1988/Ner/OriginalNer/Data/"), conf); Data is a directory
    which contains some text files as well as some binary files. In the
    statement Parameters.readConfigAndLoadExternalData("Config/allLayer1.config");
    I can see (in the output messages) that it is able to read the text files,
    but it gets stuck at the binary files.

    So, I think the problem is that it is not able to read the binary files:
    either they have not been transferred to the cache, or a binary file
    cannot be read.

    Do you know the solution to this?

    Thanks,
    Akhil
  • Jason hadoop at Jun 17, 2009 at 2:12 pm
    I have only ever used the distributed cache to add files, including binary
    files such as shared libraries. It looks like you are adding a directory.

    The DistributedCache is not generally used for passing data, but for
    passing file names. The files must already be stored in a shared file
    system (HDFS, for simplicity).

    The distributed cache makes the names available to the tasks, and the
    files are extracted from HDFS and stored in the task-local work area on
    each tasktracker node. It looks like you may be storing the contents of
    your files in the distributed cache.
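
    (A sketch of the per-file pattern being described, assuming the model and
    config files have first been copied into HDFS; the fragment after '#'
    names the symlink created in the task's working directory, and the HDFS
    paths here are illustrative:)

        DistributedCache.addCacheFile(
            new URI("/user/akhil1988/Ner/Data/model.level1#model.level1"), conf);
        DistributedCache.addCacheFile(
            new URI("/user/akhil1988/Ner/Config/allLayer1.config#allLayer1.config"), conf);
        DistributedCache.createSymlink(conf);
        // A task can then open each file by its symlinked name, e.g.
        // new FileInputStream("model.level1").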
  • Akhil1988 at Jun 17, 2009 at 7:07 pm
    Hi Jason!

    Thanks for sticking with me on this problem.

    To restate things and make them easier to understand: I am working in
    local mode, in the directory that contains the job jar as well as the
    Config and Data directories.

    I just removed the following three statements from my code:

    DistributedCache.addCacheFile(new URI("/home/akhil1988/Ner/OriginalNer/Data/"), conf);
    DistributedCache.addCacheFile(new URI("/home/akhil1988/Ner/OriginalNer/Config/"), conf);
    DistributedCache.createSymlink(conf);

    The program still executes up to the same point as before and then
    terminates, so the three statements above have no effect in local mode.
    In local mode, the working directory of the map and reduce tasks is the
    current working directory in which you started the hadoop command that
    launched the job.

    Since I have removed the DistributedCache.addCacheFile calls, there
    should be no issue with whether I was passing a file name or a directory
    name as the argument. It now seems to me that there is some problem in
    reading the binary file using binaryRead.

    Please let me know if I am going wrong anywhere.

    Thanks,
    Akhil





    jason hadoop wrote:
    I have only ever used the distributed cache to add files, including
    binary files such as shared libraries. It looks like you are adding a
    directory.

    The DistributedCache is not generally used for passing data, but for
    passing file names. The files must already be stored in a shared file
    system (hdfs, for simplicity).

    The distributed cache makes the names available to the tasks, and the
    files are extracted from hdfs and stored in the task-local work area on
    each task tracker node. It looks like you may be storing the contents of
    your files in the distributed cache.
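
    A minimal sketch of the usage described above, against the 0.19-era API
    used in this thread. It assumes the individual files have already been
    uploaded to hdfs; the URI fragments name the symlinks created in each
    task's working directory, and the hdfs paths here are hypothetical:

    import java.net.URI;
    import org.apache.hadoop.filecache.DistributedCache;
    import org.apache.hadoop.mapred.JobConf;

    public class CacheSetup {
        public static void configureCache(JobConf conf) throws Exception {
            // Cache individual files that already live in hdfs, not local
            // directories. The fragment after '#' names the symlink that is
            // created in the task's working directory.
            DistributedCache.addCacheFile(
                new URI("hdfs:///user/akhil1988/Ner/Config/allLayer1.config#allLayer1.config"), conf);
            DistributedCache.addCacheFile(
                new URI("hdfs:///user/akhil1988/Ner/Data/model.level1#model.level1"), conf);
            DistributedCache.createSymlink(conf); // make the '#' links appear in the task cwd
            // Inside the task the files can then be opened by their symlink
            // names, e.g. Parameters.readConfigAndLoadExternalData("allLayer1.config");
        }
    }

    With that, the map-side code can keep using the same relative paths in
    both local and distributed runs.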
  • Akhil1988 at Jun 18, 2009 at 8:45 pm
    Hi Jason!

    I finally found out that the problem was in reserving the heap size,
    which I have now resolved. We cannot change HADOOP_HEAPSIZE using export
    from a user account after Hadoop has been started; it has to be changed
    by root.

    I have a user account on the cluster, and I was trying to change
    HADOOP_HEAPSIZE from my account using 'export', which had no effect. So
    I had to ask the cluster administrator to increase HADOOP_HEAPSIZE in
    hadoop-env.sh and then restart Hadoop. Now the program runs absolutely
    fine. Thanks for your help.

    One thing I would like to ask: can we use the DistributedCache for
    transferring directories to the local cache of the tasks?

    Thanks,
    Akhil
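
    For reference, the change described above amounts to editing
    conf/hadoop-env.sh on the cluster and restarting Hadoop; the value is in
    MB, and 4000 here is only an illustrative figure:

    # conf/hadoop-env.sh -- maximum heap, in MB, for the Hadoop daemons
    export HADOOP_HEAPSIZE=4000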



  • Jason hadoop at Jun 19, 2009 at 3:07 pm
    You can pass -D mapred.child.java.opts=-Xmx[some value] as part of your
    job, or set it in your job conf before the job is submitted. The
    per-task JVMs will then use that string as part of their JVM
    initialization parameters.

    The distributed cache is used for making files and archives that are
    stored in hdfs available in the local file system working area of your
    tasks.

    The GenericOptionsParser class that most Hadoop user interfaces use
    provides a couple of command line arguments that let you specify local
    file system files, which are copied into hdfs and then made available as
    stated above; -files and -libjars are the two arguments.

    My book has a solid discussion and example set for the distributed cache
    in chapter 5.
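
    A minimal sketch of both routes, assuming a driver built on
    Tool/ToolRunner so that GenericOptionsParser consumes the generic
    options (class and path names are illustrative):

    import org.apache.hadoop.conf.Configured;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapred.FileInputFormat;
    import org.apache.hadoop.mapred.FileOutputFormat;
    import org.apache.hadoop.mapred.JobClient;
    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.util.Tool;
    import org.apache.hadoop.util.ToolRunner;

    public class NerDriver extends Configured implements Tool {

        public int run(String[] args) throws Exception {
            // getConf() already contains any -D key=value options consumed
            // by GenericOptionsParser inside ToolRunner.run().
            JobConf conf = new JobConf(getConf(), NerDriver.class);
            // Alternatively, set the per-task JVM options in the job conf:
            conf.set("mapred.child.java.opts", "-Xmx2048m");
            // Mapper and input/output formats omitted here; see the job
            // setup earlier in the thread.
            FileInputFormat.setInputPaths(conf, new Path(args[0]));
            FileOutputFormat.setOutputPath(conf, new Path(args[1]));
            JobClient.runJob(conf);
            return 0;
        }

        public static void main(String[] args) throws Exception {
            // From the command line, e.g.:
            //   hadoop jar ner.jar NerDriver -D mapred.child.java.opts=-Xmx2048m \
            //       -files /home/akhil1988/Ner/OriginalNer/Config/allLayer1.config in out
            System.exit(ToolRunner.run(new NerDriver(), args));
        }
    }

    Note that both routes size only the task JVMs; in local mode the job
    runs inside the client JVM itself, which is why HADOOP_HEAPSIZE, rather
    than mapred.child.java.opts, turned out to be the limit earlier in this
    thread.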


Discussion Overview
group: common-user
categories: hadoop
posted: Jun 16, '09 at 5:51p
active: Jun 19, '09 at 3:07p
posts: 10
users: 2 (Akhil1988: 6 posts, Jason hadoop: 4 posts)
website: hadoop.apache.org...
irc: #hadoop
