Hi everyone,
Here is my problem:
First of all, I'm working with a single-node configuration.
I'm developing an application with just one map function; inside that map
function I call about 10 other functions. The application reads a CSV file
and processes a certain column. I already built the jar and everything, so
when I run the app against a CSV with 4000 rows on Windows 7 (under Cygwin)
on a 4 GB RAM machine, it works fine. But when I run it on Ubuntu Linux on a
2 GB RAM machine, it processes some rows and then throws a "Java heap space"
error, or sometimes the thread is killed.
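
Roughly, the mapper is shaped like this (a simplified sketch, not the real
code; ColumnMapper, COLUMN_INDEX and processColumn() are just placeholders):

    // Simplified sketch of the kind of map-only job described above,
    // using the 0.20-era org.apache.hadoop.mapreduce API.
    import java.io.IOException;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    public class ColumnMapper extends Mapper<LongWritable, Text, Text, Text> {
        private static final int COLUMN_INDEX = 3; // which CSV column to process

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split(",");
            if (fields.length <= COLUMN_INDEX) {
                return; // skip malformed rows
            }
            // processColumn() stands in for the ~10 helper functions; if those
            // helpers accumulate state across rows, that state is what would
            // eventually fill the task's heap.
            String result = processColumn(fields[COLUMN_INDEX]);
            context.write(new Text(fields[COLUMN_INDEX]), new Text(result));
        }

        private String processColumn(String raw) {
            return raw.trim().toLowerCase(); // placeholder for real processing
        }
    }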

On Linux:
I already tried changing Hadoop's exported HADOOP_HEAPSIZE, and also the
-Xmx and -Xms parameters of the app. It made some difference, but not much;
the error still happens...

Do you know why this is happening? Is it because of the 4 GB vs. 2 GB RAM
difference between the machines?

Thanks
--
View this message in context: http://old.nabble.com/hadoop-map-reduce-windows--linux-heap-space-tp30108246p30108246.html
Sent from the Hadoop core-user mailing list archive at Nabble.com.

  • Harsh J at Nov 1, 2010 at 8:33 pm

    On Tue, Nov 2, 2010 at 1:51 AM, KoRnE wrote:
    > On Linux I already tried changing Hadoop's exported HADOOP_HEAPSIZE,
    > and also the -Xmx and -Xms parameters of the app; it made some
    > difference, but not much, and the error still happens...
    Assuming you have already tuned mapred.child.java.opts (which defaults
    to 200 MB per mapper) in mapred-site.xml, you need to check your
    mapper's memory utilization. Things like reducing the amount of values
    you cache inside the mapper (or removing unnecessary ones in time)
    could help.
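
    For illustration, a minimal driver sketch of that knob (the 512 MB value
    and the class names are placeholders, not a recommendation for your job):

        // Sketch only: raises the per-task JVM heap; equivalent to setting
        // mapred.child.java.opts in mapred-site.xml (0.20-era property name).
        import org.apache.hadoop.conf.Configuration;
        import org.apache.hadoop.fs.Path;
        import org.apache.hadoop.io.Text;
        import org.apache.hadoop.mapreduce.Job;
        import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
        import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

        public class HeapTuningDriver {
            public static void main(String[] args) throws Exception {
                Configuration conf = new Configuration();
                // Overrides the 200 MB default (-Xmx200m) for each task JVM.
                conf.set("mapred.child.java.opts", "-Xmx512m");

                Job job = new Job(conf, "csv column job");
                job.setJarByClass(HeapTuningDriver.class);
                job.setMapperClass(ColumnMapper.class); // your mapper class here
                job.setNumReduceTasks(0);               // map-only, as described
                job.setOutputKeyClass(Text.class);
                job.setOutputValueClass(Text.class);
                FileInputFormat.addInputPath(job, new Path(args[0]));
                FileOutputFormat.setOutputPath(job, new Path(args[1]));
                System.exit(job.waitForCompletion(true) ? 0 : 1);
            }
        }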

    Another solution would be to reduce the size of the input splits the CSV
    files are divided into for each mapper (reduce the file's block size, or
    fiddle with mapred.min.split.size). This way your mapper should
    'naturally' consume less memory if you're caching values during its run.
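
    With the new (org.apache.hadoop.mapreduce) API there is also a maximum
    split size setting that can cap how much of a file one mapper receives;
    a hedged sketch (the 8 MB cap is an arbitrary example, and
    setMaxInputSplitSize assumes a 0.20-era release):

        // Sketch only: caps the input split size so each mapper processes a
        // smaller slice of the CSV; equivalent to mapred.max.split.size.
        import org.apache.hadoop.mapreduce.Job;
        import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;

        public class SplitTuning {
            // Call on the Job before submitting it (e.g. in a driver like above).
            public static void capSplitSize(Job job) {
                FileInputFormat.setMaxInputSplitSize(job, 8L * 1024 * 1024);
            }
        }
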
    > Do you know why this is happening? Is it because of the 4 GB vs. 2 GB
    > RAM difference between the machines?
    A mapper would consume only "mapred.child.java.opts" worth of memory,
    so it is definitely not the difference in RAM between the machines. I
    run CDH for development purposes on an ArchLinux desktop with 2 GB of
    RAM, and things run smoothly on it.


    --
    Harsh J
    www.harshj.com
