Hi,
I have a 4-node Hadoop 0.15.3 cluster. I am using the default config
files. I am running a MapReduce job to process 40 GB of log data.
Some reduce tasks are failing with the following errors:
1)
stderr
Exception in thread "org.apache.hadoop.io.ObjectWritable Connection
Culler" Exception in thread
"org.apache.hadoop.dfs.DFSClient$LeaseChecker@1b3f8f6"
java.lang.OutOfMemoryError: Java heap space
Exception in thread "IPC Client connection to /127.0.0.1:34691"
java.lang.OutOfMemoryError: Java heap space
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space

2)
stderr
Exception in thread "org.apache.hadoop.io.ObjectWritable Connection
Culler" java.lang.OutOfMemoryError: Java heap space

syslog:
2008-04-22 19:32:50,784 INFO org.apache.hadoop.mapred.ReduceTask:
task_200804212359_0007_r_000004_0 Merge of the 19 files in
InMemoryFileSystem complete. Local file is
/data/hadoop-im2/mapred/local/task_200804212359_0007_r_000004_0/map_22600.out
2008-04-22 20:34:16,012 INFO org.apache.hadoop.ipc.Client:
java.net.SocketException: Socket closed
at java.net.SocketInputStream.read(SocketInputStream.java:162)
at java.io.FilterInputStream.read(FilterInputStream.java:111)
at org.apache.hadoop.ipc.Client$Connection$1.read(Client.java:181)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
at java.io.BufferedInputStream.read(BufferedInputStream.java:235)
at java.io.DataInputStream.readInt(DataInputStream.java:353)
at org.apache.hadoop.ipc.Client$Connection.run(Client.java:258)

2008-04-22 20:34:16,032 WARN org.apache.hadoop.mapred.TaskTracker: Error
running child
java.lang.OutOfMemoryError: Java heap space
2008-04-22 20:34:16,031 INFO org.apache.hadoop.mapred.TaskRunner:
Communication exception: java.lang.OutOfMemoryError: Java heap space

Has anyone experienced a similar problem? Is there any configuration
change that can help resolve this issue?

Regards,
aj

  • Harish Mallipeddi at Apr 23, 2008 at 7:11 am
    Memory settings are in conf/hadoop-default.xml. You can override them in
    conf/hadoop-site.xml.

    Specifically, I think you would want to change mapred.child.java.opts.
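
    For example, a minimal sketch of that override in conf/hadoop-site.xml;
    the -Xmx512m value is illustrative (Arun suggests 512M later in this
    thread):

        <property>
          <name>mapred.child.java.opts</name>
          <value>-Xmx512m</value>
        </property>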

    --
    Harish Mallipeddi
    circos.com : poundbang.in/blog/
  • Amar Kamat at Apr 23, 2008 at 8:11 am

    Apurva Jadhav wrote:
    Hi,
    I have a 4-node Hadoop 0.15.3 cluster. I am using the default config
    files. I am running a MapReduce job to process 40 GB of log data.
    How many maps and reducers are there? Make sure that there is a
    sufficient number of reducers. Look at conf/hadoop-default.xml (see the
    mapred.child.java.opts parameter) to change the heap settings.
    Amar
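
    As a hedged sketch of both suggestions, applied per job through the
    old mapred API (the class name and the values are illustrative, not
    from this thread):

        import org.apache.hadoop.mapred.JobClient;
        import org.apache.hadoop.mapred.JobConf;

        public class HeapTuningExample {
          public static void main(String[] args) throws Exception {
            JobConf job = new JobConf(HeapTuningExample.class);
            // More reducers give each reduce a smaller slice of the data,
            // which lowers the in-memory merge pressure during the shuffle.
            job.setNumReduceTasks(12);
            // A larger heap for each child task JVM.
            job.set("mapred.child.java.opts", "-Xmx512m");
            // ... set the mapper, reducer, and input/output paths as usual ...
            JobClient.runJob(job);
          }
        }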

  • Apurva Jadhav at Apr 23, 2008 at 2:51 pm
    There are six reducers and 24000 mappers, because there are 24000 files.
    The number of tasks per node is 2.
    mapred.child.java.opts is at the default value of 200m. What is a good
    value for this? My mappers and reducers are fairly simple and do not
    make large allocations.
    Regards,
    aj


  • Arun C Murthy at Apr 24, 2008 at 7:08 am

    On Apr 23, 2008, at 7:51 AM, Apurva Jadhav wrote:

    There are six reducers and 24000 mappers, because there are 24000 files.
    The number of tasks per node is 2.
    mapred.child.java.opts is at the default value of 200m. What is a good
    value for this? My mappers and reducers are fairly simple and do not
    make large allocations.
    Try upping that to 512M.

    Arun

  • Ted Dunning at Apr 24, 2008 at 4:07 pm
    If these files are small, you will take a significant (but not
    massive) performance hit from having so many files.
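
    One hedged way to coalesce small inputs is a tiny FileSystem client
    like the sketch below (paths are illustrative, it assumes the input
    directory holds only plain files, and older releases may spell some
    of these calls differently):

        import java.io.InputStream;
        import java.io.OutputStream;
        import org.apache.hadoop.conf.Configuration;
        import org.apache.hadoop.fs.FileStatus;
        import org.apache.hadoop.fs.FileSystem;
        import org.apache.hadoop.fs.Path;

        // Concatenates every file in an input directory into one HDFS file,
        // so a follow-up job sees fewer, larger map inputs.
        public class CoalesceLogs {
          public static void main(String[] args) throws Exception {
            FileSystem fs = FileSystem.get(new Configuration());
            OutputStream out = fs.create(new Path(args[1]));
            byte[] buf = new byte[64 * 1024];
            for (FileStatus stat : fs.listStatus(new Path(args[0]))) {
              InputStream in = fs.open(stat.getPath());
              int n;
              while ((n = in.read(buf)) > 0) {
                out.write(buf, 0, n);
              }
              in.close();
            }
            out.close();
          }
        }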


  • Apurva Jadhav at Apr 24, 2008 at 4:47 pm
    I made two changes:
    1) Increased mapred.child.java.opts to 768m.
    2) Coalesced the files into a smaller number of larger files.

    This resolved my problem and reduced the running time by a factor of 3.

    Thanks for all the suggestions.




  • Arun C Murthy at Apr 24, 2008 at 7:09 am

    On Apr 23, 2008, at 7:51 AM, Apurva Jadhav wrote:

    There are six reducers and 24000 mappers, because there are 24000 files.
    The number of tasks per node is 2.
    mapred.child.java.opts is at the default value of 200m. What is a good
    value for this? My mappers and reducers are fairly simple and do not
    make large allocations.
    Also, how much RAM do you have on each machine? If you have
    sufficient RAM, 512M is a good value, provided you can afford it:
    512 MB * 4 tasks = 2 GB of task heap per node for 2 maps and 2 reduces.

    Arun
