Hi guys, I need to restart the discussion around the Mapper OutOfMemoryError.

I saw the same OOM error in my map-reduce job in the map phase.

1. I tried changing mapred.child.java.opts (bumped it to 600M).
2. io.sort.mb was kept at 100MB.
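For reference, these are the two settings in question as they would appear in hadoop-site.xml (values here just mirror the ones above):

```xml
<!-- hadoop-site.xml: heap for child task JVMs and the in-memory sort buffer -->
<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx600m</value>
</property>
<property>
  <name>io.sort.mb</name>
  <value>100</value>
</property>
```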

I still see the same errors.

I checked the size of "keyValBuffer" in collect() under a debugger; it is
always less than io.sort.mb and is spilled to disk properly.

I tried setting the number of map tasks to a very high value so that the
input is split into smaller chunks. That helped for a while, as the map job
got further (56% instead of 5%), but the problem is still there.
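In case it helps anyone reproduce this, the map count can be influenced like this (the value is hypothetical, and mapred.map.tasks is only a hint to the framework; the actual split size also depends on the DFS block size):

```xml
<!-- hadoop-site.xml: hint for the number of map tasks (hypothetical value) -->
<property>
  <name>mapred.map.tasks</name>
  <value>500</value>
</property>
```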

I tried bumping mapred.child.java.opts to 1000M; still got the same error.

I also tried adding -verbose:gc -Xloggc:/tmp/@taskid@.gc to the opts to get
a GC log, but I didn't get any log.
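One thing worth checking: the GC log is written by the child JVM on whichever tasktracker node runs the task, not on the machine submitting the job, so it may simply be sitting in /tmp on another machine. The full property would look something like this (Hadoop substitutes @taskid@ with the actual task id):

```xml
<!-- hadoop-site.xml: child JVM opts with GC logging enabled -->
<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx600m -verbose:gc -Xloggc:/tmp/@taskid@.gc</value>
</property>
```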

I tried using 'jmap -histo pid' to look at the heap, but it didn't show me
any meaningful or obvious problem point.

What else could be hogging memory during the map phase? Is the input file
split kept fully in memory?


My map-reduce job runs on about 2G of input. In the map phase I read each
line and output 5-500 (key, value) pairs, so the intermediate data is blown
up considerably. Could that be the problem?
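For a rough sense of the blow-up, here is a back-of-the-envelope estimate; the average line length, pairs per line, and pair size are assumptions for illustration, not measured values:

```python
# Rough estimate of the intermediate data produced by the map phase.
# All per-record sizes below are assumptions, not measurements.

input_bytes = 2 * 1024**3          # ~2 GB of input
avg_line_bytes = 100               # assumed average input line length
avg_pairs_per_line = 250           # midpoint of the 5-500 range
avg_pair_bytes = 50                # assumed serialized (key, value) size

num_lines = input_bytes // avg_line_bytes
intermediate_bytes = num_lines * avg_pairs_per_line * avg_pair_bytes
blowup = intermediate_bytes / input_bytes

print(f"intermediate data: ~{intermediate_bytes / 1024**3:.0f} GB "
      f"({blowup:.0f}x the input)")
```

Even a blow-up of this size should cost disk rather than heap, since the map output buffer is spilled once it approaches io.sort.mb; it would only exhaust the heap if individual records (or objects retained per record) were very large.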

The error file is attached:
http://www.nabble.com/file/p16628181/error.txt
View this message in context: http://www.nabble.com/Mapper-OutOfMemoryError-Revisited-%21%21-tp16628181p16628181.html
Sent from the Hadoop lucene-users mailing list archive at Nabble.com.

Discussion overview:
group: common-user
posted: Apr 11, '08 at 5:41p
active: Apr 11, '08 at 7:08p
users in discussion: Bhupesh Bansal (2 posts), Devaraj Das (1 post)
