I am trying to run 8 map tasks and 2 reduce tasks on 3 machines. Each map
task runs on a 6 MB text file, and there are 500 such files. The monitoring
page shows far fewer map tasks running than intended. Sometimes some nodes
don't get any tasks assigned at all, even though a large number of files
still need to be scheduled for the map operation. Is this due to how the
files are distributed across nodes? In fact, my file system is set to
local.

Some important parameters are listed below:
io.sort.factor = 100
io.sort.mb = 1000
io.file.buffer.size = 4096000
io.bytes.checksum = 128

mapred.map.tasks = 16
mapred.reduce.tasks = 2
mapred.tasktracker.tasks.maximum = 4
mapred.combine.buffer.size = 100000
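
For reference, here is a rough sketch of the arithmetic these settings imply. It is only an estimate under stated assumptions, not a description of how the scheduler actually behaves: it assumes one tasktracker per machine, that mapred.tasktracker.tasks.maximum caps the number of tasks a tracker runs at once, and that each 6 MB file becomes exactly one map task. The variable names are illustrative and are not Hadoop identifiers.

```python
import math

# Values taken from the post.
machines = 3
input_files = 500
tasks_per_tracker = 4  # mapred.tasktracker.tasks.maximum

# Assumption: one tasktracker per machine, each running at most
# tasks_per_tracker tasks at the same time.
concurrent_slots = machines * tasks_per_tracker

# Assumption: each input file is processed by exactly one map task.
map_tasks = input_files

# Number of rounds needed if every slot were busy in every round.
waves = math.ceil(map_tasks / concurrent_slots)

print(concurrent_slots)  # 12
print(waves)             # 42
```

Under these assumptions, at most 12 tasks could run across the cluster at any moment, so a monitoring page showing noticeably fewer than that would point to something other than the per-tracker limit.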


Is there any parameter I am missing to maximize the use of all CPUs?


Thanks,
VJ

Discussion Overview
group: common-user @
categories: hadoop
posted: May 12, '06 at 1:14a
active: May 12, '06 at 1:14a
posts: 1
users: 1
website: hadoop.apache.org...
irc: #hadoop

1 user in discussion

Vijay Murthi: 1 post
