Hi,
The optimization of one Hadoop job I'm running would benefit from knowing
the maximum number of map slots in the Hadoop cluster.
This number can be obtained (if my understanding is correct) by:
* parsing the mapred-site.xml file to get the value of
  mapred.tasktracker.map.tasks.maximum (assuming it is set, of course)
* parsing the slaves file to get the number of compute nodes in the cluster
* multiplying the two values
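
To make the file-based approach concrete, here is a rough sketch of what I
have in mind, written in Python for brevity. The function names and the
paths passed in are placeholders of my own, not part of any Hadoop API. It
assumes the usual <property><name>/<value> layout of mapred-site.xml and a
slaves file with one hostname per line (blank lines and # comments
ignored).

```python
import xml.etree.ElementTree as ET

# Property holding the per-node map slot limit (named in the steps above).
MAP_SLOTS_PROPERTY = "mapred.tasktracker.map.tasks.maximum"


def read_map_slots_per_node(mapred_site_path, default=None):
    """Return the per-node map slot limit, or `default` if it is not set."""
    root = ET.parse(mapred_site_path).getroot()
    for prop in root.iter("property"):
        if prop.findtext("name", "").strip() == MAP_SLOTS_PROPERTY:
            return int(prop.findtext("value", "").strip())
    return default


def count_slaves(slaves_path):
    """Count hostnames in the slaves file, skipping blanks and # comments."""
    count = 0
    with open(slaves_path) as f:
        for line in f:
            host = line.split("#", 1)[0].strip()
            if host:
                count += 1
    return count


def estimate_max_map_slots(mapred_site_path, slaves_path, default_per_node=None):
    """Multiply the per-node slot limit by the number of listed nodes.

    `default_per_node` is a hypothetical fallback supplied by the caller
    for the case where the property is absent from mapred-site.xml.
    """
    per_node = read_map_slots_per_node(mapred_site_path, default_per_node)
    if per_node is None:
        raise ValueError(MAP_SLOTS_PROPERTY + " is not set and no default was given")
    return per_node * count_slaves(slaves_path)
```

This only gives an estimate: it assumes every node uses the same value and
that every host listed in the slaves file is actually up, which is part of
why I would prefer an API call.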
My question is:
What are *all* the possible ways to get this information through API calls
(either the Hadoop Common API or the Hadoop MapReduce API), e.g. through a
Job object, through a Configuration object, ...?
Thanks in advance and have a great day,
Cyril