Hello all,
We have a 100-node Hadoop cluster that is used for multiple purposes. I want to run a few MapReduce jobs, and I know 4 to 5 slaves should be enough. Is there any way to restrict my jobs to use only 4 slaves instead of all 100? I have noticed that the more slaves a job uses, the more overhead there is.

Also, can I pass in Hadoop parameters like mapred.child.java.opts so that the actual child processes get the specified value for the max heap size? I want to set the heap size to 2G instead of going with the default.
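
For what it's worth, here is a minimal sketch of the kind of override I have in mind, assuming the standard mapred.child.java.opts property; it could go in mapred-site.xml or in the job's own configuration:

    <!-- raise each task JVM's max heap from the default (usually -Xmx200m) to 2 GB -->
    <property>
      <name>mapred.child.java.opts</name>
      <value>-Xmx2048m</value>
    </property>

If the job driver uses ToolRunner/GenericOptionsParser, the same value could presumably also be passed on the command line as -Dmapred.child.java.opts=-Xmx2048m.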

Thanks
Praveen


  • Jun Young Kim at Feb 16, 2011 at 5:23 am
    You can use the fair scheduler to have a job run on only part of the
    nodes in the cluster, by setting maximum/minimum map and reduce task
    counts for the pool the job is submitted to.

    Here is the documentation you can reference:

    http://hadoop.apache.org/mapreduce/docs/r0.21.0/fair_scheduler.html
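
    For example, here is a minimal sketch of an allocations file (the file
    pointed to by mapred.fairscheduler.allocation.file), using a hypothetical
    pool name "small-jobs" and illustrative numbers; note that the caps limit
    concurrent task slots rather than pinning the job to particular machines:

        <?xml version="1.0"?>
        <allocations>
          <!-- cap concurrent tasks so jobs in this pool use only a small
               part of the cluster instead of all 100 nodes -->
          <pool name="small-jobs">
            <maxMaps>8</maxMaps>
            <maxReduces>4</maxReduces>
          </pool>
        </allocations>

    A job can then be submitted into that pool by setting
    mapred.fairscheduler.pool=small-jobs in its configuration (for example
    with -Dmapred.fairscheduler.pool=small-jobs).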

    Junyoung Kim (juneng603@gmail.com)

Discussion Overview

group: common-user
categories: hadoop
posted: Feb 15, '11 at 9:33p
active: Feb 16, '11 at 5:23a
posts: 2
users: 2
website: hadoop.apache.org...
irc: #hadoop

2 users in discussion: Jun Young Kim (1 post), Praveen Peddi (1 post)
