I don't think the first two options will work: even if you stop the tasktracker, these to-be-retired nodes are still connected to the namenode.
Option 3 can work. You only need to add this exclude file on the namenode, and it is a regular file. Add a key named dfs.hosts.exclude to your conf/hadoop-site.xml file. The value associated with this key is the full path to a file on the NameNode's local file system that lists the machines which are not permitted to connect to HDFS.
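As a rough sketch, the entry in conf/hadoop-site.xml on the namenode might look like the fragment below. The path /home/hadoop/dfs.exclude is only a placeholder I made up for illustration; use whatever location suits your setup. The exclude file itself is plain text with one hostname per line, in the same format as the myjob.exclude file described in the question.

```xml
<!-- conf/hadoop-site.xml on the NameNode (sketch only) -->
<property>
  <name>dfs.hosts.exclude</name>
  <!-- Assumed example path: a regular file on the NameNode's local file system -->
  <value>/home/hadoop/dfs.exclude</value>
</property>
```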
Then run the command bin/hadoop dfsadmin -refreshNodes, and the cluster will decommission the nodes listed in the exclude file. This might take some time, as the cluster needs to move data from the retiring nodes to the remaining nodes.
After this you can use the retired nodes as a new cluster. But remember to remove those nodes from the slave nodes file; you can delete the exclude file afterward.
2010-11-04
shangan
From: Raj V
Sent: 2010-11-04 10:05:44
To: common-user
Cc:
Subject: Two questions.
1. I have a 512-node cluster. I need to have 32 nodes do something else. They
can be datanodes but I cannot run any map or reduce jobs on them. So I see three
options.
1. Stop the tasktracker on those nodes. Leave the datanode running.
2. Set mapred.tasktracker.reduce.tasks.maximum and
mapred.tasktracker.map.tasks.maximum to 0 on these nodes and make these final.
3. Use the parameter mapred.hosts.exclude.
I am assuming that any of the three methods would work. To start with, I went
with option 3. I used a local file /home/hadoop/myjob.exclude and the file
myjob.exclude had the hostname of one host per line (hadoop-480 .. hadoop-511).
But I see both map and reduce jobs being scheduled to all the 511 nodes.
I understand there is an inherent inefficiency in running only the data node on
these 32 nodes.
Here are my questions.
1. Will all three methods work?
2. If I choose method 3, does this file exist as a dfs file or a regular file?
If it is a regular file, does it need to exist on all the nodes or only on the node where
the job is submitted?
Many thanks in advance.
Raj