Hey hdfs gurus -

One of my clusters is going through disk upgrades, and during the transition
period not all machines have a homogeneous disk layout. At first I looked
into auto-generating dfs.data.dir based on each machine's current profile,
but then looked at how disks are actually made available to the datanode.

Looking at makeInstance() (quoted below), each directory listed in
dfs.data.dir is tested and, if usable, added to the list of storage
directories. If at least one directory is usable, a new datanode is started
with just the usable ones.

Does it seem reasonable to push a single config to all hosts listing the
full set of disks? As machines are upgraded and the new mount points appear,
the new disks will be picked up. Machines not yet upgraded will simply skip
the missing directories (the datanode will not have permission to create
them).

public static DataNode makeInstance(String[] dataDirs, Configuration conf)
    throws IOException {
  ArrayList<File> dirs = new ArrayList<File>();
  for (int i = 0; i < dataDirs.length; i++) {
    File data = new File(dataDirs[i]);
    try {
      // Verifies the directory exists (creating it if possible) and is
      // readable/writable; throws DiskErrorException otherwise.
      DiskChecker.checkDir(data);
      dirs.add(data);
    } catch (DiskErrorException e) {
      LOG.warn("Invalid directory in dfs.data.dir: " + e.getMessage());
    }
  }
  if (dirs.size() > 0)
    return new DataNode(conf, dirs);
  LOG.error("All directories in dfs.data.dir are invalid.");
  return null;
}
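As a sanity check before pushing the change, here's a quick standalone
sketch (my own code, not part of Hadoop; the class name and the way the
config is loaded are just for illustration) that runs the same DiskChecker
test against each entry in dfs.data.dir, to show which directories a given
host would use or skip:

import java.io.File;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.util.DiskChecker;
import org.apache.hadoop.util.DiskChecker.DiskErrorException;

public class DataDirPreflight {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    conf.addResource("hdfs-site.xml");
    // dfs.data.dir is a comma-separated list of storage directories.
    for (String dir : conf.get("dfs.data.dir", "").split(",")) {
      File data = new File(dir.trim());
      try {
        DiskChecker.checkDir(data); // same test makeInstance() applies
        System.out.println("usable:  " + data);
      } catch (DiskErrorException e) {
        System.out.println("skipped: " + data + " (" + e.getMessage() + ")");
      }
    }
  }
}

Running that on an un-upgraded host should report the new mount points as
skipped and everything else as usable.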

Thoughts?

--travis
