When files are loaded into HDFS, if a node has multiple entries for
dfs.data.dir, how does Hadoop pick which directory to store blocks in? Does
it intelligently pick the partition with the most free space,
or is it round-robin, or perhaps random?
We keep running into a problem where a DataNode runs out of space
because data keeps being written to the partition with less free space.
Here's some info about the cluster:
7 nodes, all identical hardware, running Hadoop 0.18.3.
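For reference, this is roughly how the setting looks in our hadoop-site.xml; the mount-point paths here are just examples, not our actual layout:

```xml
<!-- Multiple comma-separated directories for DataNode block storage.
     Paths below are hypothetical examples. -->
<property>
  <name>dfs.data.dir</name>
  <value>/mnt/disk1/hdfs/data,/mnt/disk2/hdfs/data</value>
</property>
```

The two paths are on separate partitions of different sizes, which is where the imbalance shows up.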
Any feedback would be greatly appreciated.