logs, which will grow very large in heavily-used clusters (userlogs in
particular).
/ for OS and logs
/mount* for mapred.local.dir and dfs.data.dir
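As a rough sketch of that layout, the relevant properties might look something like the following. This assumes three data mounts, and the dfs/data and mapred/local subdirectory names are only illustrative; adjust both to match your machines.

```xml
<!-- hdfs-site.xml: HDFS block storage on the data mounts only.
     Three mounts and the dfs/data subdirectory are assumptions. -->
<property>
  <name>dfs.data.dir</name>
  <value>/mount1/dfs/data,/mount2/dfs/data,/mount3/dfs/data</value>
  <final>true</final>
</property>

<!-- mapred-site.xml: MapReduce scratch space on the same data mounts.
     The mapred/local subdirectory is an assumption. -->
<property>
  <name>mapred.local.dir</name>
  <value>/mount1/mapred/local,/mount2/mapred/local,/mount3/mapred/local</value>
  <final>true</final>
</property>
```

Logs can then stay on the OS drive by pointing HADOOP_LOG_DIR (set in conf/hadoop-env.sh) at a path on the root filesystem.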
Hope this helps.
Alex
On Wed, Jul 7, 2010 at 10:38 AM, A Levine wrote:
I am trying to configure a large install and I have a question about
the configuration of Data Nodes. Each data node has multiple drives.
Each drive is 1TB in size. In the hdfs-site.xml, I can have multiple
directories (which will be mounted drives) specified as shown by:
<property>
<name>dfs.data.dir</name>
<value>/mount1,/mount2,/mount3,....</value>
<final>true</final>
</property>
On the drive that holds the OS, only 100 GB will be used for the OS itself.
Is it good practice to put a dfs.data.dir partition on the rest of that
drive? Will this slow things down? Will the difference in space available
to each directory be a problem? Also, if it is not a good idea to use the
OS drive for data, then how about pointing logs to that drive?
andrew