On Tue, Mar 8, 2011 at 11:16 AM, Ankita Kalantri
wrote:
1) In a Hadoop cluster, when I write a file into HDFS, the replication
factor that's followed - is it corresponding to the dfs.replication
parameter set in hdfs-site.xml file in the master ? Or should
something else be done about that (like changing the dfs.replication
parameter of all the nodes in the cluster by editing the xml file ?)
Having the replication config on the master should be sufficient (it
is used at the NameNode, for default value purposes). But replication
is a file-level property: it can be set in code when a file is created,
and it can be changed at any time for existing files as well
(try `hadoop dfs -setrep` for an example).
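
For reference, a minimal sketch of the entry being discussed, as it would
appear in hdfs-site.xml. The value 2 is only an illustration, not a
recommendation:

```xml
<configuration>
  <property>
    <!-- Default replication factor for newly created files -->
    <name>dfs.replication</name>
    <value>2</value>
  </property>
</configuration>
```

Changing this default only affects files created afterwards; existing
files keep the replication factor they already have until it is changed
explicitly.
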
2) dfs.replication = 2 does that mean there are 2 replicates and a
total of 3 copies / does that mean total number of copies = 2 ?
Replication=2 means one original block + one replica, so 2 is the total
number of copies.
--
Harsh J
www.harshj.com