Hi all,
I have installed CDH4.2 with MRv1 through Cloudera Manager Free Edition 4.5
on a cluster composed of 2 VMs. All the services work fine in the cluster;
the only problem is that the only DataNode that works is the one on the
master.
The cluster has one machine (reporting-00) which is the master and runs the
NameNode, the Secondary NameNode and 1 DataNode. The other machine in the
cluster (reporting-01) runs only 1 DataNode. Each machine has an IP in the
172.x.x.x range configured on its eth0 interface, and the machines
communicate with each other through a NAT network with IPs in the 10.x.x.x
range.
When HDFS is started, the following exception is thrown in the DataNode log
on the slave machine:
Initialization failed for block pool Block pool BP-317827939-172.16.70.45-1362082408283 (storage id DS-725878305-172.16.70.60-50010-1362082420714) service to reporting-00/10.70.7.200:8020
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.protocol.DisallowedDatanodeException): Datanode denied communication with namenode: DatanodeRegistration(10.70.8.2, storageID=DS-725878305-172.16.70.60-50010-1362082420714, infoPort=50075, ipcPort=50020, storageInfo=lv=-40;cid=cluster12;nsid=365875204;c=0)
at org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.registerDatanode(DatanodeManager.java:570)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.registerDatanode(FSNamesystem.java:3447)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.registerDatanode(NameNodeRpcServer.java:868)
at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolServerSideTranslatorPB.registerDatanode(DatanodeProtocolServerSideTranslatorPB.java:91)
...
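If I am reading the addresses in that exception correctly, the layout is
roughly the following (these values are taken from the log above, so please
correct me if I am misreading it):

  reporting-00  eth0: 172.16.70.45  NAT: 10.70.7.200  (NameNode)
  reporting-01  eth0: 172.16.70.60  NAT: 10.70.8.2    (failing DataNode)

so the slave DataNode seems to register with its NAT address (10.70.8.2),
while its storage ID was created against its eth0 address (172.16.70.60).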
During the installation everything went fine; there were a couple of
warnings about some IP differences, but I ignored them and the installation
completed without errors.
The line defining each machine in the corresponding /etc/hosts file has the
form:
172.x.x.x reporting-0y (y = number of the corresponding machine)
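For concreteness, using the eth0 addresses that appear in the exception
above (so these exact values are my reading of the log, not a verbatim copy
of the files), each /etc/hosts contains entries like:

  172.16.70.45 reporting-00
  172.16.70.60 reporting-01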
All values in the Hadoop configuration files are left at their defaults.
One detail that might be important: I have another cluster with the same
network configuration running plain Hadoop 1.0.3, and it works fine.
Any ideas on how to approach this?
Thanks.