Hadoop Cluster setup on EC2 instances [Ubuntu 12.04 x64 based machines]
Hi All, I'm trying to set up a Hadoop cluster using 4 machines [4 x Ubuntu 12.04 x64], following this doc:
1. http://titan.softnet.tuc.gr:8082/User:xenia/Page_Title/Hadoop_Cluster_Setup_Tutorial
I'm able to set up the cluster with the required configuration, and I can see that all the required services on the master and slave nodes are running as expected [please see the jps command output below]. The problem I'm facing is that the HDFS and MapReduce daemons running on the master can be accessed from the master only, and not from the slave machines. Note that I've opened these ports in the EC2 security group, and I can browse the master machine's UI from a web browser using: http://<machine ip>:50070/dfshealth.jsp

To restate the problem: both HDFS and the JobTracker are accessible from the master machine [I'm using the master as both NameNode and DataNode], but the two ports they use [HDFS: 54310 and MapReduce: 54320] are not reachable from the other slave nodes.
I ran netstat -puntl on the master machine and got this:
hadoop@nutchcluster1:~/hadoop$ netstat -puntl
(Not all processes could be identified, non-owned process info
will not be shown, you would have to be root to see it all.)
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address      Foreign Address  State   PID/Program name
tcp    0      0     0.0.0.0:22         0.0.0.0:*        LISTEN  -
tcp6   0      0     :::50020           :::*             LISTEN  6224/java
tcp6   0      0     127.0.0.1:54310    :::*             LISTEN  6040/java
tcp6   0      0     127.0.0.1:32776    :::*             LISTEN  6723/java
tcp6   0      0     :::57065           :::*             LISTEN  6040/java
tcp6   0      0     :::50090           :::*             LISTEN  6401/java
tcp6   0      0     :::50060           :::*             LISTEN  6723/java
tcp6   0      0     :::50030           :::*             LISTEN  6540/java
tcp6   0      0     127.0.0.1:54320    :::*             LISTEN  6540/java
tcp6   0      0     :::45747           :::*             LISTEN  6401/java
tcp6   0      0     :::33174           :::*             LISTEN  6540/java
tcp6   0      0     :::50070           :::*             LISTEN  6040/java
tcp6   0      0     :::22              :::*             LISTEN  -
tcp6   0      0     :::54424           :::*             LISTEN  6224/java
tcp6   0      0     :::50010           :::*             LISTEN  6224/java
tcp6   0      0     :::50075           :::*             LISTEN  6224/java
udp    0      0     0.0.0.0:68         0.0.0.0:*                -
hadoop@nutchcluster1:~/hadoop$

As can be seen in the output, both the HDFS and MapReduce daemons are listening, but they are bound to 127.0.0.1 rather than 0.0.0.0, so they cannot be reached from any other machine/the slave machines:
tcp6   0      0     127.0.0.1:54310    :::*             LISTEN  6040/java
tcp6   0      0     127.0.0.1:54320    :::*             LISTEN  6540/java

To confirm, I did this on the master:
hadoop@nutchcluster1:~/hadoop$ bin/hadoop fs -ls hdfs://nutchcluster1:54310/
Found 1 items
drwxr-xr-x - hadoop supergroup 0 2012-12-02 12:53 /home
hadoop@nutchcluster1:~/hadoop$
But when I run the same command on a slave, I get this:
hadoop@nutchcluster2:~/hadoop$ bin/hadoop fs -ls hdfs://nutchcluster1:54310/
12/12/02 15:42:16 INFO ipc.Client: Retrying connect to server: nutchcluster1/10.4.39.23:54310. Already tried 0 time(s).
12/12/02 15:42:17 INFO ipc.Client: Retrying connect to server: nutchcluster1/10.4.39.23:54310. Already tried 1 time(s).
12/12/02 15:42:18 INFO ipc.Client: Retrying connect to server: nutchcluster1/10.4.39.23:54310. Already tried 2 time(s).
12/12/02 15:42:19 INFO ipc.Client: Retrying connect to server: nutchcluster1/10.4.39.23:54310. Already tried 3 time(s).
12/12/02 15:42:20 INFO ipc.Client: Retrying connect to server: nutchcluster1/10.4.39.23:54310. Already tried 4 time(s).
12/12/02 15:42:21 INFO ipc.Client: Retrying connect to server: nutchcluster1/10.4.39.23:54310. Already tried 5 time(s).
12/12/02 15:42:22 INFO ipc.Client: Retrying connect to server: nutchcluster1/10.4.39.23:54310. Already tried 6 time(s).
12/12/02 15:42:23 INFO ipc.Client: Retrying connect to server: nutchcluster1/10.4.39.23:54310. Already tried 7 time(s).
12/12/02 15:42:24 INFO ipc.Client: Retrying connect to server: nutchcluster1/10.4.39.23:54310. Already tried 8 time(s).
12/12/02 15:42:25 INFO ipc.Client: Retrying connect to server: nutchcluster1/10.4.39.23:54310. Already tried 9 time(s).
Bad connection to FS. command aborted. exception: Call to nutchcluster1/10.4.39.23:54310 failed on connection exception: java.net.ConnectException: Connection refused
hadoop@nutchcluster2:~/hadoop$
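As an independent check outside Hadoop, something like the following should show the same picture (just a sketch; it assumes netcat is installed on the slave, and uses the host/ports from my setup above):

hadoop@nutchcluster2:~$ nc -zv nutchcluster1 54310   # NameNode RPC port: connection refused, since it is bound to 127.0.0.1 on the master
hadoop@nutchcluster2:~$ nc -zv nutchcluster1 50070   # NameNode web UI: should succeed, since it listens on all interfaces (:::50070)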


The configurations are as below:
--------------core-site.xml content is as below:
<property>
<name>hadoop.tmp.dir</name>
<value>/home/hadoop/hadoop/datastore/hadoop-${user.name}</value>
<description>A base for other temporary directories</description>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://nutchcluster1:54310</value>
<description>The name of the default file system. A URI whose
scheme and authority determine the FileSystem implementation. The
uri's scheme determines the config property (fs.SCHEME.impl) naming
the FileSystem implementation class. The uri's authority is used to
determine the host, port, etc. for a filesystem.</description>
</property>

-----------------hdfs-site.xml content is as below:
<configuration>
<property>
<name>dfs.replication</name>
<value>2</value>
<description>Default block replication.
The actual number of replications can be specified when the file is created.
The default is used if replication is not specified in create time.
</description>
</property>
</configuration>


------------------mapred-site.xml content is as below:
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>nutchcluster1:54320</value>
<description>The host and port that the MapReduce job tracker runs
at. If "local", then jobs are run in-process as a single map
and reduce task.
</description>
</property>
<property>
<name>mapred.map.tasks</name>
<value>40</value>
<description>As a rule of thumb, use 10x the number of slaves (i.e.,
number of tasktrackers).
</description>
</property>
<property>
<name>mapred.reduce.tasks</name>
<value>40</value>
<description>As a rule of thumb, use 2x the number of slave processors
(i.e., number of tasktrackers).
</description>
</property>
</configuration>

I replicated all of the above on the other 3 slave machines [1 master + 3 slaves in total]. My /etc/hosts content on the master node is below. Note that I have the same content on the slaves as well; the only difference is that each machine maps its own hostname to 127.0.0.1 and the others to their exact IPs:
------------------------------/etc/hosts content:
127.0.0.1 localhost
127.0.0.1 nutchcluster1
10.111.59.96 nutchcluster2
10.201.223.79 nutchcluster3
10.190.117.68 nutchcluster4
# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts
-----------------/etc/hosts content ends here
File content for masters is:
nutchcluster1

and file content for slaves is:
nutchcluster1
nutchcluster2
nutchcluster3
nutchcluster4
Then, I copied all the relevant contents of the config folder [*-site.xml, *.env files] to all the slaves.
As per the steps, I'm starting HDFS using bin/start-dfs.sh and then MapReduce using bin/start-mapred.sh. After running these two on my master machine [nutchcluster1], I see the following jps output:
hadoop@nutchcluster1:~/hadoop$ jps
6401 SecondaryNameNode
6723 TaskTracker
6224 DataNode
6540 JobTracker
7354 Jps
6040 NameNode
hadoop@nutchcluster1:~/hadoop$

and on the slaves, the jps output is:
hadoop@nutchcluster2:~/hadoop$ jps
8952 DataNode
9104 TaskTracker
9388 Jps
hadoop@nutchcluster2:~/hadoop$

All of this indicates that port 54310 is accessible from the master only and not from the slaves. This is the point I'm stuck at, and I would appreciate it if someone could point out which config is missing or wrong. Any comments/feedback in this regard would be highly appreciated. Thanks in advance.

Regards, DW


  • A Geek at Dec 2, 2012 at 4:39 pm
    Hi, just to add the version details: I'm running Apache Hadoop release 1.0.4 with jdk1.6.0_37. The underlying Ubuntu 12.04 machine has 300GB of disk space, 1.7GB of RAM, and a single core.
    Regards, DW
  • Harsh J at Dec 2, 2012 at 4:43 pm
    Your problem is that your /etc/hosts file has the line:

    127.0.0.1 nutchcluster1

    Just delete that line, restart your services. You intend your hostname
    "nutchcluster1" to be externally accessible, so aliasing it to the
    loopback address (127.0.0.1) is not right.
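    For example, the master's /etc/hosts would then start roughly like this (a sketch; 10.4.39.23 is the master's LAN IP as seen in your slave's connection log):

    127.0.0.1 localhost
    10.4.39.23 nutchcluster1
    10.111.59.96 nutchcluster2
    10.201.223.79 nutchcluster3
    10.190.117.68 nutchcluster4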
    --
    Harsh J
  • Nitin Pawar at Dec 2, 2012 at 6:49 pm
    Also, if you want to set up a Hadoop cluster on AWS, just try using Whirr; basically, it does everything for you.
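    A minimal sketch of what that looks like, along the lines of the Whirr quick start (the cluster name is arbitrary, and the identity/credential values are placeholders you fill in with your own AWS keys):

    $ cat hadoop.properties
    whirr.cluster-name=myhadoopcluster
    whirr.instance-templates=1 hadoop-namenode+hadoop-jobtracker,3 hadoop-datanode+hadoop-tasktracker
    whirr.provider=aws-ec2
    whirr.identity=<your AWS access key ID>
    whirr.credential=<your AWS secret access key>

    $ bin/whirr launch-cluster --config hadoop.properties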

    --
    Nitin Pawar
  • A Geek at Dec 3, 2012 at 3:41 am
    Thanks Harsh. As per your comments, I removed the loopback-address entry for the hostname and added the LAN IP instead, copied the same content to all 3 slave machines, and everything started working.
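    For anyone hitting the same issue, the checks to repeat after the restart are roughly the same commands as in my original mail: the 54310/54320 listeners should no longer show 127.0.0.1, and the listing from a slave should succeed:

    hadoop@nutchcluster1:~/hadoop$ netstat -puntl | grep -E '54310|54320'
    hadoop@nutchcluster2:~/hadoop$ bin/hadoop fs -ls hdfs://nutchcluster1:54310/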
    Thanks Nitin for pointing me to Whirr. I'd taken a quick look at Whirr earlier, but thought it might be complex to set up, so I did everything manually. Now it looks like Whirr is quite a useful tool; I'll take a look.
    Thanks to the Hadoop community, my cluster is now up and running.

    Regards, DW
