I have three nodes in the same situation as described above. Pinging
and ssh-ing to the master node work fine, but the logs are full of the
following message:
"INFO org.apache.hadoop.ipc.Client: Retrying connect to server:
hadoopmaster/192.168.12.1:54310"
The JobTracker sees just one node (http://hadoopmaster:50030/jobtracker.jsp,
probably the local one). The hadoopmaster node has the following listening
process:
"127.0.0.1:54310 0.0.0.0:* LISTEN 1001 17909 5131/java"
so the problem is probably elsewhere.
Do you have any idea what could be wrong?
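One observation on the listing above: the local address 127.0.0.1:54310 is a
loopback address, and a socket bound only to loopback normally cannot be
reached from other machines, which might be consistent with the retries in the
slave logs. Below is a minimal Python sketch for checking this, assuming Python
is available on the nodes. The function names is_loopback_only and can_connect
are made up for illustration and are not part of Hadoop.

```python
import ipaddress
import socket


def is_loopback_only(local_address: str) -> bool:
    """Return True if a 'host:port' local address is bound to a loopback IP.

    Intended for the 'Local Address' column of a listening-socket listing,
    e.g. '127.0.0.1:54310'. Wildcards such as '0.0.0.0' or '*' return False.
    """
    host, _, _port = local_address.rpartition(":")
    host = host.strip("[]")
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not a literal IP address (for example '*'), so not loopback-only.
        return False


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False
```

Running can_connect("hadoopmaster", 54310) on a slave would show whether that
port is reachable from there. Ping and ssh succeeding do not test that port.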
On 11 August 2011 12:19, V@ni wrote:
Hi, this could be an SSH configuration error. Reinstall SSH and try connecting
to the master node. Hope it works.
This link might be useful:
http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-multi-node-cluster/
sachinites wrote:
View this message in context:
http://old.nabble.com/slave-nodes-could-not-connect-to-Master-.-tp32223105p32240913.html
Sent from the Hadoop core-dev mailing list archive at Nabble.com.
Sir, I have tried all the forums but could not resolve this
problem. Please help me out.
I followed your tutorial to run Hadoop, first running a datanode
locally on the same machine, and it all worked fine.
Then I configured Hadoop to run the namenode, secondarynamenode, a
jobtracker and a datanode on one machine, the master, with two other
machines as slaves, for a total of three datanodes. start-all.sh successfully
starts all daemons on all required nodes.
But it seems that only the datanode running locally on my machine is
executing the whole job; the other two slaves are starved of a
connection to the master. Their tasktracker logs read (same on both
slaves):
2011-08-09 07:09:54,099 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
2011-08-09 07:09:54,100 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 37874: starting
2011-08-09 07:09:54,103 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 37874: starting
2011-08-09 07:09:54,104 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 37874: starting
2011-08-09 07:09:54,104 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 37874: starting
2011-08-09 07:09:54,105 INFO org.apache.hadoop.ipc.Server: IPC Server handler 3 on 37874: starting
2011-08-09 07:09:54,105 INFO org.apache.hadoop.mapred.TaskTracker: TaskTracker up at: localhost/127.0.0.1:37874
2011-08-09 07:09:54,105 INFO org.apache.hadoop.mapred.TaskTracker: Starting tracker tracker_gislab-desktop:localhost/127.0.0.1:37874
2011-08-09 07:09:55,145 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: /10.14.11.32:9001. Already tried 0 time(s).
2011-08-09 07:09:56,146 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: /10.14.11.32:9001. Already tried 1 time(s).
2011-08-09 07:09:57,147 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: /10.14.11.32:9001. Already tried 2 time(s).
2011-08-09 07:09:58,148 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: /10.14.11.32:9001. Already tried 3 time(s).
2011-08-09 07:09:59,149 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: /10.14.11.32:9001. Already tried 4 time(s).
2011-08-09 07:10:00,150 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: /10.14.11.32:9001. Already tried 5 time(s).
2011-08-09 07:10:01,151 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: /10.14.11.32:9001. Already tried 6 time(s).
2011-08-09 07:10:02,151 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: /10.14.11.32:9001. Already tried 7 time(s).
2011-08-09 07:10:03,152 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: /10.14.11.32:9001. Already tried 8 time(s).
2011-08-09 07:10:04,153 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: /10.14.11.32:9001. Already tried 9 time(s).
2011-08-09 07:10:04,156 INFO org.apache.hadoop.ipc.RPC: Server at /10.14.11.32:9001 not available yet, Zzzzz...
2011-08-09 07:10:06,158 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: /10.14.11.32:9001. Already tried 0 time(s).
2011-08-09 07:10:07,159 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: /10.14.11.32:9001. Already tried 1 time(s).
.... and so on. I tried re-enabling the IPv6 address, but in vain. The
data in HDFS is distributed across all datanodes, though. This confirms it is
not a network problem either. I can ssh without a password in all directions.
Sir, please help. I shall be grateful to you.
Abhishek
Masters Student
IIT Bombay
INDIA
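The line "TaskTracker up at: localhost/127.0.0.1:37874" in the log above shows
the tasktracker advertising a loopback address. One possible cause, offered
here as an assumption and not a confirmed diagnosis, is that the machine's
hostname resolves to a loopback address (for example through an /etc/hosts
entry). Below is a small Python sketch to check how a name resolves;
resolves_to_loopback is a hypothetical helper, not a Hadoop API.

```python
import ipaddress
import socket


def resolves_to_loopback(hostname: str) -> bool:
    """Return True if hostname resolves (IPv4) to an address in 127.0.0.0/8."""
    try:
        address = socket.gethostbyname(hostname)
    except OSError:
        # The name could not be resolved at all; treat it as "not loopback".
        return False
    return ipaddress.ip_address(address).is_loopback


if __name__ == "__main__":
    # Check the name this machine reports for itself.
    name = socket.gethostname()
    print(name, "resolves to loopback:", resolves_to_loopback(name))
```

Run on each node, this would report whether the node's own hostname maps to
127.x.x.x instead of its LAN address.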