I am trying to back up my Hadoop cluster with distcp so that I can
upgrade it without losing all of my data. I am trying to copy from
nn01-qa.ny3/ to medium01.ny3/backup.

Right now, it doesn't work. How can I fix this problem?

Both clusters are CDH3u5 on CentOS 6.3 x64 on the same VMware ESXi 5.1
host. The network is completely open to each host and all firewalls are
turned off. The only strange thing is that even when I specify port 8020
explicitly, Hadoop tries to use port 8021. (Does HDFS guarantee integer
integrity? Just kidding.)

*Logs from the source side:*
2013-02-10 04:39:35 at 97 as root@nn01-qa.ny3 in ~
# hadoop distcp hdfs://nn01-qa.ny3:8020/ hdfs://medium01.ny3:8020/backup
13/02/10 04:39:45 INFO tools.DistCp: srcPaths=[hdfs://nn01-qa.ny3:8020/]
13/02/10 04:39:45 INFO tools.DistCp: destPath=hdfs://medium01.ny3:8020/backup
13/02/10 04:39:47 INFO ipc.Client: Retrying connect to server: nn01-qa.ny3/10.24.9.100:8021. Already tried 0 time(s).
13/02/10 04:39:48 INFO ipc.Client: Retrying connect to server: nn01-qa.ny3/10.24.9.100:8021. Already tried 1 time(s).
13/02/10 04:39:49 INFO ipc.Client: Retrying connect to server: nn01-qa.ny3/10.24.9.100:8021. Already tried 2 time(s).
13/02/10 04:39:50 INFO ipc.Client: Retrying connect to server: nn01-qa.ny3/10.24.9.100:8021. Already tried 3 time(s).
13/02/10 04:39:51 INFO ipc.Client: Retrying connect to server: nn01-qa.ny3/10.24.9.100:8021. Already tried 4 time(s).
13/02/10 04:39:52 INFO ipc.Client: Retrying connect to server: nn01-qa.ny3/10.24.9.100:8021. Already tried 5 time(s).
13/02/10 04:39:53 INFO ipc.Client: Retrying connect to server: nn01-qa.ny3/10.24.9.100:8021. Already tried 6 time(s).
13/02/10 04:39:54 INFO ipc.Client: Retrying connect to server: nn01-qa.ny3/10.24.9.100:8021. Already tried 7 time(s).
13/02/10 04:39:55 INFO ipc.Client: Retrying connect to server: nn01-qa.ny3/10.24.9.100:8021. Already tried 8 time(s).
13/02/10 04:39:56 INFO ipc.Client: Retrying connect to server: nn01-qa.ny3/10.24.9.100:8021. Already tried 9 time(s).
With failures, global counters are inaccurate; consider running with -i
Copy failed: java.net.ConnectException: Call to nn01-qa.ny3/10.24.9.100:8021 failed on connection exception: java.net.ConnectException: Connection refused
at org.apache.hadoop.ipc.Client.wrapException(Client.java:1179)
at org.apache.hadoop.ipc.Client.call(Client.java:1155)
at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:226)
at org.apache.hadoop.mapred.$Proxy1.getProtocolVersion(Unknown Source)
at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:398)
at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:384)
at org.apache.hadoop.mapred.JobClient.createRPCProxy(JobClient.java:511)
at org.apache.hadoop.mapred.JobClient.init(JobClient.java:496)
at org.apache.hadoop.mapred.JobClient.<init>(JobClient.java:479)
at org.apache.hadoop.tools.DistCp.setup(DistCp.java:1018)
at org.apache.hadoop.tools.DistCp.copy(DistCp.java:666)
at org.apache.hadoop.tools.DistCp.run(DistCp.java:881)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
at org.apache.hadoop.tools.DistCp.main(DistCp.java:908)
Caused by: java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:519)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:484)
at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:468)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:575)
at org.apache.hadoop.ipc.Client$Connection.access$2300(Client.java:212)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1292)
at org.apache.hadoop.ipc.Client.call(Client.java:1121)
... 13 more


2013-02-10 04:39:56 at 98 as root@nn01-qa.ny3 in ~
# netstat -an | grep 8020
tcp        0      0 10.24.9.100:8020      0.0.0.0:*             LISTEN
tcp        0      0 10.24.9.100:33205     10.24.9.100:8020      TIME_WAIT
tcp        0      0 10.24.9.100:8020      10.24.9.102:45745     ESTABLISHED
tcp        0      0 10.24.9.100:33192     10.24.9.100:8020      TIME_WAIT
tcp        0      0 10.24.9.100:8020      10.24.9.101:59276     ESTABLISHED
tcp        0      0 10.24.9.100:8020      10.24.9.103:37636     ESTABLISHED

2013-02-10 04:40:23 at 99 as root@nn01-qa.ny3 in ~
# netstat -an | grep 8021

2013-02-10 04:40:25 at 100 as root@nn01-qa.ny3 in ~
# service iptables status
iptables: Firewall is not running.

2013-02-10 04:44:11 at 102 as root@nn01-qa.ny3 in ~
# telnet medium01.ny3 8020
Trying 10.24.5.120...
Connected to medium01.ny3.
Escape character is '^]'.

Oh Hai!
Connection closed by foreign host.

2013-02-10 04:44:27 at 103 as root@nn01-qa.ny3 in ~
# telnet medium01.ny3 8021
Trying 10.24.5.120...
telnet: connect to address 10.24.5.120: Connection refused



*Logs from the destination side:*
2013-02-10 04:32:24 at 96 as root@medium01.ny3 in ~/tmp/cm
# hadoop distcp hdfs://nn01-qa.ny3:8020/ hdfs://medium01.ny3:8020/backup/
13/02/10 04:35:08 INFO tools.DistCp: srcPaths=[hdfs://nn01-qa.ny3:8020/]
13/02/10 04:35:08 INFO tools.DistCp: destPath=hdfs://medium01.ny3:8020/backup
13/02/10 04:35:10 INFO ipc.Client: Retrying connect to server: medium01.ny3/10.24.5.120:8021. Already tried 0 time(s).
13/02/10 04:35:11 INFO ipc.Client: Retrying connect to server: medium01.ny3/10.24.5.120:8021. Already tried 1 time(s).
13/02/10 04:35:12 INFO ipc.Client: Retrying connect to server: medium01.ny3/10.24.5.120:8021. Already tried 2 time(s).
13/02/10 04:35:13 INFO ipc.Client: Retrying connect to server: medium01.ny3/10.24.5.120:8021. Already tried 3 time(s).
13/02/10 04:35:14 INFO ipc.Client: Retrying connect to server: medium01.ny3/10.24.5.120:8021. Already tried 4 time(s).
13/02/10 04:35:15 INFO ipc.Client: Retrying connect to server: medium01.ny3/10.24.5.120:8021. Already tried 5 time(s).
13/02/10 04:35:16 INFO ipc.Client: Retrying connect to server: medium01.ny3/10.24.5.120:8021. Already tried 6 time(s).
13/02/10 04:35:17 INFO ipc.Client: Retrying connect to server: medium01.ny3/10.24.5.120:8021. Already tried 7 time(s).
13/02/10 04:35:18 INFO ipc.Client: Retrying connect to server: medium01.ny3/10.24.5.120:8021. Already tried 8 time(s).
13/02/10 04:35:19 INFO ipc.Client: Retrying connect to server: medium01.ny3/10.24.5.120:8021. Already tried 9 time(s).
With failures, global counters are inaccurate; consider running with -i
Copy failed: java.net.ConnectException: Call to medium01.ny3/10.24.5.120:8021 failed on connection exception: java.net.ConnectException: Connection refused
at org.apache.hadoop.ipc.Client.wrapException(Client.java:1179)
at org.apache.hadoop.ipc.Client.call(Client.java:1155)
at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:226)
at org.apache.hadoop.mapred.$Proxy1.getProtocolVersion(Unknown Source)
at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:398)
at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:384)
at org.apache.hadoop.mapred.JobClient.createRPCProxy(JobClient.java:511)
at org.apache.hadoop.mapred.JobClient.init(JobClient.java:496)
at org.apache.hadoop.mapred.JobClient.<init>(JobClient.java:479)
at org.apache.hadoop.tools.DistCp.setup(DistCp.java:1018)
at org.apache.hadoop.tools.DistCp.copy(DistCp.java:666)
at org.apache.hadoop.tools.DistCp.run(DistCp.java:881)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
at org.apache.hadoop.tools.DistCp.main(DistCp.java:908)
Caused by: java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:519)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:484)
at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:468)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:575)
at org.apache.hadoop.ipc.Client$Connection.access$2300(Client.java:212)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1292)
at org.apache.hadoop.ipc.Client.call(Client.java:1121)
... 13 more


2013-02-10 04:35:19 at 97 as root@medium01.ny3 in ~/tmp/cm
# netstat -an | grep 8020
tcp        0      0 10.24.5.120:8020      0.0.0.0:*             LISTEN
tcp        0      0 10.24.5.120:60726     10.24.5.120:8020      ESTABLISHED
tcp        0      0 10.24.5.120:8020      10.24.5.120:60726     ESTABLISHED
tcp        0      0 10.24.5.120:60798     10.24.5.120:8020      TIME_WAIT

2013-02-10 04:37:14 at 98 as root@medium01.ny3 in ~/tmp/cm
# netstat -an | grep 8021

2013-02-10 04:37:15 at 99 as root@medium01.ny3 in ~/tmp/cm
# service iptables status
iptables: Firewall is not running.

2013-02-10 04:43:32 at 104 as root@medium01.ny3 in ~/tmp/cm
# telnet nn01-qa.ny3 8020
Trying 10.24.9.100...
Connected to nn01-qa.ny3.
Escape character is '^]'.


Hi!
Connection closed by foreign host.

*Logs from a slave node in the source cluster:*
2013-02-10 04:45:01 at 72 as root@sn01-qa.ny3 in ~/tmp/centos
# hadoop distcp hdfs://nn01-qa.ny3:8020/ hdfs://medium01.ny3:8020/backup
13/02/10 04:45:13 INFO tools.DistCp: srcPaths=[hdfs://nn01-qa.ny3:8020/]
13/02/10 04:45:13 INFO tools.DistCp: destPath=hdfs://medium01.ny3:8020/backup
13/02/10 04:45:15 INFO ipc.Client: Retrying connect to server: nn01-qa.ny3/10.24.9.100:8021. Already tried 0 time(s).
13/02/10 04:45:16 INFO ipc.Client: Retrying connect to server: nn01-qa.ny3/10.24.9.100:8021. Already tried 1 time(s).
13/02/10 04:45:17 INFO ipc.Client: Retrying connect to server: nn01-qa.ny3/10.24.9.100:8021. Already tried 2 time(s).
13/02/10 04:45:18 INFO ipc.Client: Retrying connect to server: nn01-qa.ny3/10.24.9.100:8021. Already tried 3 time(s).
13/02/10 04:45:19 INFO ipc.Client: Retrying connect to server: nn01-qa.ny3/10.24.9.100:8021. Already tried 4 time(s).
13/02/10 04:45:20 INFO ipc.Client: Retrying connect to server: nn01-qa.ny3/10.24.9.100:8021. Already tried 5 time(s).
13/02/10 04:45:21 INFO ipc.Client: Retrying connect to server: nn01-qa.ny3/10.24.9.100:8021. Already tried 6 time(s).
^C
2013-02-10 04:45:21 at 73 as root@sn01-qa.ny3 in ~/tmp/centos
# hadoop fs -get /user/root/tmp/CentOS-6.3-x86_64-netinstall.iso.gz.9 ./CentOS-6.3-x86_64-netinstall.iso.gz.9

2013-02-10 04:45:32 at 74 as root@sn01-qa.ny3 in ~/tmp/centos
# l
total 188M
drwxr-xr-x 2 root root 4.0K Feb 10 04:45 .
drwxr-xr-x. 4 root root 4.0K Feb 10 04:04 ..
-rw-r--r-- 1 root root 188M Feb 10 04:45 CentOS-6.3-x86_64-netinstall.iso.gz.9

*How can I fix this problem?*


  • Harsh J at Feb 10, 2013 at 6:49 am
    Hi Gordon,

    DistCp is a MapReduce-based copying job, and it's looking to connect to
    8021 because that's the JobTracker port of the cluster where the job is
    to be submitted. Does this help clarify your situation and error?
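
To illustrate the point above: the 8020 in the hdfs:// URIs only addresses the NameNodes, while the job itself goes to whatever JobTracker address the client configuration names. The following is a minimal sketch, not a definitive procedure, for reading that address and checking whether anything is listening there. It assumes the MRv1 property name `mapred.job.tracker`, a bash shell (for `/dev/tcp`), and a config file at `/etc/hadoop/conf/mapred-site.xml`; the path and the helper names `jobtracker_addr` and `port_open` are illustrative and do not come from this thread.

```shell
#!/usr/bin/env bash

# Print the host:port configured as mapred.job.tracker in a Hadoop XML
# config file. Assumes the <value> element follows its <name> element,
# on the same line or a later one, as in a typical mapred-site.xml.
jobtracker_addr() {
    local conf="$1"
    awk '
        /<name>mapred\.job\.tracker<\/name>/ { found = 1 }
        found && /<value>/ {
            sub(/.*<value>/, "")     # drop everything up to the opening tag
            sub(/<\/value>.*/, "")   # drop the closing tag and anything after
            print
            exit
        }
    ' "$conf"
}

# Return 0 if a TCP connection to host:port succeeds, non-zero otherwise.
# Uses the bash-specific /dev/tcp redirection inside a subshell so the
# file descriptor is closed again as soon as the check finishes.
port_open() {
    local host="$1" port="$2"
    (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null
}

# Example use (hypothetical path):
#   addr=$(jobtracker_addr /etc/hadoop/conf/mapred-site.xml)
#   if port_open "${addr%:*}" "${addr##*:}"; then
#       echo "JobTracker reachable at $addr"
#   else
#       echo "nothing listening at $addr"
#   fi
```

If the check reports nothing listening, that matches the "Connection refused" on port 8021 in the logs above, and it points at a JobTracker that is not running rather than at a network or firewall problem.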
    --
    Harsh J

  • Gordon Fogus at Feb 11, 2013 at 4:48 pm
    Thank you, that was helpful. I started the JobTracker and MapReduce
    services and ran the request again and I no longer have the original error.

    I now have a NullPointerException with a "global counters are inaccurate"
    message. How do I correct these counters? What do they count?

    # hadoop distcp hdfs://nn01-qa.ny3:8020/ hdfs://medium01.ny3:8020/backup/
    13/02/11 16:45:35 INFO tools.DistCp: srcPaths=[hdfs://nn01-qa.ny3:8020/]
    13/02/11 16:45:35 INFO tools.DistCp: destPath=hdfs://medium01.ny3:8020/backup
    With failures, global counters are inaccurate; consider running with -i
    Copy failed: java.lang.NullPointerException
    at org.apache.hadoop.tools.DistCp.makeRelative(DistCp.java:925)
    at org.apache.hadoop.tools.DistCp.setup(DistCp.java:1112)
    at org.apache.hadoop.tools.DistCp.copy(DistCp.java:666)
    at org.apache.hadoop.tools.DistCp.run(DistCp.java:881)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
    at org.apache.hadoop.tools.DistCp.main(DistCp.java:908)


    On 10 February 2013 06:49, Harsh J wrote:

    Hi Gordon,

    DistCp is a MapReduce-based copying job and its looking to connect to
    8021 cause thats the JobTracker port of the place where the job's to
    be submitted. Does this help clarify your situation and error?
    On Sun, Feb 10, 2013 at 10:20 AM, Gordon Fogus wrote:
    I am trying to backup (with distcp) my Hadoop cluster so that I can upgrade
    it without losing all of my data. I am trying to go from nn01-qa.ny3/ to
    medium01.ny3/backup

    Right now, it doesn't work. How can I fix this problem?

    Both clusters are CDH3u5 on CentOS 6.3 x64 on the same VMware ESXi 5.1 host.
    The network is completely open to each host and all firewalls are turned
    off. The only strange thing is that even when I specify port 8020
    specifically, Hadoop tries to use port 8021. (Does HDFS guarantee integer
    integrity? Just kidding.)

    Logs from the source side:
    2013-02-10 04:39:35 at 97 as root@nn01-qa.ny3 in ~
    # hadoop distcp hdfs://nn01-qa.ny3:8020/ hdfs://medium01.ny3:8020/backup
    13/02/10 04:39:45 INFO tools.DistCp: srcPaths=[hdfs://nn01-qa.ny3:8020/]
    13/02/10 04:39:45 INFO tools.DistCp:
    destPath=hdfs://medium01.ny3:8020/backup
    13/02/10 04:39:47 INFO ipc.Client: Retrying connect to server:
    nn01-qa.ny3/10.24.9.100:8021. Already tried 0 time(s).
    13/02/10 04:39:48 INFO ipc.Client: Retrying connect to server:
    nn01-qa.ny3/10.24.9.100:8021. Already tried 1 time(s).
    13/02/10 04:39:49 INFO ipc.Client: Retrying connect to server:
    nn01-qa.ny3/10.24.9.100:8021. Already tried 2 time(s).
    13/02/10 04:39:50 INFO ipc.Client: Retrying connect to server:
    nn01-qa.ny3/10.24.9.100:8021. Already tried 3 time(s).
    13/02/10 04:39:51 INFO ipc.Client: Retrying connect to server:
    nn01-qa.ny3/10.24.9.100:8021. Already tried 4 time(s).
    13/02/10 04:39:52 INFO ipc.Client: Retrying connect to server:
    nn01-qa.ny3/10.24.9.100:8021. Already tried 5 time(s).
    13/02/10 04:39:53 INFO ipc.Client: Retrying connect to server:
    nn01-qa.ny3/10.24.9.100:8021. Already tried 6 time(s).
    13/02/10 04:39:54 INFO ipc.Client: Retrying connect to server:
    nn01-qa.ny3/10.24.9.100:8021. Already tried 7 time(s).
    13/02/10 04:39:55 INFO ipc.Client: Retrying connect to server:
    nn01-qa.ny3/10.24.9.100:8021. Already tried 8 time(s).
    13/02/10 04:39:56 INFO ipc.Client: Retrying connect to server:
    nn01-qa.ny3/10.24.9.100:8021. Already tried 9 time(s).
    With failures, global counters are inaccurate; consider running with -i
    Copy failed: java.net.ConnectException: Call to
    nn01-qa.ny3/10.24.9.100:8021 failed on connection exception:
    java.net.ConnectException: Connection refused
    at org.apache.hadoop.ipc.Client.wrapException(Client.java:1179)
    at org.apache.hadoop.ipc.Client.call(Client.java:1155)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:226)
    at org.apache.hadoop.mapred.$Proxy1.getProtocolVersion(Unknown
    Source)
    at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:398)
    at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:384)
    at
    org.apache.hadoop.mapred.JobClient.createRPCProxy(JobClient.java:511)
    at org.apache.hadoop.mapred.JobClient.init(JobClient.java:496)
    at org.apache.hadoop.mapred.JobClient.<init>(JobClient.java:479)
    at org.apache.hadoop.tools.DistCp.setup(DistCp.java:1018)
    at org.apache.hadoop.tools.DistCp.copy(DistCp.java:666)
    at org.apache.hadoop.tools.DistCp.run(DistCp.java:881)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
    at org.apache.hadoop.tools.DistCp.main(DistCp.java:908)
    Caused by: java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at
    sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
    at
    org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:519)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:484)
    at
    org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:468)
    at
    org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:575)
    at
    org.apache.hadoop.ipc.Client$Connection.access$2300(Client.java:212)
    at org.apache.hadoop.ipc.Client.getConnection(Client.java:1292)
    at org.apache.hadoop.ipc.Client.call(Client.java:1121)
    ... 13 more


    2013-02-10 04:39:56 at 98 as root@nn01-qa.ny3 in ~
    # netstat -an | grep 8020
    tcp 0 0 10.24.9.100:8020 0.0.0.0:*
    LISTEN
    tcp 0 0 10.24.9.100:33205 10.24.9.100:8020
    TIME_WAIT
    tcp 0 0 10.24.9.100:8020 10.24.9.102:45745
    ESTABLISHED
    tcp 0 0 10.24.9.100:33192 10.24.9.100:8020
    TIME_WAIT
    tcp 0 0 10.24.9.100:8020 10.24.9.101:59276
    ESTABLISHED
    tcp 0 0 10.24.9.100:8020 10.24.9.103:37636
    ESTABLISHED

    2013-02-10 04:40:23 at 99 as root@nn01-qa.ny3 in ~
    # netstat -an | grep 8021

    2013-02-10 04:40:25 at 100 as root@nn01-qa.ny3 in ~
    # service iptables status
    iptables: Firewall is not running.

    2013-02-10 04:44:11 at 102 as root@nn01-qa.ny3 in ~
    # telnet medium01.ny3 8020
    Trying 10.24.5.120...
    Connected to medium01.ny3.
    Escape character is '^]'.

    Oh Hai!
    Connection closed by foreign host.

    2013-02-10 04:44:27 at 103 as root@nn01-qa.ny3 in ~
    # telnet medium01.ny3 8021
    Trying 10.24.5.120...
    telnet: connect to address 10.24.5.120: Connection refused




    Logs from the destination side:
    2013-02-10 04:32:24 at 96 as root@medium01.ny3 in ~/tmp/cm
    # hadoop distcp hdfs://nn01-qa.ny3:8020/
    hdfs://medium01.ny3:8020/backup/
    13/02/10 04:35:08 INFO tools.DistCp: srcPaths=[hdfs://nn01-qa.ny3:8020/]
    13/02/10 04:35:08 INFO tools.DistCp:
    destPath=hdfs://medium01.ny3:8020/backup
    13/02/10 04:35:10 INFO ipc.Client: Retrying connect to server:
    medium01.ny3/10.24.5.120:8021. Already tried 0 time(s).
    13/02/10 04:35:11 INFO ipc.Client: Retrying connect to server:
    medium01.ny3/10.24.5.120:8021. Already tried 1 time(s).
    13/02/10 04:35:12 INFO ipc.Client: Retrying connect to server:
    medium01.ny3/10.24.5.120:8021. Already tried 2 time(s).
    13/02/10 04:35:13 INFO ipc.Client: Retrying connect to server:
    medium01.ny3/10.24.5.120:8021. Already tried 3 time(s).
    13/02/10 04:35:14 INFO ipc.Client: Retrying connect to server: medium01.ny3/10.24.5.120:8021. Already tried 4 time(s).
    13/02/10 04:35:15 INFO ipc.Client: Retrying connect to server: medium01.ny3/10.24.5.120:8021. Already tried 5 time(s).
    13/02/10 04:35:16 INFO ipc.Client: Retrying connect to server: medium01.ny3/10.24.5.120:8021. Already tried 6 time(s).
    13/02/10 04:35:17 INFO ipc.Client: Retrying connect to server: medium01.ny3/10.24.5.120:8021. Already tried 7 time(s).
    13/02/10 04:35:18 INFO ipc.Client: Retrying connect to server: medium01.ny3/10.24.5.120:8021. Already tried 8 time(s).
    13/02/10 04:35:19 INFO ipc.Client: Retrying connect to server: medium01.ny3/10.24.5.120:8021. Already tried 9 time(s).
    With failures, global counters are inaccurate; consider running with -i
    Copy failed: java.net.ConnectException: Call to medium01.ny3/10.24.5.120:8021 failed on connection exception: java.net.ConnectException: Connection refused
        at org.apache.hadoop.ipc.Client.wrapException(Client.java:1179)
        at org.apache.hadoop.ipc.Client.call(Client.java:1155)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:226)
        at org.apache.hadoop.mapred.$Proxy1.getProtocolVersion(Unknown Source)
        at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:398)
        at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:384)
        at org.apache.hadoop.mapred.JobClient.createRPCProxy(JobClient.java:511)
        at org.apache.hadoop.mapred.JobClient.init(JobClient.java:496)
        at org.apache.hadoop.mapred.JobClient.<init>(JobClient.java:479)
        at org.apache.hadoop.tools.DistCp.setup(DistCp.java:1018)
        at org.apache.hadoop.tools.DistCp.copy(DistCp.java:666)
        at org.apache.hadoop.tools.DistCp.run(DistCp.java:881)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
        at org.apache.hadoop.tools.DistCp.main(DistCp.java:908)
    Caused by: java.net.ConnectException: Connection refused
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
        at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
        at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:519)
        at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:484)
        at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:468)
        at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:575)
        at org.apache.hadoop.ipc.Client$Connection.access$2300(Client.java:212)
        at org.apache.hadoop.ipc.Client.getConnection(Client.java:1292)
        at org.apache.hadoop.ipc.Client.call(Client.java:1121)
        ... 13 more


    2013-02-10 04:35:19 at 97 as root@medium01.ny3 in ~/tmp/cm
    # netstat -an | grep 8020
    tcp 0 0 10.24.5.120:8020 0.0.0.0:* LISTEN
    tcp 0 0 10.24.5.120:60726 10.24.5.120:8020 ESTABLISHED
    tcp 0 0 10.24.5.120:8020 10.24.5.120:60726 ESTABLISHED
    tcp 0 0 10.24.5.120:60798 10.24.5.120:8020 TIME_WAIT

    2013-02-10 04:37:14 at 98 as root@medium01.ny3 in ~/tmp/cm
    # netstat -an | grep 8021

    2013-02-10 04:37:15 at 99 as root@medium01.ny3 in ~/tmp/cm
    # service iptables status
    iptables: Firewall is not running.

    2013-02-10 04:43:32 at 104 as root@medium01.ny3 in ~/tmp/cm
    # telnet nn01-qa.ny3 8020
    Trying 10.24.9.100...
    Connected to nn01-qa.ny3.
    Escape character is '^]'.


    Hi!
    Connection closed by foreign host.


    Logs from a slave node in the source cluster:
    2013-02-10 04:45:01 at 72 as root@sn01-qa.ny3 in ~/tmp/centos
    # hadoop distcp hdfs://nn01-qa.ny3:8020/ hdfs://medium01.ny3:8020/backup
    13/02/10 04:45:13 INFO tools.DistCp: srcPaths=[hdfs://nn01-qa.ny3:8020/]
    13/02/10 04:45:13 INFO tools.DistCp: destPath=hdfs://medium01.ny3:8020/backup
    13/02/10 04:45:15 INFO ipc.Client: Retrying connect to server: nn01-qa.ny3/10.24.9.100:8021. Already tried 0 time(s).
    13/02/10 04:45:16 INFO ipc.Client: Retrying connect to server: nn01-qa.ny3/10.24.9.100:8021. Already tried 1 time(s).
    13/02/10 04:45:17 INFO ipc.Client: Retrying connect to server: nn01-qa.ny3/10.24.9.100:8021. Already tried 2 time(s).
    13/02/10 04:45:18 INFO ipc.Client: Retrying connect to server: nn01-qa.ny3/10.24.9.100:8021. Already tried 3 time(s).
    13/02/10 04:45:19 INFO ipc.Client: Retrying connect to server: nn01-qa.ny3/10.24.9.100:8021. Already tried 4 time(s).
    13/02/10 04:45:20 INFO ipc.Client: Retrying connect to server: nn01-qa.ny3/10.24.9.100:8021. Already tried 5 time(s).
    13/02/10 04:45:21 INFO ipc.Client: Retrying connect to server: nn01-qa.ny3/10.24.9.100:8021. Already tried 6 time(s).
    ^C
    2013-02-10 04:45:21 at 73 as root@sn01-qa.ny3 in ~/tmp/centos
    # hadoop fs -get /user/root/tmp/CentOS-6.3-x86_64-netinstall.iso.gz.9 ./CentOS-6.3-x86_64-netinstall.iso.gz.9

    2013-02-10 04:45:32 at 74 as root@sn01-qa.ny3 in ~/tmp/centos
    # l
    total 188M
    drwxr-xr-x 2 root root 4.0K Feb 10 04:45 .
    drwxr-xr-x. 4 root root 4.0K Feb 10 04:04 ..
    -rw-r--r-- 1 root root 188M Feb 10 04:45 CentOS-6.3-x86_64-netinstall.iso.gz.9


    How can I fix this problem?

    --
    Harsh J
  • Gordon Fogus at Feb 11, 2013 at 8:46 pm
    Interesting. I found some bug reports of distcp failing with a
    NullPointerException when the source is the root path (in my case,
    hdfs://nn01-qa.ny3/). That turned out to be exactly my case! I copied
    /user and /hbase to the destination separately, and distcp worked.

    Is this a widely known issue?
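
A minimal sketch of the workaround described above: copy the top-level directories one at a time so that the root path is never the distcp source. The thread does not show the exact commands that were run, so the destination layout (/backup/user, /backup/hbase) and the helper name `distcp_commands` are assumptions. The function only prints the commands so they can be reviewed before anything is copied.

```shell
#!/bin/sh
# Source and destination URIs as given in the thread.
SRC="hdfs://nn01-qa.ny3:8020"
DST="hdfs://medium01.ny3:8020/backup"

# distcp_commands (hypothetical helper): print one distcp command per
# top-level directory, so the root path "/" is never used as the source.
distcp_commands() {
    for dir in "$@"; do
        printf 'hadoop distcp %s%s %s%s\n' "$SRC" "$dir" "$DST" "$dir"
    done
}

# Dry run: list the commands for the two directories named in the post.
distcp_commands /user /hbase
```

Once the printed commands look right, they could be executed by piping the output to `sh` on a node where the `hadoop` client is configured.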

    On 11 February 2013 16:47, Gordon Fogus wrote:

    Thank you, that was helpful. I started the JobTracker and MapReduce
    services and ran the request again and I no longer have the original error.

    I now have a NullPointerException with a "global counters are inaccurate"
    message. How do I correct these counters? What do they count?

    # hadoop distcp hdfs://nn01-qa.ny3:8020/ hdfs://medium01.ny3:8020/backup/
    13/02/11 16:45:35 INFO tools.DistCp: srcPaths=[hdfs://nn01-qa.ny3:8020/]
    13/02/11 16:45:35 INFO tools.DistCp: destPath=hdfs://medium01.ny3:8020/backup

    With failures, global counters are inaccurate; consider running with -i
    Copy failed: java.lang.NullPointerException
        at org.apache.hadoop.tools.DistCp.makeRelative(DistCp.java:925)
        at org.apache.hadoop.tools.DistCp.setup(DistCp.java:1112)
        at org.apache.hadoop.tools.DistCp.copy(DistCp.java:666)
        at org.apache.hadoop.tools.DistCp.run(DistCp.java:881)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
        at org.apache.hadoop.tools.DistCp.main(DistCp.java:908)


    On 10 February 2013 06:49, Harsh J wrote:

    Hi Gordon,

    DistCp is a MapReduce-based copying job, and it's looking to connect to
    8021 because that's the JobTracker port of the cluster where the job is
    to be submitted. Does this help clarify your situation and error?
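
One way to confirm which JobTracker address the client will try is to read the `mapred.job.tracker` property from mapred-site.xml, which is where MRv1 clusters such as CDH3 configure it. The sketch below is an illustration only: the helper name `jobtracker_address` is hypothetical, and the config directory /etc/hadoop/conf is an assumed default that may differ on a given install.

```shell
#!/bin/sh
# jobtracker_address (hypothetical helper): print the value of the
# mapred.job.tracker property from the mapred-site.xml file given as $1.
jobtracker_address() {
    awk '
        /<name>mapred.job.tracker<\/name>/ { found = 1 }
        found && /<value>/ {
            gsub(/.*<value>|<\/value>.*/, "")
            print
            exit
        }
    ' "$1"
}

# Assumed config location; adjust for the actual install.
conf="${HADOOP_CONF_DIR:-/etc/hadoop/conf}/mapred-site.xml"
if [ -r "$conf" ]; then
    jobtracker_address "$conf"
fi
```

If the printed host and port match the address in the "Retrying connect to server" lines, the next check is whether anything is listening there, for example with the same `netstat -an | grep 8021` used elsewhere in this thread, run on the JobTracker host.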

  • Gordon Fogus at Feb 11, 2013 at 10:31 pm
    Second, what is the performance of a distcp command? Say there are 4 DNs
    on the source side and 4 DNs on the destination side (with a replication
    factor of 1 on the destination side). Would rsync on an HDFS FUSE mount or
    distcp complete in less time? Or would both be bound to a single 1 Gbps
    link if the NN were connected at 1 Gbps?
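
As a rough way to reason about this question: HDFS clients exchange block data with the DataNodes directly and the NameNode serves only metadata, so the NameNode's own link is not normally in the data path. distcp runs as parallel map tasks, so its best case is bounded by the aggregate bandwidth between the two sets of DataNodes, whereas a single rsync process over a FUSE mount moves all data through the one host it runs on. The sketch below computes only an idealized lower bound on copy time. The helper name `min_transfer_seconds` and the 1000 GB data size are assumptions for illustration, and the calculation ignores protocol overhead, disk throughput, and any shared physical links (such as both clusters living on one ESXi host).

```shell
#!/bin/sh
# min_transfer_seconds (hypothetical helper): idealized lower bound on copy time.
#   $1 = data size in decimal gigabytes
#   $2 = speed of one link in Gbit/s
#   $3 = number of links that can carry data in parallel
min_transfer_seconds() {
    bits=$(( $1 * 8 ))          # gigabits to move
    capacity=$(( $2 * $3 ))     # aggregate Gbit/s
    echo $(( (bits + capacity - 1) / capacity ))   # round up to whole seconds
}

min_transfer_seconds 1000 1 1   # everything funneled through one 1 Gbit/s link
min_transfer_seconds 1000 1 4   # four 1 Gbit/s DataNode links used in parallel
```

Under these assumptions the first call gives 8000 seconds and the second 2000 seconds, which shows why the number of links actually carrying data matters more than the NameNode's connection.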


Discussion Overview
group: cdh-user
categories: hadoop
posted: Feb 10, '13 at 4:50a
active: Feb 11, '13 at 10:31p
posts: 5
users: 2 (Gordon Fogus: 4 posts, Harsh J: 1 post)
website: cloudera.com
irc: #hadoop
