Hi,

We have a cluster of 10 machines: one master (hostname: megh03) and nine
slaves (hostnames: meghXX).
The cluster is set up. Whenever I run a job, I get an error on one machine,
megh08. The error is pasted here:

[meghadmin@prashant hadoop-0.18.3]$ bin/hadoop jar
hadoop-0.18.3-examples.jar wordcount conf out6
10/03/26 22:40:14 INFO mapred.FileInputFormat: Total input paths to process
: 11
10/03/26 22:40:14 INFO mapred.FileInputFormat: Total input paths to process
: 11
10/03/26 22:40:15 INFO mapred.JobClient: Running job: job_201003262242_0004
10/03/26 22:40:16 INFO mapred.JobClient: map 0% reduce 0%
10/03/26 22:40:19 INFO mapred.JobClient: map 8% reduce 0%
10/03/26 22:40:20 INFO mapred.JobClient: map 25% reduce 0%
10/03/26 22:40:21 INFO mapred.JobClient: map 91% reduce 0%
10/03/26 22:40:26 INFO mapred.JobClient: map 91% reduce 2%
10/03/26 22:40:28 INFO mapred.JobClient: Task Id :
attempt_201003262242_0004_m_000006_0, Status : FAILED
*Error initializing attempt_201003262242_0004_m_000006_0:
java.net.ConnectException: Call to megh03/10.2.4.139:9000 failed on
connection exception: java.net.ConnectException: Connection refused*
at org.apache.hadoop.ipc.Client.wrapException(Client.java:743)
at org.apache.hadoop.ipc.Client.call(Client.java:719)
at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
at org.apache.hadoop.dfs.$Proxy5.getProtocolVersion(Unknown Source)
at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:348)
at org.apache.hadoop.dfs.DFSClient.createRPCNamenode(DFSClient.java:103)
at org.apache.hadoop.dfs.DFSClient.<init>(DFSClient.java:172)
at
org.apache.hadoop.dfs.DistributedFileSystem.initialize(DistributedFileSystem.java:67)
at
org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1339)
at org.apache.hadoop.fs.FileSystem.access$300(FileSystem.java:56)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1351)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:213)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:175)
at
org.apache.hadoop.mapred.TaskTracker.localizeJob(TaskTracker.java:638)
at
org.apache.hadoop.mapred.TaskTracker.startNewTask(TaskTracker.java:1297)
at
org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:937)
at org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:1334)
at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:2343)
Caused by: java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at
sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
at sun.nio.ch.SocketAdaptor.connect(SocketAdaptor.java:100)
at
org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:301)
at org.apache.hadoop.ipc.Client$Connection.access$1700(Client.java:178)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:820)
at org.apache.hadoop.ipc.Client.call(Client.java:705)
... 16 more

10/03/26 22:40:28 WARN mapred.JobClient: *Error reading task
outputhttp://megh08:50060/tasklog?plaintext=true&taskid=attempt_201003262242_0004_m_000006_0&filter=stdout
*
10/03/26 22:40:28 WARN mapred.JobClient: *Error reading task
outputhttp://megh08:50060/tasklog?plaintext=true&taskid=attempt_201003262242_0004_m_000006_0&filter=stderr
*
10/03/26 22:40:31 INFO mapred.JobClient: map 100% reduce 2%
10/03/26 22:40:36 INFO mapred.JobClient: Job complete: job_201003262242_0004
10/03/26 22:40:36 INFO mapred.JobClient: Counters: 17
10/03/26 22:40:36 INFO mapred.JobClient: File Systems
10/03/26 22:40:36 INFO mapred.JobClient: HDFS bytes read=48534
10/03/26 22:40:36 INFO mapred.JobClient: HDFS bytes written=26261
10/03/26 22:40:36 INFO mapred.JobClient: Local bytes read=32541
10/03/26 22:40:36 INFO mapred.JobClient: Local bytes written=70377
10/03/26 22:40:36 INFO mapred.JobClient: Job Counters
10/03/26 22:40:36 INFO mapred.JobClient: Launched reduce tasks=1
10/03/26 22:40:36 INFO mapred.JobClient: Rack-local map tasks=1
10/03/26 22:40:36 INFO mapred.JobClient: Launched map tasks=13
10/03/26 22:40:36 INFO mapred.JobClient: Data-local map tasks=11
10/03/26 22:40:36 INFO mapred.JobClient: Map-Reduce Framework
10/03/26 22:40:36 INFO mapred.JobClient: Reduce input groups=1521
10/03/26 22:40:36 INFO mapred.JobClient: Combine output records=3374
10/03/26 22:40:36 INFO mapred.JobClient: Map input records=1580
10/03/26 22:40:36 INFO mapred.JobClient: Reduce output records=1521
10/03/26 22:40:36 INFO mapred.JobClient: Map output bytes=63905
10/03/26 22:40:36 INFO mapred.JobClient: Map input bytes=47913
10/03/26 22:40:36 INFO mapred.JobClient: Combine input records=6498
10/03/26 22:40:36 INFO mapred.JobClient: Map output records=4645
10/03/26 22:40:36 INFO mapred.JobClient: Reduce input records=1521


Can anybody tell me what may be the problem here?


--
Thanks and Regards,
Prashant Ullegaddi,
Search and Information Extraction Lab,
IIIT-Hyderabad, India.

  • Karthik K at Mar 28, 2010 at 12:13 am
    You can start by grepping for the port number (9000) in the
    core/hdfs/mapred-site.xml files, and in any other app-specific configuration
    you might have loaded, to find out which process is supposed to be listening
    on 9000. And of course, check the firewall settings.
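
The configuration search suggested above could be sketched as follows. This is a minimal illustration, not part of the original reply: the function name `find_port_references` and the `conf` directory are assumptions, and it simply lists every `<property>` in the `*-site.xml` files whose value ends in the given port. In the stack trace the failing call is the NameNode RPC proxy (`DFSClient.createRPCNamenode`), so the matching property is probably `fs.default.name`, but that should be confirmed against the actual files.

```python
import glob
import os
import xml.etree.ElementTree as ET


def find_port_references(conf_dir, port):
    """List (file, property name, value) for properties whose value uses the port."""
    needle = ":" + str(port)
    hits = []
    # Hadoop 0.18 keeps site settings in hadoop-site.xml; later releases split
    # them into core-site.xml, hdfs-site.xml and mapred-site.xml.
    for path in sorted(glob.glob(os.path.join(conf_dir, "*-site.xml"))):
        root = ET.parse(path).getroot()
        for prop in root.iter("property"):
            name = (prop.findtext("name") or "").strip()
            value = (prop.findtext("value") or "").strip()
            # Match "host:9000" and "hdfs://host:9000/", but not "host:90001".
            if value.endswith(needle) or (needle + "/") in value:
                hits.append((os.path.basename(path), name, value))
    return hits


if __name__ == "__main__":
    # "conf" is an assumed path to the Hadoop configuration directory.
    for file_name, name, value in find_port_references("conf", 9000):
        print(file_name, name, value)
```

Running the same search on the master and on the failing slave makes it easy to see whether both nodes point at the same host and port.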

  • Prashant ullegaddi at Mar 28, 2010 at 5:00 pm
    Thank you.

    The culprit was missing host-to-IP mappings in the /etc/hosts file on megh08.

    --
    Thanks and Regards,
    Prashant Ullegaddi,
    Search and Information Extraction Lab,
    IIIT-Hyderabad, India.
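
The fix reported in this thread (missing host-to-IP mappings in /etc/hosts on megh08) can be checked with a short script. The sketch below is only an illustration under stated assumptions: `parse_hosts`, `missing_hosts` and `can_connect` are hypothetical helper names, and the address given for megh08 in the sample text is made up. Only megh03/10.2.4.139 and port 9000 come from the log above.

```python
import socket


def parse_hosts(text):
    """Map every hostname or alias in hosts-file text to its IP address."""
    mapping = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = line.split()
        ip, names = fields[0], fields[1:]
        for name in names:
            mapping.setdefault(name, ip)  # keep the first entry for a name
    return mapping


def missing_hosts(text, expected):
    """Return the expected hostnames that have no entry in the hosts text."""
    mapping = parse_hosts(text)
    return [name for name in expected if name not in mapping]


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers refused connections, timeouts, resolution errors
        return False


if __name__ == "__main__":
    # Sample hosts text; the megh08 address is hypothetical.
    sample = "127.0.0.1   localhost\n10.2.4.148  megh08\n"
    print(missing_hosts(sample, ["megh03", "megh08"]))
```

On a real node, the contents of /etc/hosts would be passed to `missing_hosts` together with the full list of cluster hostnames, and `can_connect("megh03", 9000)` would then confirm that the NameNode port is reachable from that node.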

Discussion Overview
Group: common-user @
Category: hadoop
Posted: Mar 26, '10 at 5:19p
Active: Mar 28, '10 at 5:00p
Posts: 3
Users: 2
Website: hadoop.apache.org...
IRC: #hadoop
