Hi,
I have a cluster of 4 Debian Squeeze machines; on all of them I
installed the same version (hadoop-0.20.2.tar.gz).
I have: n-0 as namenode, n-1 as jobtracker, and n-{0,1,2,3} as slaves.
You can see all my configs in more detail at
http://pastie.org/1754875
The machines have 3 GiB RAM each.
I don't think disk space is an issue anywhere, but FWIW:
/var/bomvl (which houses the HDFS data) is a 140 GB filesystem on
all nodes, and they all have 1.8 GB free on /.
So, I start the cluster on n-0:
$ bin/hadoop namenode -format
11/04/04 18:27:33 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = n-0.map-reduce.search.wall2.test/10.2.1.90
STARTUP_MSG: args = [-format]
STARTUP_MSG: version = 0.20.2
STARTUP_MSG: build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r 911707; compiled by 'chrisdo' on Fri Feb 19 08:07:34 UTC 2010
************************************************************/
11/04/04 18:27:33 INFO namenode.FSNamesystem: fsOwner=dplaetin,Search,root
11/04/04 18:27:33 INFO namenode.FSNamesystem: supergroup=supergroup
11/04/04 18:27:33 INFO namenode.FSNamesystem: isPermissionEnabled=true
11/04/04 18:27:33 INFO common.Storage: Image file of size 98 saved in 0 seconds.
11/04/04 18:27:34 INFO common.Storage: Storage directory /var/bomvl/name has been successfully formatted.
11/04/04 18:27:34 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at n-0.map-reduce.search.wall2.test/10.2.1.90
************************************************************/
$ bin/start-all.sh
starting namenode, logging to /var/log/hadoop/hadoop-dplaetin-namenode-n-0.map-reduce.search.wall2.test.out
n-0: starting datanode, logging to /var/log/hadoop/hadoop-dplaetin-datanode-n-0.map-reduce.search.wall2.test.out
n-2: starting datanode, logging to /var/log/hadoop/hadoop-dplaetin-datanode-n-2.map-reduce.search.wall2.test.out
n-3: starting datanode, logging to /var/log/hadoop/hadoop-dplaetin-datanode-n-3.map-reduce.search.wall2.test.out
n-1: starting datanode, logging to /var/log/hadoop/hadoop-dplaetin-datanode-n-1.map-reduce.search.wall2.test.out
localhost: starting secondarynamenode, logging to /var/log/hadoop/hadoop-dplaetin-secondarynamenode-n-0.map-reduce.search.wall2.test.out
starting jobtracker, logging to /var/log/hadoop/hadoop-dplaetin-jobtracker-n-0.map-reduce.search.wall2.test.out
n-2: starting tasktracker, logging to /var/log/hadoop/hadoop-dplaetin-tasktracker-n-2.map-reduce.search.wall2.test.out
n-3: starting tasktracker, logging to /var/log/hadoop/hadoop-dplaetin-tasktracker-n-3.map-reduce.search.wall2.test.out
n-0: starting tasktracker, logging to /var/log/hadoop/hadoop-dplaetin-tasktracker-n-0.map-reduce.search.wall2.test.out
n-1: starting tasktracker, logging to /var/log/hadoop/hadoop-dplaetin-tasktracker-n-1.map-reduce.search.wall2.test.out
However, in n-0:/var/log/hadoop/hadoop-dplaetin-namenode-n-0.map-reduce.search.wall2.test.log,
I see this:
2011-04-04 18:27:34,760 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = n-0.map-reduce.search.wall2.test/10.2.1.90
STARTUP_MSG: args = []
STARTUP_MSG: version = 0.20.2
STARTUP_MSG: build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r 911707; compiled by 'chrisdo' on Fri Feb 19 08:07:34 UTC 2010
************************************************************/
2011-04-04 18:27:34,846 INFO org.apache.hadoop.ipc.metrics.RpcMetrics: Initializing RPC Metrics with hostName=NameNode, port=9000
2011-04-04 18:27:34,851 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: Namenode up at: n-0-lan0/10.1.1.2:9000
2011-04-04 18:27:34,853 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=NameNode, sessionId=null
2011-04-04 18:27:34,854 INFO org.apache.hadoop.hdfs.server.namenode.metrics.NameNodeMetrics: Initializing NameNodeMeterics using context object:org.apache.hadoop.metrics.spi.NullContext
2011-04-04 18:27:34,893 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: fsOwner=dplaetin,Search,root
2011-04-04 18:27:34,893 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: supergroup=supergroup
2011-04-04 18:27:34,893 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: isPermissionEnabled=true
2011-04-04 18:27:34,899 INFO org.apache.hadoop.hdfs.server.namenode.metrics.FSNamesystemMetrics: Initializing FSNamesystemMetrics using context object:org.apache.hadoop.metrics.spi.NullContext
2011-04-04 18:27:34,900 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Registered FSNamesystemStatusMBean
2011-04-04 18:27:34,927 INFO org.apache.hadoop.hdfs.server.common.Storage: Number of files = 1
2011-04-04 18:27:34,931 INFO org.apache.hadoop.hdfs.server.common.Storage: Number of files under construction = 0
2011-04-04 18:27:34,931 INFO org.apache.hadoop.hdfs.server.common.Storage: Image file of size 98 loaded in 0 seconds.
2011-04-04 18:27:34,932 INFO org.apache.hadoop.hdfs.server.common.Storage: Edits file /var/bomvl/name/current/edits of size 4 edits # 0 loaded in 0 seconds.
2011-04-04 18:27:34,971 INFO org.apache.hadoop.hdfs.server.common.Storage: Image file of size 98 saved in 0 seconds.
2011-04-04 18:27:35,146 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Finished loading FSImage in 263 msecs
2011-04-04 18:27:35,147 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Total number of blocks = 0
2011-04-04 18:27:35,147 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Number of invalid blocks = 0
2011-04-04 18:27:35,147 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Number of under-replicated blocks = 0
2011-04-04 18:27:35,147 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Number of over-replicated blocks = 0
2011-04-04 18:27:35,148 INFO org.apache.hadoop.hdfs.StateChange: STATE* Leaving safe mode after 0 secs.
2011-04-04 18:27:35,148 INFO org.apache.hadoop.hdfs.StateChange: STATE* Network topology has 0 racks and 0 datanodes
2011-04-04 18:27:35,148 INFO org.apache.hadoop.hdfs.StateChange: STATE* UnderReplicatedBlocks has 0 blocks
2011-04-04 18:27:40,247 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
2011-04-04 18:27:40,299 INFO org.apache.hadoop.http.HttpServer: Port returned by webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening the listener on 50070
2011-04-04 18:27:40,300 INFO org.apache.hadoop.http.HttpServer: listener.getLocalPort() returned 50070 webServer.getConnectors()[0].getLocalPort() returned 50070
2011-04-04 18:27:40,300 INFO org.apache.hadoop.http.HttpServer: Jetty bound to port 50070
2011-04-04 18:27:40,300 INFO org.mortbay.log: jetty-6.1.14
2011-04-04 18:27:40,518 INFO org.mortbay.log: Started SelectChannelConnector@0.0.0.0:50070
2011-04-04 18:27:40,518 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: Web-server up at: 0.0.0.0:50070
2011-04-04 18:27:40,521 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
2011-04-04 18:27:40,521 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 9000: starting
2011-04-04 18:27:40,522 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 9000: starting
2011-04-04 18:27:40,522 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 9000: starting
2011-04-04 18:27:40,522 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 9000: starting
2011-04-04 18:27:40,526 INFO org.apache.hadoop.ipc.Server: IPC Server handler 3 on 9000: starting
2011-04-04 18:27:40,526 INFO org.apache.hadoop.ipc.Server: IPC Server handler 4 on 9000: starting
2011-04-04 18:27:40,526 INFO org.apache.hadoop.ipc.Server: IPC Server handler 5 on 9000: starting
2011-04-04 18:27:40,526 INFO org.apache.hadoop.ipc.Server: IPC Server handler 6 on 9000: starting
2011-04-04 18:27:40,527 INFO org.apache.hadoop.ipc.Server: IPC Server handler 8 on 9000: starting
2011-04-04 18:27:40,528 INFO org.apache.hadoop.ipc.Server: IPC Server handler 9 on 9000: starting
2011-04-04 18:27:40,527 INFO org.apache.hadoop.ipc.Server: IPC Server handler 7 on 9000: starting
2011-04-04 18:27:40,550 INFO org.apache.hadoop.ipc.Server: Error register getProtocolVersion
java.lang.IllegalArgumentException: Duplicate metricsName:getProtocolVersion
at org.apache.hadoop.metrics.util.MetricsRegistry.add(MetricsRegistry.java:53)
at org.apache.hadoop.metrics.util.MetricsTimeVaryingRate.<init>(MetricsTimeVaryingRate.java:99)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:523)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)
2011-04-04 18:27:40,550 INFO org.apache.hadoop.ipc.Server: Error register getProtocolVersion
java.lang.IllegalArgumentException: Duplicate metricsName:getProtocolVersion
at org.apache.hadoop.metrics.util.MetricsRegistry.add(MetricsRegistry.java:53)
at org.apache.hadoop.metrics.util.MetricsTimeVaryingRate.<init>(MetricsTimeVaryingRate.java:99)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:523)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)
2011-04-04 18:27:40,550 INFO org.apache.hadoop.ipc.Server: Error register getProtocolVersion
java.lang.IllegalArgumentException: Duplicate metricsName:getProtocolVersion
at org.apache.hadoop.metrics.util.MetricsRegistry.add(MetricsRegistry.java:53)
at org.apache.hadoop.metrics.util.MetricsTimeVaryingRate.<init>(MetricsTimeVaryingRate.java:99)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:523)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)
2011-04-04 18:29:20,803 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.registerDatanode: node registration from 10.1.1.5:50010 storage DS-1679058086-10.2.0.80-50010-1301934560735
2011-04-04 18:29:20,806 INFO org.apache.hadoop.net.NetworkTopology: Adding a new node: /default-rack/10.1.1.5:50010
2011-04-04 18:29:32,077 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.registerDatanode: node registration from 10.1.1.3:50010 storage DS-195369122-10.2.1.91-50010-1301934572032
2011-04-04 18:29:32,077 INFO org.apache.hadoop.net.NetworkTopology: Adding a new node: /default-rack/10.1.1.3:50010
2011-04-04 18:29:39,321 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.registerDatanode: node registration from 10.1.1.4:50010 storage DS-564465257-10.2.1.79-50010-1301934579273
2011-04-04 18:29:39,321 INFO org.apache.hadoop.net.NetworkTopology: Adding a new node: /default-rack/10.1.1.4:50010
2011-04-04 18:30:58,007 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.registerDatanode: node registration from 10.1.1.2:50010 storage DS-166622494-10.2.1.90-50010-1301934658005
2011-04-04 18:30:58,007 INFO org.apache.hadoop.net.NetworkTopology: Adding a new node: /default-rack/10.1.1.2:50010
2011-04-04 18:33:13,788 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Roll Edit Log from 10.1.1.2
2011-04-04 18:33:13,788 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Number of transactions: 0 Total time for transactions(ms): 0Number of transactions batched in Syncs: 0 Number of syncs: 0 SyncTimes(ms): 0
2011-04-04 18:33:14,091 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Roll FSImage from 10.1.1.2
2011-04-04 18:33:14,092 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Number of transactions: 0 Total time for transactions(ms): 0Number of transactions batched in Syncs: 0 Number of syncs: 1 SyncTimes(ms): 33
Does anyone know what the issue could be? The suggestions I found in other threads were "check disk space" and "make sure the versions are the same",
but I'm good on the former, and the latter isn't an issue here.
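For what it's worth, this is roughly how I re-verified both on each node (a sketch; it checks the local node only, so repeat it on n-0 through n-3, e.g. over ssh, and compare; it also assumes hadoop is on the PATH there):

```shell
# Free space on the HDFS data partition (falls back to / if /var/bomvl
# doesn't exist on this machine), plus the installed Hadoop build.
df -Pk /var/bomvl 2>/dev/null || df -Pk /
if command -v hadoop >/dev/null 2>&1; then
  hadoop version | head -n 1
else
  echo "hadoop not on PATH on this node"
fi
```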
When I try to put a file on HDFS, I get:
2011-04-04 17:10:28,306 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 9000, call addBlock(/tmp/hadoop-dplaetin/mapred/system/jobtracker.info, DFSClient_371874060) from 10.1.1.3:47491: error: java.io.IOException: File /tmp/hadoop-dplaetin/mapred/system/jobtracker.info could only be replicated to 0 nodes, instead of 1
java.io.IOException: File /tmp/hadoop-dplaetin/mapred/system/jobtracker.info could only be replicated to 0 nodes, instead of 1
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1271)
at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:422)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)
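I also checked how many datanodes the namenode itself considers live, since from what I've read, "could only be replicated to 0 nodes" usually means the namenode sees no live datanodes even when the datanode daemons are up (a sketch; assumes hadoop is on the PATH and is run on n-0):

```shell
# Ask the namenode for its view of the datanodes; in 0.20 this is
# "bin/hadoop dfsadmin -report". The "Datanodes available" line and the
# per-node "Name:" lines show which addresses the namenode registered.
if command -v hadoop >/dev/null 2>&1; then
  hadoop dfsadmin -report | grep -E 'Datanodes available|^Name:'
else
  echo "run bin/hadoop dfsadmin -report on the namenode (n-0)"
fi
```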
On n-1, hadoop-dplaetin-datanode-n-1.map-reduce.search.wall2.test.log looks good:
$ cat hadoop-dplaetin-datanode-n-1.map-reduce.search.wall2.test.log
2011-04-04 18:27:35,886 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting DataNode
STARTUP_MSG: host = n-1.map-reduce.search.wall2.test/10.2.1.91
STARTUP_MSG: args = []
STARTUP_MSG: version = 0.20.2
STARTUP_MSG: build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r 911707; compiled by 'chrisdo' on Fri Feb 19 08:07:34 UTC 2010
************************************************************/
2011-04-04 18:27:40,535 INFO org.apache.hadoop.hdfs.server.common.Storage: Storage directory /var/bomvl/data is not formatted.
2011-04-04 18:27:40,535 INFO org.apache.hadoop.hdfs.server.common.Storage: Formatting ...
2011-04-04 18:27:40,828 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Registered FSDatasetStatusMBean
2011-04-04 18:27:40,829 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Opened info server at 50010
2011-04-04 18:27:40,831 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Balancing bandwith is 1048576 bytes/s
2011-04-04 18:27:45,876 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
2011-04-04 18:27:45,940 INFO org.apache.hadoop.http.HttpServer: Port returned by webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening the listener on 50075
2011-04-04 18:27:45,940 INFO org.apache.hadoop.http.HttpServer: listener.getLocalPort() returned 50075 webServer.getConnectors()[0].getLocalPort() returned 50075
2011-04-04 18:27:45,940 INFO org.apache.hadoop.http.HttpServer: Jetty bound to port 50075
2011-04-04 18:27:45,940 INFO org.mortbay.log: jetty-6.1.14
2011-04-04 18:29:27,006 INFO org.mortbay.log: Started SelectChannelConnector@0.0.0.0:50075
2011-04-04 18:29:27,011 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=DataNode, sessionId=null
2011-04-04 18:29:32,026 INFO org.apache.hadoop.ipc.metrics.RpcMetrics: Initializing RPC Metrics with hostName=DataNode, port=50020
2011-04-04 18:29:32,029 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
2011-04-04 18:29:32,029 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 50020: starting
2011-04-04 18:29:32,031 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 50020: starting
2011-04-04 18:29:32,031 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 50020: starting
2011-04-04 18:29:32,031 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 50020: starting
2011-04-04 18:29:32,031 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: dnRegistration = DatanodeRegistration(n144-17a.wall2.test:50010, storageID=, infoPort=50075, ipcPort=50020)
2011-04-04 18:29:32,061 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: New storage id DS-195369122-10.2.1.91-50010-1301934572032 is assigned to data-node 10.1.1.3:50010
2011-04-04 18:29:32,061 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(10.1.1.3:50010, storageID=DS-195369122-10.2.1.91-50010-1301934572032, infoPort=50075, ipcPort=50020)In DataNode.run, data = FSDataset{dirpath='/var/bomvl/data/current'}
2011-04-04 18:29:32,062 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: using BLOCKREPORT_INTERVAL of 3600000msec Initial delay: 0msec
2011-04-04 18:29:32,070 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: BlockReport of 0 blocks got processed in 2 msecs
2011-04-04 18:29:32,071 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Starting Periodic block scanner.
However, this doesn't look right:
$ head -n 40 hadoop-dplaetin-tasktracker-n-1.map-reduce.search.wall2.test.log
2011-04-04 18:27:39,041 INFO org.apache.hadoop.mapred.TaskTracker: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting TaskTracker
STARTUP_MSG: host = n-1.map-reduce.search.wall2.test/10.2.1.91
STARTUP_MSG: args = []
STARTUP_MSG: version = 0.20.2
STARTUP_MSG: build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r 911707; compiled by 'chrisdo' on Fri Feb 19 08:07:34 UTC 2010
************************************************************/
2011-04-04 18:27:44,176 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
2011-04-04 18:27:44,263 INFO org.apache.hadoop.http.HttpServer: Port returned by webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening the listener on 50060
2011-04-04 18:27:44,269 INFO org.apache.hadoop.http.HttpServer: listener.getLocalPort() returned 50060 webServer.getConnectors()[0].getLocalPort() returned 50060
2011-04-04 18:27:44,269 INFO org.apache.hadoop.http.HttpServer: Jetty bound to port 50060
2011-04-04 18:27:44,269 INFO org.mortbay.log: jetty-6.1.14
2011-04-04 18:28:05,858 INFO org.mortbay.log: Started SelectChannelConnector@0.0.0.0:50060
2011-04-04 18:28:05,863 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=TaskTracker, sessionId=
2011-04-04 18:28:05,872 INFO org.apache.hadoop.ipc.metrics.RpcMetrics: Initializing RPC Metrics with hostName=TaskTracker, port=42914
2011-04-04 18:28:05,908 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
2011-04-04 18:28:05,908 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 42914: starting
2011-04-04 18:28:05,909 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 42914: starting
2011-04-04 18:28:05,909 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 42914: starting
2011-04-04 18:28:05,909 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 42914: starting
2011-04-04 18:28:05,910 INFO org.apache.hadoop.ipc.Server: IPC Server handler 3 on 42914: starting
2011-04-04 18:28:05,910 INFO org.apache.hadoop.ipc.Server: IPC Server handler 4 on 42914: starting
2011-04-04 18:28:05,910 INFO org.apache.hadoop.ipc.Server: IPC Server handler 5 on 42914: starting
2011-04-04 18:28:05,910 INFO org.apache.hadoop.ipc.Server: IPC Server handler 6 on 42914: starting
2011-04-04 18:28:05,910 INFO org.apache.hadoop.ipc.Server: IPC Server handler 7 on 42914: starting
2011-04-04 18:28:05,910 INFO org.apache.hadoop.mapred.TaskTracker: TaskTracker up at: localhost/127.0.0.1:42914
2011-04-04 18:28:05,910 INFO org.apache.hadoop.mapred.TaskTracker: Starting tracker tracker_n144-17a.wall2.test:localhost/127.0.0.1:42914
2011-04-04 18:28:06,933 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: n-1/10.1.1.3:9001. Already tried 0 time(s).
2011-04-04 18:28:07,934 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: n-1/10.1.1.3:9001. Already tried 1 time(s).
2011-04-04 18:28:08,934 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: n-1/10.1.1.3:9001. Already tried 2 time(s).
2011-04-04 18:28:09,935 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: n-1/10.1.1.3:9001. Already tried 3 time(s).
2011-04-04 18:28:10,935 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: n-1/10.1.1.3:9001. Already tried 4 time(s).
2011-04-04 18:28:11,935 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: n-1/10.1.1.3:9001. Already tried 5 time(s).
2011-04-04 18:28:12,936 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: n-1/10.1.1.3:9001. Already tried 6 time(s).
2011-04-04 18:28:13,936 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: n-1/10.1.1.3:9001. Already tried 7 time(s).
2011-04-04 18:28:14,937 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: n-1/10.1.1.3:9001. Already tried 8 time(s).
2011-04-04 18:28:15,937 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: n-1/10.1.1.3:9001. Already tried 9 time(s).
2011-04-04 18:28:15,938 INFO org.apache.hadoop.ipc.RPC: Server at n-1/10.1.1.3:9001 not available yet, Zzzzz...
2011-04-04 18:28:17,939 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: n-1/10.1.1.3:9001. Already tried 0 time(s).
Indeed, the jobtracker does not seem to be running (although that's probably a separate problem):
$ sudo netstat -n -l -p | grep 9001
$
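For completeness, this is how I list which Hadoop daemons are actually up on a node (a sketch; assumes the JDK's jps tool is on the PATH; on n-1 I'd expect to see DataNode, TaskTracker and, per my config, JobTracker):

```shell
# jps lists running Java VMs by main class; Hadoop daemons show up as
# NameNode, DataNode, SecondaryNameNode, JobTracker, TaskTracker.
# Filter out jps's own entry so an empty result means "no daemons".
if command -v jps >/dev/null 2>&1; then
  jps | grep -v ' Jps$' || echo "no Hadoop daemons running on this node"
else
  echo "jps not on PATH (it ships with the JDK)"
fi
```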
Thanks for any help,
Dieter