On a cluster running 0.19.2:

We have some production jobs that perform ETL tasks and open files in HDFS during the reduce phase (with speculative execution for reduces programmatically turned off). Since upgrading the cluster from 0.19.1, we've been seeing some odd behavior: we experience timeouts on block/file creation that are long enough that the reduce attempt gets killed. Subsequent reduce attempts then fail because, according to the exception that bubbles up, the first (killed) attempt is still recorded (by the namenode, I assume) as creating the block/file. I didn't see anything like this in JIRA, and I'm trying to grab a few jstacks from the namenode when these errors pop up (usually correlated with a somewhat busy cluster) in an effort to get some idea of what is going on here.
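For context, the reduce-side writes follow roughly the sketch below (the path and helper name are made up for illustration, not our actual code). As I read the traces, the stream returned by this create() is the one whose DFSClient DataStreamer hits the 69000 ms read timeout, and the lease the namenode keeps for the killed attempt under DFSClient_<attempt id> is what the retried attempt then trips over:

    import java.io.IOException;

    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapred.JobConf;

    // Illustrative sketch only -- the real job's paths and logic differ.
    public class SideFileWriter {
        public static FSDataOutputStream openSideFile(JobConf job, String name)
                throws IOException {
            FileSystem fs = FileSystem.get(job);
            // The write pipeline behind this stream is where the 69000 ms
            // SocketTimeoutException shows up; if the attempt is killed
            // afterwards, the namenode still holds the lease for this path,
            // so a later attempt creating the same path gets
            // AlreadyBeingCreatedException.
            Path out = new Path("/etl/output/" + name);  // hypothetical path
            return fs.create(out);
        }
    }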
Currently the cluster is small: about 5 data nodes and 10s of TBs, with 2x the namespace files easily fitting in memory. I don't see any process eating more than a couple percent of CPU on the namenode box (which also hosts the secondary NN), and iostat shows only 100-200 blocks read/written every other second on this host, leaving plenty of headroom there. The cluster is scheduled to grow in the near future, which may worsen this hang/blocking if it's due to a bottleneck.
Before I start tracing through the code, I thought I might ask whether anyone has seen anything like the excerpts from the jobtracker logs below? Is there a way to guarantee that all in-flight processes for a given reduce attempt will be terminated (and any associated network connections sent a reset or something) before a new attempt of that task is started?
On a kind of side thought: is the task attempt name in the JobConf that is handed to the reducer in configure(), and if so, what might the setting name be to get at it? Or does one need to go through a more circuitous route to obtain the TaskAttemptID associated with the attempt?
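For what it's worth, this is what I was planning to try; the property name "mapred.task.id" is my assumption and is unverified against 0.19.2:

    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.MapReduceBase;

    // Sketch only -- property name not yet verified on our cluster.
    public class AttemptAwareReducer extends MapReduceBase /* implements Reducer<...> */ {
        private String attemptId;

        @Override
        public void configure(JobConf job) {
            // Expected to hold the full attempt name, e.g.
            // "attempt_200909231347_0694_r_000002_0", matching the JT logs.
            attemptId = job.get("mapred.task.id");
        }
    }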
Back to the point at hand, from the jobtracker logs:
Failing initial reduce:
----------------------------
2009-09-27 22:24:25,056 INFO org.apache.hadoop.mapred.TaskInProgress: Error from attempt_200909231347_0694_r_000002_0: java.net.SocketTimeoutException: 69000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/X.X.X.2:47440 remote=/X.X.X.2:50010]
    at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:162)
    at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:150)
    at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:123)
    at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:116)
    at java.io.DataInputStream.readByte(DataInputStream.java:248)
    at org.apache.hadoop.io.WritableUtils.readVLong(WritableUtils.java:325)
    at org.apache.hadoop.io.WritableUtils.readVInt(WritableUtils.java:346)
    at org.apache.hadoop.io.Text.readString(Text.java:400)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.createBlockOutputStream(DFSClient.java:2787)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2712)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:1996)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2182)
Failing second reduce:
-------------------------------
2009-09-27 22:53:22,048 INFO org.apache.hadoop.mapred.TaskInProgress: Error from attempt_200909231347_0694_r_000002_3: org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException: failed to create file >blah< for DFSClient_attempt_200909231347_0694_r_000002_3 on client X.X.X.7, because this file is already being created by DFSClient_attempt_200909231347_0694_r_000002_0 on X.X.X.2
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:1085)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:998)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.create(NameNode.java:301)
    at sun.reflect.GeneratedMethodAccessor13.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:481)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:894)

    at org.apache.hadoop.ipc.Client.call(Client.java:697)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
    at $Proxy1.create(Unknown Source)
    at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
    at $Proxy1.create(Unknown Source)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.<init>(DFSClient.java:454)
    at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:188)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:487)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:468)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:375)
Many thanks...
dave bayer