FAQ

On Thu, 28 Oct 2010 14:22:17 +0200 Erik Forsberg wrote:

*) The job is writing its output to MultipleSequenceFileOutputFormat,
could that be related?
Still having this problem. I'm kind of leaning towards some kind of
race condition in the writing of MultipleSequenceFileOutputFormat
files, since one day there's one failure in total (I run 120
reducers), while another day there are ~65 reducer failures.
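For what it's worth, the failure shape is consistent with two writers sharing one temporary block file: whichever finalizes (renames) it first wins, and the second finalize then finds no temp file. Below is a minimal illustration of that shape only — the paths and names are invented for the demo, and this is an analogy, not HDFS's actual finalizeBlock code:

```python
import os
import tempfile

# Hypothetical demo: two writers racing to finalize the SAME temp block file.
# Whoever renames first succeeds; the loser gets "file not found", which is
# the same shape as the "No temporary file" datanode error in the logs below.
workdir = tempfile.mkdtemp()
tmp_block = os.path.join(workdir, "blk_123.tmp")   # shared temp file (made up)
final_block = os.path.join(workdir, "blk_123")     # finalized location

open(tmp_block, "w").close()        # writer A creates the temp block file
os.rename(tmp_block, final_block)   # writer A finalizes: tmp -> final

try:
    os.rename(tmp_block, final_block)  # writer B finalizes the same tmp file
except OSError as exc:
    print("second finalize failed:", exc)
```

If the output format ever hands two task attempts the same output path, this is the kind of collision you would expect to see.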

Has anyone else experienced this? Any hints on how to debug?
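One way to start digging, sketched below with an in-memory stand-in for the real log files (actual log locations vary per installation): pull every namenode/datanode entry for the failing block, matching on the block id without its generation stamp, since block recovery assigns a new stamp and would otherwise hide the later entries.

```python
# Sketch: trace one block across collected daemon logs. The sample text and
# entries are made up for the demo; in practice, read the real log files.
sample_log = """\
2010-10-28 10:31:37,412 INFO DataNode: Receiving block blk_1784371483411986483_5140381 src: /10.20.11.89:54906
2010-10-28 10:43:12,422 INFO DataNode: newblock=blk_1784371483411986483_5141711(length=99916481)
2010-10-28 10:43:12,500 INFO DataNode: unrelated block blk_999_1
"""

# Generation stamp deliberately omitted: recovery renames blk_X_N to blk_X_M,
# and matching on the bare id keeps the post-recovery entries visible.
block_id = "blk_1784371483411986483"
entries = [line for line in sample_log.splitlines() if block_id in line]
entries.sort()  # timestamps at line start sort chronologically as strings
for line in entries:
    print(line)
```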

Regards,
\EF

I see two different types of errors. The first seems to trigger
java.io.IOException: No temporary
file /opera/log1/hadoop/dfs/data/tmp/<file> on the datanodes, while
the second seems to trigger java.io.IOException: Meta file not found,
blockFile=/opera/log4/hadoop/dfs/data/tmp

Thanks,
\EF

Here's an example of the first type, following the logs for the
block. Reducer log (job ran on tasktracker on 10.20.11.89):

2010-10-28 10:43:12,392 WARN org.apache.hadoop.hdfs.DFSClient: DFSOutputStream ResponseProcessor exception for block blk_1784371483411986483_5140381 java.io.EOFException
at java.io.DataInputStream.readFully(DataInputStream.java:180)
at java.io.DataInputStream.readLong(DataInputStream.java:399)
at org.apache.hadoop.hdfs.protocol.DataTransferProtocol$PipelineAck.readFields(DataTransferProtocol.java:119)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$ResponseProcessor.run(DFSClient.java:2410)

2010-10-28 10:43:12,395 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_1784371483411986483_5140381 bad datanode[0] 10.20.11.89:50010
2010-10-28 10:43:12,395 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_1784371483411986483_5140381 in pipeline 10.20.11.89:50010, 10.20.11.65:50010, 10.20.11.56:50010: bad datanode 10.20.11.89:50010

Namenode log for block:

2010-10-28 10:31:37,431 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.allocateBlock: /outputdir/daily=20101027/part-00057. blk_1784371483411986483_5140381
2010-10-28 10:43:12,409 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.addStoredBlock: blockMap updated: 10.20.11.56:50010 is added to blk_1784371483411986483_5140381 size 99916481
2010-10-28 10:43:12,409 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.addStoredBlock: blockMap updated: 10.20.11.65:50010 is added to blk_1784371483411986483_5140381 size 99916481
2010-10-28 10:43:12,458 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: commitBlockSynchronization(lastblock=blk_1784371483411986483_5140381, newgenerationstamp=5141711, newlength=99916481, newtargets=[10.20.11.65:50010, 10.20.11.56:50010], closeFile=false, deleteBlock=false)

Datanode log on 10.20.11.89:

2010-10-28 10:31:37,412 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block blk_1784371483411986483_5140381 src: /10.20.11.89:54906 dest: /10.20.11.89:50010
java.io.IOException: No temporary file /opera/log1/hadoop/dfs/data/tmp/blk_1784371483411986483 for block blk_1784371483411986483_5140381
at org.apache.hadoop.hdfs.server.datanode.FSDataset.finalizeBlock(FSDataset.java:1240)
at org.apache.hadoop.hdfs.server.datanode.BlockReceiver$PacketResponder.run(BlockReceiver.java:955)
at java.lang.Thread.run(Thread.java:619)
2010-10-28 10:43:12,390 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 2 for block blk_1784371483411986483_5140381 terminating

Datanode log on 10.20.11.65 has no mention of the block. Datanode log
on 10.20.11.56:

2010-10-28 10:31:37,443 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block blk_1784371483411986483_5140381 src: /10.20.11.65:58194 dest: /10.20.11.56:50010
2010-10-28 10:43:12,373 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.20.11.65:58194, dest: /10.20.11.56:50010, bytes: 99916481, op: HDFS_WRITE, cliID: DFSClient_attempt_201010251211_4053_r_000057_1, offset: 0, srvID: DS-869973382-10.20.11.56-50010-1254491292131, blockid: blk_1784371483411986483_5140381, duration: 694916828459
2010-10-28 10:43:12,373 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 0 for block blk_1784371483411986483_5140381 terminating
2010-10-28 10:43:12,408 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Client calls recoverBlock(block=blk_1784371483411986483_5140381, targets=[10.20.11.65:50010, 10.20.11.56:50010])
2010-10-28 10:43:12,422 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: oldblock=blk_1784371483411986483_5140381(length=99916481), newblock=blk_1784371483411986483_5141711(length=99916481), datanode=10.20.11.56:50010


And then there's the second type, which starts in the same reducer
log as:

2010-10-28 10:43:14,138 WARN org.apache.hadoop.hdfs.DFSClient: DFSOutputStream ResponseProcessor exception for block blk_4986602050098481381_5140430 java.io.IOException: Bad response 1 for block blk_4986602050098481381_5140430 from datanode 10.20.11.86:50010
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$ResponseProcessor.run(DFSClient.java:2440)

2010-10-28 10:43:14,139 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_4986602050098481381_5140430 bad datanode[2] 10.20.11.86:50010
2010-10-28 10:43:14,139 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_4986602050098481381_5140430 in pipeline 10.20.11.89:50010, 10.20.11.88:50010, 10.20.11.86:50010: bad datanode 10.20.11.86:50010
2010-10-28 10:43:14,149 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_4986602050098481381_5140430 failed because recovery from primary datanode 10.20.11.88:50010 failed 1 times. Pipeline was 10.20.11.89:50010, 10.20.11.88:50010, 10.20.11.86:50010. Will retry...
2010-10-28 10:43:15,151 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_4986602050098481381_5140430 bad datanode[2] 10.20.11.86:50010
2010-10-28 10:43:15,151 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_4986602050098481381_5140430 in pipeline 10.20.11.89:50010, 10.20.11.88:50010, 10.20.11.86:50010: bad datanode 10.20.11.86:50010
2010-10-28 10:43:15,164 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_4986602050098481381_5140430 failed because recovery from primary datanode 10.20.11.88:50010 failed 2 times. Pipeline was 10.20.11.89:50010, 10.20.11.88:50010, 10.20.11.86:50010. Will retry...
2010-10-28 10:43:16,164 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_4986602050098481381_5140430 bad datanode[2] 10.20.11.86:50010
2010-10-28 10:43:16,164 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_4986602050098481381_5140430 in pipeline 10.20.11.89:50010, 10.20.11.88:50010, 10.20.11.86:50010: bad datanode 10.20.11.86:50010
2010-10-28 10:43:16,177 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_4986602050098481381_5140430 failed because recovery from primary datanode 10.20.11.88:50010 failed 3 times. Pipeline was 10.20.11.89:50010, 10.20.11.88:50010, 10.20.11.86:50010. Will retry...
2010-10-28 10:43:17,177 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_4986602050098481381_5140430 bad datanode[2] 10.20.11.86:50010
2010-10-28 10:43:17,177 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_4986602050098481381_5140430 in pipeline 10.20.11.89:50010, 10.20.11.88:50010, 10.20.11.86:50010: bad datanode 10.20.11.86:50010
2010-10-28 10:43:17,192 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_4986602050098481381_5140430 failed because recovery from primary datanode 10.20.11.88:50010 failed 4 times. Pipeline was 10.20.11.89:50010, 10.20.11.88:50010, 10.20.11.86:50010. Will retry...
2010-10-28 10:43:18,192 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_4986602050098481381_5140430 bad datanode[2] 10.20.11.86:50010
2010-10-28 10:43:18,192 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_4986602050098481381_5140430 in pipeline 10.20.11.89:50010, 10.20.11.88:50010, 10.20.11.86:50010: bad datanode 10.20.11.86:50010
2010-10-28 10:43:18,197 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_4986602050098481381_5140430 failed because recovery from primary datanode 10.20.11.88:50010 failed 5 times. Pipeline was 10.20.11.89:50010, 10.20.11.88:50010, 10.20.11.86:50010. Will retry...
2010-10-28 10:43:19,197 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_4986602050098481381_5140430 bad datanode[2] 10.20.11.86:50010
2010-10-28 10:43:19,198 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_4986602050098481381_5140430 in pipeline 10.20.11.89:50010, 10.20.11.88:50010, 10.20.11.86:50010: bad datanode 10.20.11.86:50010
2010-10-28 10:43:19,203 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_4986602050098481381_5140430 failed because recovery from primary datanode 10.20.11.88:50010 failed 6 times. Pipeline was 10.20.11.89:50010, 10.20.11.88:50010, 10.20.11.86:50010. Marking primary datanode as bad.
2010-10-28 10:43:20,213 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_4986602050098481381_5140430 failed because recovery from primary datanode 10.20.11.86:50010 failed 1 times. Pipeline was 10.20.11.89:50010, 10.20.11.86:50010. Will retry...
2010-10-28 10:43:21,220 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_4986602050098481381_5140430 failed because recovery from primary datanode 10.20.11.86:50010 failed 2 times. Pipeline was 10.20.11.89:50010, 10.20.11.86:50010. Will retry...
2010-10-28 10:43:22,228 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_4986602050098481381_5140430 failed because recovery from primary datanode 10.20.11.86:50010 failed 3 times. Pipeline was 10.20.11.89:50010, 10.20.11.86:50010. Will retry...
2010-10-28 10:43:23,237 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_4986602050098481381_5140430 failed because recovery from primary datanode 10.20.11.86:50010 failed 4 times. Pipeline was 10.20.11.89:50010, 10.20.11.86:50010. Will retry...
2010-10-28 10:43:24,243 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_4986602050098481381_5140430 failed because recovery from primary datanode 10.20.11.86:50010 failed 5 times. Pipeline was 10.20.11.89:50010, 10.20.11.86:50010. Will retry...
2010-10-28 10:43:25,253 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_4986602050098481381_5140430 failed because recovery from primary datanode 10.20.11.86:50010 failed 6 times. Pipeline was 10.20.11.89:50010, 10.20.11.86:50010. Marking primary datanode as bad.
2010-10-28 10:43:26,262 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_4986602050098481381_5140430 failed because recovery from primary datanode 10.20.11.89:50010 failed 1 times. Pipeline was 10.20.11.89:50010. Will retry...
2010-10-28 10:43:27,268 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_4986602050098481381_5140430 failed because recovery from primary datanode 10.20.11.89:50010 failed 2 times. Pipeline was 10.20.11.89:50010. Will retry...
2010-10-28 10:43:28,273 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_4986602050098481381_5140430 failed because recovery from primary datanode 10.20.11.89:50010 failed 3 times. Pipeline was 10.20.11.89:50010. Will retry...
2010-10-28 10:43:29,276 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_4986602050098481381_5140430 failed because recovery from primary datanode 10.20.11.89:50010 failed 4 times. Pipeline was 10.20.11.89:50010. Will retry...
2010-10-28 10:43:30,280 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_4986602050098481381_5140430 failed because recovery from primary datanode 10.20.11.89:50010 failed 5 times. Pipeline was 10.20.11.89:50010. Will retry...
2010-10-28 10:43:31,284 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_4986602050098481381_5140430 failed because recovery from primary datanode 10.20.11.89:50010 failed 6 times. Pipeline was 10.20.11.89:50010. Aborting...
2010-10-28 10:43:31,285 INFO org.apache.hadoop.streaming.PipeMapRed: mapRedFinished
2010-10-28 10:43:31,288 WARN org.apache.hadoop.mapred.TaskTracker: Error running child java.io.IOException: Error Recovery for block blk_4986602050098481381_5140430 failed because recovery from primary datanode 10.20.11.89:50010 failed 6 times. Pipeline was 10.20.11.89:50010. Aborting...
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2601)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$1600(DFSClient.java:2087)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2251)
2010-10-28 10:43:31,291 INFO org.apache.hadoop.mapred.TaskRunner: Runnning cleanup for the task

Namenode log for block:

2010-10-28 10:34:11,880 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.allocateBlock: /outputdir/_temporary/_attempt_201010251211_4053_r_000057_1/hourly=2010102703/part-00057. blk_4986602050098481381_5140430

(Yes, that's all)

Datanode log on 10.20.11.86:

2010-10-28 10:34:11,858 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block blk_4986602050098481381_5140430 src: /10.20.11.88:60247 dest: /10.20.11.86:50010
java.io.IOException: No temporary file /opera/log3/hadoop/dfs/data/tmp/blk_4986602050098481381 for block blk_4986602050098481381_5140430
at org.apache.hadoop.hdfs.server.datanode.FSDataset.finalizeBlock(FSDataset.java:1240)
at org.apache.hadoop.hdfs.server.datanode.BlockReceiver$PacketResponder.lastDataNodeRun(BlockReceiver.java:809)
at org.apache.hadoop.hdfs.server.datanode.BlockReceiver$PacketResponder.run(BlockReceiver.java:859)
at java.lang.Thread.run(Thread.java:619)
2010-10-28 10:43:14,013 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: checkDiskError: exception: java.io.IOException: No temporary file /opera/log3/hadoop/dfs/data/tmp/blk_4986602050098481381 for block blk_4986602050098481381_5140430
at org.apache.hadoop.hdfs.server.datanode.FSDataset.finalizeBlock(FSDataset.java:1240)
at org.apache.hadoop.hdfs.server.datanode.BlockReceiver$PacketResponder.lastDataNodeRun(BlockReceiver.java:809)
at org.apache.hadoop.hdfs.server.datanode.BlockReceiver$PacketResponder.run(BlockReceiver.java:859)
at java.lang.Thread.run(Thread.java:619)
2010-10-28 10:43:14,140 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder blk_4986602050098481381_5140430 0 Exception java.io.IOException: No temporary file /opera/log3/hadoop/dfs/data/tmp/blk_4986602050098481381 for block blk_4986602050098481381_5140430
at org.apache.hadoop.hdfs.server.datanode.FSDataset.finalizeBlock(FSDataset.java:1240)
at org.apache.hadoop.hdfs.server.datanode.BlockReceiver$PacketResponder.lastDataNodeRun(BlockReceiver.java:809)
at org.apache.hadoop.hdfs.server.datanode.BlockReceiver$PacketResponder.run(BlockReceiver.java:859)
at java.lang.Thread.run(Thread.java:619)

2010-10-28 10:43:14,140 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 0 for block blk_4986602050098481381_5140430 terminating
2010-10-28 10:43:14,013 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: IOException in BlockReceiver.lastNodeRun: java.io.IOException: No temporary file /opera/log3/hadoop/dfs/data/tmp/blk_4986602050098481381 for block blk_4986602050098481381_5140430
at org.apache.hadoop.hdfs.server.datanode.FSDataset.finalizeBlock(FSDataset.java:1240)
at org.apache.hadoop.hdfs.server.datanode.BlockReceiver$PacketResponder.lastDataNodeRun(BlockReceiver.java:809)
at org.apache.hadoop.hdfs.server.datanode.BlockReceiver$PacketResponder.run(BlockReceiver.java:859)
at java.lang.Thread.run(Thread.java:619)
2010-10-28 10:43:14,013 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: checkDiskError: exception: java.io.IOException: No temporary file /opera/log3/hadoop/dfs/data/tmp/blk_4986602050098481381 for block blk_4986602050098481381_5140430
at org.apache.hadoop.hdfs.server.datanode.FSDataset.finalizeBlock(FSDataset.java:1240)
at org.apache.hadoop.hdfs.server.datanode.BlockReceiver$PacketResponder.lastDataNodeRun(BlockReceiver.java:809)
at org.apache.hadoop.hdfs.server.datanode.BlockReceiver$PacketResponder.run(BlockReceiver.java:859)
at java.lang.Thread.run(Thread.java:619)
2010-10-28 10:43:14,140 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder blk_4986602050098481381_5140430 0 Exception java.io.IOException: No temporary file /opera/log3/hadoop/dfs/data/tmp/blk_4986602050098481381 for block blk_4986602050098481381_5140430
at org.apache.hadoop.hdfs.server.datanode.FSDataset.finalizeBlock(FSDataset.java:1240)
at org.apache.hadoop.hdfs.server.datanode.BlockReceiver$PacketResponder.lastDataNodeRun(BlockReceiver.java:809)
at org.apache.hadoop.hdfs.server.datanode.BlockReceiver$PacketResponder.run(BlockReceiver.java:859)
at java.lang.Thread.run(Thread.java:619)

2010-10-28 10:43:14,140 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 0 for block blk_4986602050098481381_5140430 terminating
2010-10-28 10:43:20,211 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Client calls recoverBlock(block=blk_4986602050098481381_5140430, targets=[10.20.11.89:50010, 10.20.11.86:50010])
2010-10-28 10:43:20,215 WARN org.apache.hadoop.hdfs.server.protocol.InterDatanodeProtocol: Failed to getBlockMetaDataInfo for block (=blk_4986602050098481381_5140430) from datanode (=10.20.11.89:50010)
org.apache.hadoop.ipc.RemoteException: java.io.IOException: Meta file not found, blockFile=/opera/log4/hadoop/dfs/data/tmp/blk_4986602050098481381
at org.apache.hadoop.hdfs.server.datanode.FSDataset.findMetaFile(FSDataset.java:674)
at org.apache.hadoop.hdfs.server.datanode.FSDataset.getStoredBlock(FSDataset.java:723)
at org.apache.hadoop.hdfs.server.datanode.DataNode.getBlockMetaDataInfo(DataNode.java:1431)
at sun.reflect.GeneratedMethodAccessor41.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:512)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:966)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:962)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:960)

at org.apache.hadoop.ipc.Client.call(Client.java:740)
at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:221)
at $Proxy5.getBlockMetaDataInfo(Unknown Source)
at org.apache.hadoop.hdfs.server.datanode.DataNode.recoverBlock(DataNode.java:1542)
at org.apache.hadoop.hdfs.server.datanode.DataNode.recoverBlock(DataNode.java:1637)
at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:512)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:966)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:962)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:960)
2010-10-28 10:43:20,215 WARN org.apache.hadoop.hdfs.server.protocol.InterDatanodeProtocol: Failed to getBlockMetaDataInfo for block (=blk_4986602050098481381_5140430) from datanode (=10.20.11.86:50010)
2010-10-28 10:43:20,216 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 50020, call recoverBlock(blk_4986602050098481381_5140430, false, [Lorg.apache.hadoop.hdfs.protocol.DatanodeInfo;@3956cce8) from 10.20.11.89:43120: error: java.io.IOException: All datanodes failed: block=blk_4986602050098481381_5140430, datanodeids=[10.20.11.89:50010, 10.20.11.86:50010]
java.io.IOException: All datanodes failed: block=blk_4986602050098481381_5140430, datanodeids=[10.20.11.89:50010, 10.20.11.86:50010]
at org.apache.hadoop.hdfs.server.datanode.DataNode.recoverBlock(DataNode.java:1565)
at org.apache.hadoop.hdfs.server.datanode.DataNode.recoverBlock(DataNode.java:1637)
at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:512)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:966)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:962)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:960)
2010-10-28 10:43:21,218 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Client calls recoverBlock(block=blk_4986602050098481381_5140430, targets=[10.20.11.89:50010, 10.20.11.86:50010])
2010-10-28 10:43:21,222 WARN org.apache.hadoop.hdfs.server.protocol.InterDatanodeProtocol: Failed to getBlockMetaDataInfo for block (=blk_4986602050098481381_5140430) from datanode (=10.20.11.89:50010)
org.apache.hadoop.ipc.RemoteException: java.io.IOException: Meta file not found, blockFile=/opera/log4/hadoop/dfs/data/tmp/blk_4986602050098481381
at org.apache.hadoop.hdfs.server.datanode.FSDataset.findMetaFile(FSDataset.java:674)
at org.apache.hadoop.hdfs.server.datanode.FSDataset.getStoredBlock(FSDataset.java:723)
at org.apache.hadoop.hdfs.server.datanode.DataNode.getBlockMetaDataInfo(DataNode.java:1431)
at sun.reflect.GeneratedMethodAccessor41.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:512)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:966)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:962)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:960)

at org.apache.hadoop.ipc.Client.call(Client.java:740)
at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:221)
at $Proxy5.getBlockMetaDataInfo(Unknown Source)
at org.apache.hadoop.hdfs.server.datanode.DataNode.recoverBlock(DataNode.java:1542)
at org.apache.hadoop.hdfs.server.datanode.DataNode.recoverBlock(DataNode.java:1637)
at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:512)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:966)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:962)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:960)
2010-10-28 10:43:21,222 WARN org.apache.hadoop.hdfs.server.protocol.InterDatanodeProtocol: Failed to getBlockMetaDataInfo for block (=blk_4986602050098481381_5140430) from datanode (=10.20.11.86:50010) java.io.IOException: Meta file not found, blockFile=/opera/log3/hadoop/dfs/data/tmp/blk_4986602050098481381
at org.apache.hadoop.hdfs.server.datanode.FSDataset.findMetaFile(FSDataset.java:674)
at org.apache.hadoop.hdfs.server.datanode.FSDataset.getStoredBlock(FSDataset.java:723)
at org.apache.hadoop.hdfs.server.datanode.DataNode.getBlockMetaDataInfo(DataNode.java:1431)
at org.apache.hadoop.hdfs.server.datanode.DataNode.recoverBlock(DataNode.java:1542)
at org.apache.hadoop.hdfs.server.datanode.DataNode.recoverBlock(DataNode.java:1637)
at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:512)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:966)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:962)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:960)
2010-10-28 10:43:21,222 INFO org.apache.hadoop.ipc.Server: IPC Server handler 5 on 50020, call recoverBlock(blk_4986602050098481381_5140430, false, [Lorg.apache.hadoop.hdfs.protocol.DatanodeInfo;@726d0fae) from 10.20.11.89:43120: error: java.io.IOException: All datanodes failed: block=blk_4986602050098481381_5140430, datanodeids=[10.20.11.89:50010, 10.20.11.86:50010]
java.io.IOException: All datanodes failed: block=blk_4986602050098481381_5140430, datanodeids=[10.20.11.89:50010, 10.20.11.86:50010]
at org.apache.hadoop.hdfs.server.datanode.DataNode.recoverBlock(DataNode.java:1565)
at org.apache.hadoop.hdfs.server.datanode.DataNode.recoverBlock(DataNode.java:1637)
at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:512)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:966)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:962)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:960)

(There are more of these messages in that log)

Datanode log on 10.20.11.89:

2010-10-28 10:34:11,851 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block blk_4986602050098481381_5140430 src: /10.20.11.89:58621 dest: /10.20.11.89:50010

2010-10-28 10:43:14,138 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 2 for block blk_4986602050098481381_5140430 terminating
2010-10-28 10:43:14,146 INFO org.apache.hadoop.ipc.Server: IPC Server handler 4 on 50020, call getBlockMetaDataInfo(blk_4986602050098481381_5140430) from 10.20.11.88:37156: error: java.io.IOException: Meta file not found, blockFile=/opera/log4/hadoop/dfs/data/tmp/blk_4986602050098481381
java.io.IOException: Meta file not found, blockFile=/opera/log4/hadoop/dfs/data/tmp/blk_4986602050098481381
at org.apache.hadoop.hdfs.server.datanode.FSDataset.findMetaFile(FSDataset.java:674)
at org.apache.hadoop.hdfs.server.datanode.FSDataset.getStoredBlock(FSDataset.java:723)
at org.apache.hadoop.hdfs.server.datanode.DataNode.getBlockMetaDataInfo(DataNode.java:1431)
at sun.reflect.GeneratedMethodAccessor41.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:512)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:966)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:962)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:960)
2010-10-28 10:43:15,158 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 50020, call getBlockMetaDataInfo(blk_4986602050098481381_5140430) from 10.20.11.88:37156: error: java.io.IOException: Meta file not found, blockFile=/opera/log4/hadoop/dfs/data/tmp/blk_4986602050098481381
java.io.IOException: Meta file not found, blockFile=/opera/log4/hadoop/dfs/data/tmp/blk_4986602050098481381
at org.apache.hadoop.hdfs.server.datanode.FSDataset.findMetaFile(FSDataset.java:674)
at org.apache.hadoop.hdfs.server.datanode.FSDataset.getStoredBlock(FSDataset.java:723)
at org.apache.hadoop.hdfs.server.datanode.DataNode.getBlockMetaDataInfo(DataNode.java:1431)
at sun.reflect.GeneratedMethodAccessor41.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:512)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:966)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:962)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:960)

(And then a bunch of similar ones)

Datanode log on 10.20.11.88:

2010-10-28 10:34:11,858 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block
blk_4986602050098481381_5140430 src: /10.20.11.89:43997
dest: /10.20.11.88:50010 2010-10-28 10:43:14,143 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder
blk_4986602050098481381_5140430 1 Exception java.io.EOFException at
java.io.DataInputStream.readFully(DataInputStream.java:180) at
java.io.DataInputStream.readLong(DataInputStream.java:399) at
org.apache.hadoop.hdfs.protocol.DataTransferProtocol$PipelineAck.readFields(DataTransferProtocol.java:119)
at
org.apache.hadoop.hdfs.server.datanode.BlockReceiver$PacketResponder.run(BlockReceiver.java:882)
at java.lang.Thread.run(Thread.java:619)

2010-10-28 10:43:14,143 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 1
for block blk_4986602050098481381_5140430 terminating 2010-10-28
10:43:14,147 INFO org.apache.hadoop.hdfs.server.datanode.DataNode:
Client calls recoverBlock(block=blk_4986602050098481381_5140430,
targets=[10.20.11.89:50010, 10.20.11.88:50010]) 2010-10-28
10:43:14,152 WARN
org.apache.hadoop.hdfs.server.protocol.InterDatanodeProtocol: Failed
to getBlockMetaDataInfo for block (=blk_4986602050098481381_5140430)
from datanode (=10.20.11.89:50010)
org.apache.hadoop.ipc.RemoteException: java.io.IOException: Meta file
not found,
blockFile=/opera/log4/hadoop/dfs/data/tmp/blk_4986602050098481381 at
org.apache.hadoop.hdfs.server.datanode.FSDataset.findMetaFile(FSDataset.java:674)
at
org.apache.hadoop.hdfs.server.datanode.FSDataset.getStoredBlock(FSDataset.java:723)
at
org.apache.hadoop.hdfs.server.datanode.DataNode.getBlockMetaDataInfo(DataNode.java:1431)
at sun.reflect.GeneratedMethodAccessor41.invoke(Unknown Source) at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597) at
org.apache.hadoop.ipc.RPC$Server.call(RPC.java:512) at
org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:966) at
org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:962) at
java.security.AccessController.doPrivileged(Native Method) at
javax.security.auth.Subject.doAs(Subject.java:396) at
org.apache.hadoop.ipc.Server$Handler.run(Server.java:960)

at org.apache.hadoop.ipc.Client.call(Client.java:740)
at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:221)
at $Proxy5.getBlockMetaDataInfo(Unknown Source)
at
org.apache.hadoop.hdfs.server.datanode.DataNode.recoverBlock(DataNode.java:1542)
at
org.apache.hadoop.hdfs.server.datanode.DataNode.recoverBlock(DataNode.java:1637)
at sun.reflect.GeneratedMethodAccessor53.invoke(Unknown Source) at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597) at
org.apache.hadoop.ipc.RPC$Server.call(RPC.java:512) at
org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:966) at
org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:962) at
java.security.AccessController.doPrivileged(Native Method) at
javax.security.auth.Subject.doAs(Subject.java:396) at
org.apache.hadoop.ipc.Server$Handler.run(Server.java:960) 2010-10-28
10:43:14,153 WARN
org.apache.hadoop.hdfs.server.protocol.InterDatanodeProtocol: Failed
to getBlockMetaDataInfo for block (=blk_4986602050098481381_5140430)
from datanode (=10.20.11.88:50010) java.io.IOException: Meta file not
found,
blockFile=/opera/log3/hadoop/dfs/data/tmp/blk_4986602050098481381 at
org.apache.hadoop.hdfs.server.datanode.FSDataset.findMetaFile(FSDataset.java:674)
at
org.apache.hadoop.hdfs.server.datanode.FSDataset.getStoredBlock(FSDataset.java:723)
at
org.apache.hadoop.hdfs.server.datanode.DataNode.getBlockMetaDataInfo(DataNode.java:1431)
at
org.apache.hadoop.hdfs.server.datanode.DataNode.recoverBlock(DataNode.java:1542)
at
org.apache.hadoop.hdfs.server.datanode.DataNode.recoverBlock(DataNode.java:1637)
at sun.reflect.GeneratedMethodAccessor53.invoke(Unknown Source) at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597) at
org.apache.hadoop.ipc.RPC$Server.call(RPC.java:512) at
org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:966) at
org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:962) at
java.security.AccessController.doPrivileged(Native Method) at
javax.security.auth.Subject.doAs(Subject.java:396) at
org.apache.hadoop.ipc.Server$Handler.run(Server.java:960) 2010-10-28
10:43:14,153 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0
on 50020, call recoverBlock(blk_4986602050098481381_5140430, false,
[Lorg.apache.hadoop.hdfs.protocol.DatanodeInfo;@58245b57) from
10.20.11.89:34684: error: java.io.IOException: All datanodes failed:
block=blk_4986602050098481381_5140430,
datanodeids=[10.20.11.89:50010, 10.20.11.88:50010] 2010-10-28
10:43:14,143 INFO org.apache.hadoop.hdfs.server.datanode.DataNode:
PacketResponder blk_4986602050098481381_5140430 1 Exception
java.io.EOFException at
java.io.DataInputStream.readFully(DataInputStream.java:180) at
java.io.DataInputStream.readLong(DataInputStream.java:399) at
org.apache.hadoop.hdfs.protocol.DataTransferProtocol$PipelineAck.readFields(DataTransferProtocol.java:119)
at
org.apache.hadoop.hdfs.server.datanode.BlockReceiver$PacketResponder.run(BlockReceiver.java:882)
at java.lang.Thread.run(Thread.java:619)

2010-10-28 10:43:14,143 INFO org.apache.hadoop.hdfs.server.datanode.DataNode:
PacketResponder 1 for block blk_4986602050098481381_5140430 terminating

2010-10-28 10:43:14,147 INFO org.apache.hadoop.hdfs.server.datanode.DataNode:
Client calls recoverBlock(block=blk_4986602050098481381_5140430,
targets=[10.20.11.89:50010, 10.20.11.88:50010])

2010-10-28 10:43:14,152 WARN org.apache.hadoop.hdfs.server.protocol.InterDatanodeProtocol:
Failed to getBlockMetaDataInfo for block (=blk_4986602050098481381_5140430)
from datanode (=10.20.11.89:50010)
org.apache.hadoop.ipc.RemoteException: java.io.IOException: Meta file not found,
blockFile=/opera/log4/hadoop/dfs/data/tmp/blk_4986602050098481381
	at org.apache.hadoop.hdfs.server.datanode.FSDataset.findMetaFile(FSDataset.java:674)
	at org.apache.hadoop.hdfs.server.datanode.FSDataset.getStoredBlock(FSDataset.java:723)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.getBlockMetaDataInfo(DataNode.java:1431)
	at sun.reflect.GeneratedMethodAccessor41.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:512)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:966)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:962)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:396)
	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:960)

	at org.apache.hadoop.ipc.Client.call(Client.java:740)
	at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:221)
	at $Proxy5.getBlockMetaDataInfo(Unknown Source)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.recoverBlock(DataNode.java:1542)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.recoverBlock(DataNode.java:1637)
	at sun.reflect.GeneratedMethodAccessor53.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:512)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:966)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:962)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:396)
	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:960)

2010-10-28 10:43:14,153 WARN org.apache.hadoop.hdfs.server.protocol.InterDatanodeProtocol:
Failed to getBlockMetaDataInfo for block (=blk_4986602050098481381_5140430)
from datanode (=10.20.11.88:50010)
java.io.IOException: Meta file not found,
blockFile=/opera/log3/hadoop/dfs/data/tmp/blk_4986602050098481381
	at org.apache.hadoop.hdfs.server.datanode.FSDataset.findMetaFile(FSDataset.java:674)
	at org.apache.hadoop.hdfs.server.datanode.FSDataset.getStoredBlock(FSDataset.java:723)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.getBlockMetaDataInfo(DataNode.java:1431)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.recoverBlock(DataNode.java:1542)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.recoverBlock(DataNode.java:1637)
	at sun.reflect.GeneratedMethodAccessor53.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:512)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:966)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:962)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:396)
	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:960)

2010-10-28 10:43:14,153 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0
on 50020, call recoverBlock(blk_4986602050098481381_5140430, false,
[Lorg.apache.hadoop.hdfs.protocol.DatanodeInfo;@58245b57) from 10.20.11.89:34684:
error: java.io.IOException: All datanodes failed:
block=blk_4986602050098481381_5140430,
datanodeids=[10.20.11.89:50010, 10.20.11.88:50010]
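
To follow a failing block like this across machines, one approach (a sketch of my own, not something from the cluster tooling) is to pull the block ID out of a log line and grep for it in the namenode and each datanode's logs. The log line below is copied from the output above; the commented-out log path is a placeholder that will differ per cluster.

```shell
# Extract the block ID and generation stamp from a log line like the ones above.
line='2010-10-28 10:43:14,153 WARN InterDatanodeProtocol: Failed to getBlockMetaDataInfo for block (=blk_4986602050098481381_5140430) from datanode (=10.20.11.88:50010)'

block=$(printf '%s\n' "$line" | grep -oE 'blk_-?[0-9]+_[0-9]+' | head -n 1)
echo "$block"   # blk_4986602050098481381_5140430

# Generation stamps change during pipeline recovery, so when grepping
# datanode logs it is often better to match on the numeric ID alone:
id=${block%_*}
echo "$id"      # blk_4986602050098481381
# grep "$id" /var/log/hadoop/hadoop-*-datanode-*.log   # placeholder path
```

Grepping on the bare ID catches both the original write and any later recovery attempts that bumped the generation stamp.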

--
Erik Forsberg <forsberg@opera.com>
Developer, Opera Software - http://www.opera.com/

Discussion Overview
group: common-user @
categories: hadoop
posted: Oct 28, '10 at 12:22p
active: Nov 15, '10 at 10:37a
posts: 2
users: 1
website: hadoop.apache.org...
irc: #hadoop

1 user in discussion
Erik Forsberg: 2 posts
