
On Wed, 24 Nov 2010 10:30:09 +0100 Erik Forsberg wrote:

> Hi!
>
> I'm having some trouble with Map/Reduce jobs failing due to HDFS
> errors. I've been digging around the logs trying to figure out what's
> happening, and I see the following in the datanode logs:
>
> 2010-11-19 10:27:01,059 WARN org.apache.hadoop.hdfs.server.datanode.DataNode:
> IOException in BlockReceiver.lastNodeRun: java.io.IOException: No temporary
> file /opera/log4/hadoop/dfs/data/tmp/blk_-8143694940938019938 for block
> blk_-8143694940938019938_6144372 at <snip>
>
> What would be the possible causes of such exceptions?
It seems the cause of this was that puppetd failed to detect that the
datanode was already running, and therefore tried to start a second
datanode. The second datanode then cleans the tmp directories before it
notices that the storage directories are locked and exits, which would
explain the missing temporary block files. Some kind of race condition,
I would guess, since it only happens on systems with high load.
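
In case anyone else runs into this, here is a minimal sketch of the
workaround I have in mind, assuming a service named 'hadoop-datanode'
and that the init script's status action is what's unreliable (both
are assumptions, adjust to your setup). Setting hasstatus => false
makes Puppet look for the pattern in the process table instead of
trusting the init script, so it shouldn't try to start a second
datanode:

    # Hypothetical manifest; service name and pattern are assumptions.
    service { 'hadoop-datanode':
      ensure    => running,
      # Don't trust the init script's status action ...
      hasstatus => false,
      # ... match the DataNode JVM in the process table instead.
      pattern   => 'org.apache.hadoop.hdfs.server.datanode.DataNode',
    }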

More details here:
https://groups.google.com/a/cloudera.org/group/cdh-user/browse_frm/thread/d4572d2d1191be91#

\EF
--
Erik Forsberg <forsberg@opera.com>
Developer, Opera Software - http://www.opera.com/
