Exceptions in DataXceiver#run can result in a zombie datanode
--------------------------------------------------------------

Key: HDFS-2182
URL: https://issues.apache.org/jira/browse/HDFS-2182
Project: Hadoop HDFS
Issue Type: Bug
Components: data-node
Reporter: Eli Collins
Fix For: 0.23.0


DataXceiver#run currently swallows all exceptions. It should instead propagate them up to DataXceiverServer#run, which can then decide whether the exception should be tolerated or the daemon should exit. An IOException should be tolerated, as it is today (it is likely just an issue with a particular thread, or an intermittent failure), but e.g. a java.lang.Error should not be.

This came up in the following bug I'm seeing on a test cluster: if e.g. a NoClassDefFoundError is thrown in DataXceiver#run (because the host jars were replaced out from underneath it, it ran out of descriptors, etc.), we end up with a datanode that is alive but always fails because it can't create any DataXceiver threads. In this case the datanode should shut itself down rather than continue to run.
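The policy described above could look roughly like the following minimal sketch. It is not the actual Hadoop code or the eventual patch: XceiverSketch, Server, Worker and reportFailure are hypothetical stand-ins for DataXceiverServer and DataXceiver. Treating only java.lang.Error as fatal is an assumption taken from the description; how other throwables (e.g. RuntimeException) should be handled is not specified here.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

// Hypothetical sketch only; these are NOT the real Hadoop classes.
final class XceiverSketch {

    /** Stand-in for DataXceiverServer: decides whether a failure is fatal. */
    static final class Server {
        private volatile boolean running = true;
        private volatile Throwable fatalCause;

        /** Called by a worker instead of swallowing the failure. */
        void reportFailure(Throwable t) {
            if (t instanceof Error) {
                // Assumption: any java.lang.Error (e.g. NoClassDefFoundError)
                // means the daemon cannot recover, so stop accepting work.
                fatalCause = t;
                running = false;
            }
            // Anything else (e.g. IOException) is tolerated: a real server
            // would log it and keep serving other connections.
        }

        boolean isRunning() { return running; }

        Throwable fatalCause() { return fatalCause; }
    }

    /** Stand-in for DataXceiver: runs one unit of work per thread. */
    static final class Worker implements Runnable {
        private final Server server;
        private final Callable<Void> work;

        Worker(Server server, Callable<Void> work) {
            this.server = server;
            this.work = work;
        }

        @Override
        public void run() {
            try {
                work.call();
            } catch (Throwable t) {
                // Hand every failure to the server rather than swallowing it.
                server.reportFailure(t);
            }
        }
    }

    public static void main(String[] args) {
        Server server = new Server();

        // An intermittent I/O failure on one connection is tolerated.
        new Worker(server, () -> { throw new IOException("connection reset"); }).run();
        System.out.println("running after IOException: " + server.isRunning());

        // A missing class is an Error, so the server decides to shut down.
        new Worker(server, () -> { throw new NoClassDefFoundError("SomeClass"); }).run();
        System.out.println("running after Error: " + server.isRunning());
    }
}
```

In this sketch the worker never makes the decision itself. It only reports, so the tolerate-or-exit policy lives in one place on the server side, as proposed above.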

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

Group: hdfs-dev
Posted: Jul 21, '11 at 8:15p