DataNode sends a Success ack when block write fails
---------------------------------------------------

Key: HDFS-637
URL: https://issues.apache.org/jira/browse/HDFS-637
Project: Hadoop HDFS
Issue Type: Bug
Components: data-node
Reporter: Hairong Kuang
Assignee: Hairong Kuang
Priority: Blocker
Fix For: 0.21.0


While working on HDFS-624, I saw TestFileAppend3#TC7 fail occasionally. After a lot of debugging, I saw that the client unexpectedly received a response of "-2 SUCCESS SUCCESS", in which -2 is the packet sequence number. This happened in a pipeline of 2 datanodes in which one of them failed. It turned out that when the block receiver fails, it shuts itself down and interrupts the packet responder. The responder tries to detect the interruption with the condition "Thread.isInterrupted()", but unfortunately a thread's interrupt status is not set in some cases, as explained in the Thread#interrupt javadoc:

If this thread is blocked in an invocation of the wait(), wait(long), or wait(long, int) methods of the Object class, or of the join(), join(long), join(long, int), sleep(long), or sleep(long, int), methods of this class, then its interrupt status will be cleared and it will receive an InterruptedException.

So the datanode does not detect the interruption and continues as if no error had occurred.
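
The javadoc behavior quoted above can be reproduced in isolation. The following is a minimal sketch, not code from the datanode or from the attached patches: the class name InterruptStatusDemo, the method observeFlags, and the thread named "responder" are hypothetical and exist only for illustration. It blocks a thread in Object.wait(), interrupts it, and records what isInterrupted() reports inside the catch block, first as-is and then after the flag is restored with Thread.currentThread().interrupt(). Restoring the flag is one common way to keep a later isInterrupted() check working; whether the committed fix does this is not stated in this thread.

```java
import java.util.concurrent.CountDownLatch;

// Hypothetical demo class; it does not correspond to any HDFS class.
final class InterruptStatusDemo {

    /**
     * Interrupts a thread that blocks in Object.wait() and returns two flags:
     * [0] isInterrupted() as seen right after InterruptedException is caught,
     * [1] isInterrupted() after the thread re-interrupts itself.
     */
    static boolean[] observeFlags() {
        final Object lock = new Object();
        final CountDownLatch started = new CountDownLatch(1);
        final boolean[] flags = new boolean[2];

        Thread responder = new Thread(() -> {
            synchronized (lock) {
                try {
                    started.countDown();
                    // Loop so that a spurious wakeup cannot end the wait;
                    // the only way out is the InterruptedException.
                    while (true) {
                        lock.wait();
                    }
                } catch (InterruptedException e) {
                    // wait() cleared the interrupt status before throwing.
                    flags[0] = Thread.currentThread().isInterrupted();
                    // Restore the status so later checks can see it.
                    Thread.currentThread().interrupt();
                    flags[1] = Thread.currentThread().isInterrupted();
                }
            }
        }, "responder");

        responder.start();
        try {
            started.await();
            responder.interrupt();
            // join() makes the responder's writes to flags visible here.
            responder.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException("demo itself was interrupted", e);
        }
        return flags;
    }

    public static void main(String[] args) {
        boolean[] flags = observeFlags();
        System.out.println("flag inside catch:   " + flags[0]); // false
        System.out.println("flag after restore:  " + flags[1]); // true
    }
}
```

A responder that only tests isInterrupted() after such a wait() would see false and carry on, which matches the symptom described above.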

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


  • Hairong Kuang (JIRA) at Sep 23, 2009 at 11:34 pm
    [ https://issues.apache.org/jira/browse/HDFS-637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Hairong Kuang resolved HDFS-637.
    --------------------------------

    Resolution: Fixed

    I've just committed this.

    Attachments: interrupted.patch, interrupted1.patch

Discussion Overview
group: hdfs-dev
categories: hadoop
posted: Sep 21, '09 at 6:00a
active: Sep 23, '09 at 11:34p
posts: 2
users: 1
website: hadoop.apache.org...
irc: #hadoop