NameNode unable to start due to stale edits log after a crash
-------------------------------------------------------------

Key: HDFS-1221
URL: https://issues.apache.org/jira/browse/HDFS-1221
Project: Hadoop HDFS
Issue Type: Bug
Affects Versions: 0.20.1
Reporter: Thanh Do


- Summary:
If a crash happens during FSEditLog.createEditLogFile(), the
edits log file on disk may be left stale. On the next reboot, the NameNode
throws an exception when parsing the stale edits file,
and the reboot fails.
Note: This is just one example. Because the edits log (and fsimage)
have no checksum, they are vulnerable to corruption as well.
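
To illustrate the note about checksums, here is a minimal sketch of a record
followed by a CRC32 trailer. The framing (payload, then a 4-byte CRC) and the
class and method names are hypothetical and are not the HDFS on-disk format;
the point is only that a zero-filled or altered record fails verification
instead of being parsed as a valid operation.

```java
import java.nio.ByteBuffer;
import java.util.zip.CRC32;

class ChecksummedRecordSketch {
    /** Appends a 4-byte CRC32 of the payload to the payload (hypothetical framing). */
    static byte[] frame(byte[] payload) {
        CRC32 crc = new CRC32();
        crc.update(payload, 0, payload.length);
        return ByteBuffer.allocate(payload.length + 4)
                .put(payload)
                .putInt((int) crc.getValue())
                .array();
    }

    /** Recomputes the CRC32 over the payload and compares it with the stored trailer. */
    static boolean verify(byte[] framed) {
        if (framed.length < 4) {
            return false; // too short to even hold a trailer
        }
        int payloadLength = framed.length - 4;
        CRC32 crc = new CRC32();
        crc.update(framed, 0, payloadLength);
        int stored = ByteBuffer.wrap(framed, payloadLength, 4).getInt();
        return stored == (int) crc.getValue();
    }
}
```

With this framing, a reader would call verify() on each record before
interpreting its contents and stop (or fall back to another copy) on a mismatch.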

- Details:
The steps to create a new edits log (as we infer from the HDFS code) are:
1) truncate the file to zero size
2) write FSConstants.LAYOUT_VERSION to the buffer
3) append the end-of-file marker OP_INVALID to the buffer
4) preallocate 1MB of data, filled with 0
5) flush the buffer to disk
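
The five steps above can be modeled with a small simulation. This is a sketch
under stated assumptions, not the FSEditLog code: the on-disk file is
represented as a byte array, the class and method names are hypothetical, and
OP_INVALID is assumed to be -1. It shows what would be left on disk if the
process stopped after each step.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

class EditLogCreateSketch {
    static final int LAYOUT_VERSION = -18;            // value quoted in this report
    static final byte OP_INVALID = -1;                // assumed end-of-file marker value
    static final int PREALLOCATE_BYTES = 1024 * 1024; // 1MB, as in step 4

    /**
     * Returns the simulated on-disk contents if the process stops right after
     * completing the given number of steps (5 means all steps completed).
     */
    static byte[] diskAfter(int stepsCompleted, byte[] oldDisk) {
        byte[] disk = oldDisk;
        ByteBuffer buffer = ByteBuffer.allocate(5);   // in-memory buffer, lost on a crash
        if (stepsCompleted >= 1) disk = new byte[0];                  // 1) truncate (disk)
        if (stepsCompleted >= 2) buffer.putInt(LAYOUT_VERSION);       // 2) version (memory)
        if (stepsCompleted >= 3) buffer.put(OP_INVALID);              // 3) marker (memory)
        if (stepsCompleted >= 4) disk = new byte[PREALLOCATE_BYTES];  // 4) zero fill (disk)
        if (stepsCompleted >= 5) {                                    // 5) flush (disk)
            System.arraycopy(buffer.array(), 0, disk, 0, buffer.position());
        }
        return disk;
    }

    public static void main(String[] args) {
        byte[] old = {1, 2, 3}; // stand-in for the previous edits file contents
        for (int steps = 0; steps <= 5; steps++) {
            byte[] disk = diskAfter(steps, old);
            byte[] head = Arrays.copyOf(disk, Math.min(5, disk.length));
            System.out.println("after step " + steps + ": length=" + disk.length
                    + ", first bytes=" + Arrays.toString(head));
        }
    }
}
```

In this model, stopping after step 4 leaves 1MB of zeros on disk, while the
version and the OP_INVALID marker exist only in the buffer.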

Note that only steps 1, 4, and 5 actually change the data on disk.
Now, suppose a crash happens after step 4 but before step 5.
On the next reboot, the NameNode reads this edits log file (which contains
all 0s). The first thing parsed is the LAYOUT_VERSION, which is 0. This is OK,
because the NameNode has code to handle that case
(although we would expect LAYOUT_VERSION to be -18).
Next it parses the operation code, which also happens to be 0. Unfortunately, since 0
is the value of OP_ADD, the NameNode expects the parameters of
that operation. It calls readString to read the path, which throws
an exception, and the reboot fails.
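
The difference between the two cases can be seen by decoding only the first
five bytes of the file. The sketch below is a simplified stand-in for the
NameNode's reader, not its actual code: the class and method names are
hypothetical, OP_INVALID is assumed to be -1, and the sketch stops at the
opcode dispatch rather than reproducing the readString failure.

```java
import java.nio.ByteBuffer;

class EditLogFirstOpSketch {
    static final byte OP_ADD = 0;       // per this report, opcode 0 is OP_ADD
    static final byte OP_INVALID = -1;  // assumed end-of-file marker value

    /** Describes what a reader sees in the first five bytes of an edits file. */
    static String describe(byte[] disk) {
        ByteBuffer in = ByteBuffer.wrap(disk);
        int version = in.getInt();      // 4-byte layout version
        byte op = in.get();             // 1-byte operation code
        String next;
        if (op == OP_INVALID) {
            next = "end of log, nothing to replay";
        } else if (op == OP_ADD) {
            next = "OP_ADD, reader goes on to parse a path";
        } else {
            next = "opcode " + op;
        }
        return "version=" + version + ", " + next;
    }

    public static void main(String[] args) {
        // File left behind by a crash between steps 4 and 5: all zeros.
        byte[] crashed = new byte[1024 * 1024];

        // File after a completed step 5: version and marker at the front.
        byte[] flushed = new byte[1024 * 1024];
        ByteBuffer.wrap(flushed).putInt(-18).put(OP_INVALID);

        System.out.println("crashed: " + describe(crashed));
        System.out.println("flushed: " + describe(flushed));
    }
}
```

The zero-filled file is indistinguishable from one whose first record is an
OP_ADD, which is why the reader proceeds to parse operands that are not there.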

We found this problem at almost the same time as the HDFS developers.
Basically, the edits log is truncated before fsimage.ckpt is renamed to fsimage.
Hence, any crash that happens after the truncation but before the renaming leads
to data loss. A detailed description can be found here:
https://issues.apache.org/jira/browse/HDFS-955
This bug was found by our Failure Testing Service framework:
http://www.eecs.berkeley.edu/Pubs/TechRpts/2010/EECS-2010-98.html
For questions, please email us: Thanh Do (thanhdo@cs.wisc.edu) and
Haryadi Gunawi (haryadi@eecs.berkeley.edu)

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


  • Konstantin Shvachko (JIRA) at Aug 5, 2010 at 10:06 pm
    [ https://issues.apache.org/jira/browse/HDFS-1221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Konstantin Shvachko resolved HDFS-1221.
    ---------------------------------------

    Resolution: Not A Problem
    NameNode unable to start due to stale edits log after a crash
    -------------------------------------------------------------

    Key: HDFS-1221
    URL: https://issues.apache.org/jira/browse/HDFS-1221
    Project: Hadoop HDFS
    Issue Type: Bug
    Components: name-node
    Affects Versions: 0.20.1
    Reporter: Thanh Do


Discussion Overview
group: hdfs-dev
categories: hadoop
posted: Jun 17, '10 at 12:58p
active: Aug 5, '10 at 10:06p
posts: 2
users: 1
website: hadoop.apache.org...
irc: #hadoop
