FAQ
Page 312 of Tom White's "Hadoop: The Definitive Guide" mentions that the
Offline Image Viewer supplied with 0.21.0 can be used to test the integrity
of any backups taken from the Secondary Namenode's previous.checkpoint
directory.

How does this work in practice?

I've tested the tool on both a valid fsimage file and a dummy file, and the
only difference I can see is in the output file: a successful run produces a
file of non-zero size, while a failure produces a zero-size file (along with
the message "Input file ended unexpectedly. Exiting").
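For reference, the size-based check I'm describing can be sketched without a
cluster. The file names and contents below are stand-ins, not real oiv
output; they just simulate the two outcomes (non-empty file on success,
empty file on failure):

```shell
#!/bin/sh
# Simulate the two outcomes of the Offline Image Viewer:
# a successful run leaves a non-empty output file, a failed run
# leaves an empty one, so `test -s` distinguishes the two cases.
# (These files are stand-ins, not real oiv output.)

printf 'dummy image dump\n' > /tmp/fsimage_ok.txt   # stand-in for a successful dump
: > /tmp/fsimage_bad.txt                            # stand-in for a failed dump

check_image() {
    if [ -s "$1" ]; then echo "OK"; else echo "FAIL"; fi
}

check_image /tmp/fsimage_ok.txt    # prints OK
check_image /tmp/fsimage_bad.txt   # prints FAIL
```

With a real image you would point the check at the file produced by
`hdfs oiv -i fsimage -o fsimage.txt` instead of the stand-ins.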

The documentation (
http://hadoop.apache.org/hdfs/docs/current/hdfs_imageviewer.html) doesn't
explicitly say how to verify fsimage integrity, but I assume that the tool
completing without error is enough to prove that the image is valid. It does
say "If the tool is not able to process an image file, it will exit
cleanly" - which is not much use for automated backups, since a clean exit
gives no exit code to test for.

Since this is going to be a common use case for the Offline Image Viewer, I
suggest that the documentation be updated to state specifically how to check
for a valid image (e.g. during an automated backup process).

So, can anyone confirm how the Offline Image Viewer should be used to verify
the integrity of an fsimage file?

thanks


  • Shared at Apr 28, 2011 at 8:48 am
    The script below is what I am now using for checking image integrity.

    #!/bin/bash
    # check_previous_checkpoint_integrity.sh
    #
    # This simple script tests the latest image in the Secondary
    # Namenode's 'previous.checkpoint' directory, providing an early
    # warning as to whether an HDFS image has been corrupted.

    export HADOOP_HOME=/usr/local/hadoop-install/hadoop-0.21.0

    INPUT_FILE=/home/hadoop/hdfs/namesecondary/previous.checkpoint/fsimage
    OUTPUT_FILE=/tmp/fsimage.txt

    # If successful, this will create a non-empty fsimage.txt;
    # if fsimage is invalid, fsimage.txt will be left empty.
    "$HADOOP_HOME/bin/hdfs" oiv -i "$INPUT_FILE" -o "$OUTPUT_FILE"

    if [ -s "$OUTPUT_FILE" ]; then
        echo "OK (file modified at $(stat -c %y "$INPUT_FILE"))"
    else
        echo "FAIL"
    fi
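
    To run this as part of an automated backup, a cron entry along these
    lines would work. The schedule, script location, and log file here are
    assumptions, not taken from my setup:

```shell
# Crontab entry: run the integrity check daily at 02:30 and append
# the OK/FAIL result to a log.
# The script path, log path, and schedule are illustrative.
30 2 * * * /usr/local/bin/check_previous_checkpoint_integrity.sh >> /var/log/fsimage_check.log 2>&1
```

    Grepping the log for FAIL (or mailing the output via cron's MAILTO)
    then gives the early warning the script is meant to provide.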


    On 14 April 2011 11:12, shared mailinglists wrote: [original question
    quoted above]

Discussion Overview
group: common-user @
categories: hadoop
posted: Apr 14, '11 at 10:13a
active: Apr 28, '11 at 8:48a
posts: 2
users: 1
website: hadoop.apache.org...
irc: #hadoop
