name node server does not load large (> 2^31 bytes) edits file
--------------------------------------------------------------

Key: HADOOP-646
URL: http://issues.apache.org/jira/browse/HADOOP-646
Project: Hadoop
Issue Type: Bug
Components: dfs
Affects Versions: 0.7.1
Reporter: Christian Kunz
Priority: Critical


FileInputStream.available() returns negative values when reading a large file (> 2^31 bytes) -- this is a known (unresolved) Java bug:
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6402006

Consequence: a large edits file is not loaded and is deleted without any warning. The system reverts to the old fsimage.

This happens with JDK 1.6 as well, i.e. the bug has not yet been fixed.

In addition, when I was finally able to load my big cron-backed-up edits file (6.5 GB) with a kludgy work-around, the blocks no longer existed on the data node servers; they were probably deleted during the previous attempts, when the name node server did not know about the changed situation.

Moral, until this is fixed or worked around: don't wait too long before restarting the name node server. Otherwise this is a way to lose the entire DFS.
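
For illustration only, a minimal sketch (not Hadoop source; the class name is made up) of why a load loop guarded by available() can silently skip a file larger than 2^31 bytes, assuming the name node reads the edits through such a loop:

    import java.io.DataInputStream;
    import java.io.FileInputStream;
    import java.io.IOException;

    public class AvailableOverflowDemo {
        public static void main(String[] args) throws IOException {
            // args[0]: path to a file larger than 2^31 bytes
            DataInputStream in = new DataInputStream(new FileInputStream(args[0]));
            try {
                // For such a file, available() (an int) can come back negative
                // (Sun bug 6402006), so the guarded loop below never executes.
                System.out.println("available() = " + in.available());
                long bytesRead = 0;
                while (in.available() > 0) {
                    in.readByte();          // stand-in for reading one edit record
                    bytesRead++;
                }
                System.out.println("bytes read: " + bytesRead);
            } finally {
                in.close();
            }
        }
    }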

  • dhruba borthakur (JIRA) at Oct 26, 2006 at 11:30 pm
    [ http://issues.apache.org/jira/browse/HADOOP-646?page=comments#action_12445043 ]

    dhruba borthakur commented on HADOOP-646:
    -----------------------------------------

    Maybe we can avoid using the available() method completely. Can we first find the size of the edits file and then issue the appropriate number of readByte() calls?
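
    A minimal sketch of that idea (illustrative only, not the change that was eventually committed; names are made up): drive the loop from File.length(), which is a long, instead of available():

        import java.io.DataInputStream;
        import java.io.File;
        import java.io.FileInputStream;
        import java.io.IOException;

        public class ReadByLength {
            public static void main(String[] args) throws IOException {
                File edits = new File(args[0]);
                long length = edits.length();            // a long, so no 2^31 limit
                DataInputStream in = new DataInputStream(new FileInputStream(edits));
                try {
                    long consumed = 0;
                    while (consumed < length) {
                        in.readByte();                   // stand-in for decoding one record
                        consumed++;                      // real code would add the record's size
                    }
                } finally {
                    in.close();
                }
            }
        }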
  • Milind Bhandarkar (JIRA) at Oct 29, 2006 at 10:27 pm
    [ http://issues.apache.org/jira/browse/HADOOP-646?page=comments#action_12445466 ]

    Milind Bhandarkar commented on HADOOP-646:
    ------------------------------------------

    Dhruba,

    Finding out the number of entries from the size of the edits file is not possible, since these are not fixed-size entries. However, trying to read the next byte and stopping when we get an EOFException allows us to avoid the call to available() completely. I am attaching a patch (untested for files > 2 GB) which does exactly that.
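
    For illustration, a minimal sketch of that pattern (this is not the attached edits.patch; names are made up):

        import java.io.DataInputStream;
        import java.io.EOFException;
        import java.io.FileInputStream;
        import java.io.IOException;

        public class ReadUntilEof {
            public static void main(String[] args) throws IOException {
                DataInputStream in = new DataInputStream(new FileInputStream(args[0]));
                try {
                    long records = 0;
                    while (true) {
                        try {
                            in.readByte();               // stand-in for reading one edit record
                        } catch (EOFException eof) {
                            break;                       // clean end of the edits file
                        }
                        records++;                       // real code would decode and apply the record
                    }
                    System.out.println("records read: " + records);
                } finally {
                    in.close();
                }
            }
        }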

    Christian,

    Can you try this patch out on the backed-up edits file (6.5 GB) that you have and let me know if it works? (It should be safe to try this patch if you back up your image first.)

    - milind
  • Milind Bhandarkar (JIRA) at Oct 29, 2006 at 10:45 pm
    [ http://issues.apache.org/jira/browse/HADOOP-646?page=all ]

    Milind Bhandarkar updated HADOOP-646:
    -------------------------------------

    Attachment: edits.patch

    Patch for Christian to test loading of his large edits file.
  • Christian Kunz (JIRA) at Nov 1, 2006 at 8:12 pm
    [ http://issues.apache.org/jira/browse/HADOOP-646?page=comments#action_12446366 ]

    Christian Kunz commented on HADOOP-646:
    ---------------------------------------

    Sorry to get back so late, but the previous large edits file was deleted, and I had to wait for another 'big' one to be generated. I tested the patch on a 3.2 GB edits file, and it seemed to load it okay (sampling files by copying them to local worked). FYI, the loading took more than 10 minutes, and during that time the name node server was not available.
  • Sameer Paranjpye (JIRA) at Nov 10, 2006 at 9:40 pm
    [ http://issues.apache.org/jira/browse/HADOOP-646?page=all ]

    Sameer Paranjpye updated HADOOP-646:
    ------------------------------------

    Status: Patch Available (was: Open)
    Fix Version/s: 0.9.0
    Affects Version/s: 0.8.0 (was: 0.7.1)
    Assignee: Milind Bhandarkar
  • Doug Cutting (JIRA) at Nov 10, 2006 at 9:56 pm
    [ http://issues.apache.org/jira/browse/HADOOP-646?page=all ]

    Doug Cutting updated HADOOP-646:
    --------------------------------

    Status: Resolved (was: Patch Available)
    Resolution: Fixed

    I just committed this. Thanks, Milind!
