Hi,

My repo is one week old, and the only change I made was to modify the
Configuration object in BackupNode.initialize() so that the name and edits
dirs point to other directories, which lets me run both the namenode and the
backup node on the same machine. When I copied a file to HDFS, the exception
below was thrown. Has anyone seen this?


11/06/15 17:52:22 INFO ipc.Server: IPC Server handler 1 on 50100, call journal(NamenodeRegistration(localhost:8020, role=NameNode), 101, 164, [B@3951f910), rpc version=1, client version=5, methodsFingerPrint=302283637 from 192.168.1.102:56780: error: java.io.IOException: Error replaying edit log at offset 13
Recent opcode offsets: 1
java.io.IOException: Error replaying edit log at offset 13
Recent opcode offsets: 1
    at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:514)
    at org.apache.hadoop.hdfs.server.namenode.BackupImage.journal(BackupImage.java:242)
    at org.apache.hadoop.hdfs.server.namenode.BackupNode.journal(BackupNode.java:251)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:422)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1496)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1492)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1131)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1490)
Caused by: org.apache.hadoop.fs.ChecksumException: Transaction 1 is corrupt. Calculated checksum is -2116249809 but read checksum 0
    at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.validateChecksum(FSEditLogLoader.java:546)
    at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:490)
    ... 13 more


Thanks and Regards,
André Oriani


  • Ivan Kelly at Jun 16, 2011 at 10:32 am
    This seems to have been introduced here:
    https://github.com/apache/hadoop-hdfs/commit/27b956fa62ce9b467ab7dd287dd6dcd5ab6a0cb3#src/java/org/apache/hadoop/hdfs/server/namenode/BackupImage.java
    The backup streams never write the version, so the reader should never
    try to read it either. I would have expected this to fail earlier, as
    it's reading junk: the stream pointer is an int past where it should be.
    BackupStreams don't write the checksum either. This really should have
    failed the BackupNode unit test, but I think there are other problems
    with that test. cf.
    https://issues.apache.org/jira/browse/HDFS-1521?focusedCommentId=13010242&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13010242

    Could you try again with code from April 10th?

    Another candidate for causing it could be HDFS-2003, which went in on
    the 8th of this month.
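
The misalignment described above can be shown with a small standalone sketch. It is not the Hadoop code: the class name, the record layout (one opcode byte followed by an 8-byte transaction id), and the opcode value are all assumptions made for illustration. The writer emits records with no version header, and the reader can optionally consume a 4-byte version first, which leaves the stream pointer an int past where it should be.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical illustration; not the Hadoop BackupImage or FSEditLogLoader code.
final class VersionHeaderMismatch {
    static final byte OP_EXAMPLE = 9; // made-up opcode value

    // Writer side: records only ([opcode byte][txid long]), no version header.
    static byte[] writeRecords(long... txIds) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buf);
            for (long txId : txIds) {
                out.writeByte(OP_EXAMPLE);
                out.writeLong(txId);
            }
            out.flush();
            return buf.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reader side: optionally consumes a 4-byte version, then reads one record.
    static long readFirstTxId(byte[] data, boolean expectVersionHeader) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            if (expectVersionHeader) {
                in.readInt(); // swallows the first 4 bytes of the first record
            }
            in.readByte();        // opcode (junk when misaligned)
            return in.readLong(); // txid (junk when misaligned)
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = writeRecords(1L, 2L);
        System.out.println("aligned reader txid:    " + readFirstTxId(data, false));
        System.out.println("misaligned reader txid: " + readFirstTxId(data, true));
    }
}
```

The reader that expects a version header does not fail right away in this sketch. It returns a transaction id assembled from bytes that straddle two records, which is the "reading junk" behaviour mentioned in the post.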




  • André Oriani at Jun 18, 2011 at 3:11 am
    Hi Ivan,

    Sorry for taking so long to answer your email. I ran the test as you asked
    and found that the commit below is the one that caused the breakage. I wish
    I could provide a fix, but I do not have time today.


    commit 27b956fa62ce9b467ab7dd287dd6dcd5ab6a0cb3
    Author: Hairong Kuang <hairong@apache.org>
    Date: Mon Apr 11 17:15:27 2011 +0000

    HDFS-1630. Support fsedits checksum. Contrbuted by Hairong Kuang.


    git-svn-id: https://svn.apache.org/repos/asf/hadoop/hdfs/trunk@1091131 13f79535-47bb-0310-9956-ffa450edef68


    Regards,
    André Oriani

  • Ivan Kelly at Jun 20, 2011 at 8:40 am
    Hi Andre,

    Could you open a JIRA ticket for this? I think the fix should be quite
    straightforward. We just need to add checksum calculation when writing
    to the backup stream, and either remove the reading of the version from
    the input side or add writing of the version on the output side.

    -Ivan
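
The checksum half of the proposed fix can be sketched as follows. This is a minimal standalone example, not the HDFS-2090 patch: the class and method names, the record layout, and the use of `java.util.zip.CRC32` are assumptions for illustration. The writer appends a checksum computed over the record body, and the reader recomputes it and rejects records whose checksum is missing or different.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.util.zip.CRC32;

// Hypothetical illustration; the record layout and names are assumptions.
final class ChecksummedEditRecord {
    static final int BODY_LENGTH = 1 + Long.BYTES; // opcode byte + txid long

    // CRC32 over the record body (the first BODY_LENGTH bytes).
    static int checksumOf(byte[] record) {
        CRC32 crc = new CRC32();
        crc.update(record, 0, BODY_LENGTH);
        return (int) crc.getValue();
    }

    // Writer side: the body, followed by the checksum when appendChecksum is true.
    static byte[] write(byte opcode, long txId, boolean appendChecksum) {
        int length = BODY_LENGTH + (appendChecksum ? Integer.BYTES : 0);
        ByteBuffer buf = ByteBuffer.allocate(length);
        buf.put(opcode).putLong(txId);
        if (appendChecksum) {
            buf.putInt(checksumOf(buf.array()));
        }
        return buf.array();
    }

    // Reader side: validates the trailing checksum before trusting the txid.
    static long readTxId(byte[] record) {
        if (record.length < BODY_LENGTH + Integer.BYTES) {
            throw new UncheckedIOException(new IOException("Record has no checksum"));
        }
        ByteBuffer buf = ByteBuffer.wrap(record);
        buf.get(); // opcode, not needed here
        long txId = buf.getLong();
        int stored = buf.getInt();
        int calculated = checksumOf(record);
        if (calculated != stored) {
            throw new UncheckedIOException(new IOException(
                "Calculated checksum is " + calculated + " but read checksum " + stored));
        }
        return txId;
    }

    public static void main(String[] args) {
        byte[] record = write((byte) 9, 42L, true);
        System.out.println("validated txid: " + readTxId(record));
    }
}
```

Whichever side changes for the version field, the writer and the reader have to agree on the layout. The same applies to the checksum: a reader that validates it only works against a writer that emits it.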
  • André Oriani at Jun 21, 2011 at 2:15 am
    Hi Ivan.

    I filed https://issues.apache.org/jira/browse/HDFS-2090

    Regards,
    André

Discussion Overview
group: hdfs-dev @
categories: hadoop
posted: Jun 15, '11 at 10:43p
active: Jun 21, '11 at 2:15a
posts: 5
users: 2
website: hadoop.apache.org...
irc: #hadoop

