Namenode fails to re-start after cluster shutdown
Hi everyone,
I downloaded the nightly build (see below) yesterday, and after the
cluster had worked fine for about 10 hours, I got the following
error message from the DFS client even though all data nodes were up:
08/02/21 14:04:35 INFO fs.DFSClient: Could not obtain block
blk_-4008950704646490788 from any node: java.io.IOException: No live nodes
contain current block
So I decided to restart our cluster, but unfortunately the NameNode
fails to start with the following exception:
2008-02-21 14:20:48,831 INFO org.apache.hadoop.dfs.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = se09/141.76.44.209
STARTUP_MSG: args = []
STARTUP_MSG: version = 2008-02-19_11-01-48
STARTUP_MSG: build =
http://svn.apache.org/repos/asf/hadoop/core/trunk -r 628999; compiled
by 'hudson' on Tue Feb 19 11:09:05 UTC 2008
************************************************************/
2008-02-21 14:20:49,367 INFO org.apache.hadoop.metrics.jvm.JvmMetrics:
Initializing RPC Metrics with serverName=NameNode, port=8000
2008-02-21 14:20:49,374 INFO org.apache.hadoop.dfs.NameNode: Namenode
up at: se09.inf.tu-dresden.de/141.76.44.209:8000
2008-02-21 14:20:49,378 INFO org.apache.hadoop.metrics.jvm.JvmMetrics:
Initializing JVM Metrics with processName=NameNode, sessionId=null
2008-02-21 14:20:49,381 INFO org.apache.hadoop.dfs.NameNodeMetrics:
Initializing NameNodeMeterics using context
object:org.apache.hadoop.metrics.spi.NullContext
2008-02-21 14:20:49,501 INFO org.apache.hadoop.fs.FSNamesystem:
fsOwner=amartin,students
2008-02-21 14:20:49,501 INFO org.apache.hadoop.fs.FSNamesystem:
supergroup=supergroup
2008-02-21 14:20:49,501 INFO org.apache.hadoop.fs.FSNamesystem:
isPermissionEnabled=true
2008-02-21 14:20:49,788 INFO org.apache.hadoop.ipc.Server: Stopping
server on 8000
2008-02-21 14:20:49,790 ERROR org.apache.hadoop.dfs.NameNode:
java.io.IOException: Created 13 leases but found 4
at
org.apache.hadoop.dfs.FSImage.loadFilesUnderConstruction(FSImage.java:935)
at org.apache.hadoop.dfs.FSImage.loadFSImage(FSImage.java:749)
at org.apache.hadoop.dfs.FSImage.loadFSImage(FSImage.java:634)
at
org.apache.hadoop.dfs.FSImage.recoverTransitionRead(FSImage.java:223)
at org.apache.hadoop.dfs.FSDirectory.loadFSImage(FSDirectory.java:79)
at
org.apache.hadoop.dfs.FSNamesystem.initialize(FSNamesystem.java:261)
at org.apache.hadoop.dfs.FSNamesystem.<init>(FSNamesystem.java:242)
at org.apache.hadoop.dfs.NameNode.initialize(NameNode.java:131)
at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:176)
at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:162)
at org.apache.hadoop.dfs.NameNode.createNameNode(NameNode.java:851)
at org.apache.hadoop.dfs.NameNode.main(NameNode.java:860)

2008-02-21 14:20:49,791 INFO org.apache.hadoop.dfs.NameNode:
SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at se09/141.76.44.209
************************************************************/
Any ideas?

Cu on the 'net,
Bye - bye,

<<<<< André <<<< >>>> èrbnA >>>>>

  • Raghu Angadi at Feb 21, 2008 at 6:39 pm
    Please file a jira (let me know if you need help with that). Did
    subsequent tries to restart succeed?

    Thanks,
    Raghu.

  • Robert Chansler at Feb 21, 2008 at 6:44 pm
    Thanks for helping the gentle user!

    On 21 02 08 10:38, "Raghu Angadi" wrote:


    Please file a jira (let me know if need help with that). Did subsequent
    tries to restart succeed?

    Thanks,
    Raghu.


  • André Martin at Feb 22, 2008 at 2:00 pm
    Hi Raghu,
    done: https://issues.apache.org/jira/browse/HADOOP-2873
    Subsequent tries did not succeed - so it looks like I need to re-format
    the cluster :-(

    Cu on the 'net,
    Bye - bye,

    <<<<< André <<<< >>>> èrbnA >>>>>

    Raghu Angadi wrote:
    Please file a jira (let me know if need help with that). Did
    subsequent tries to restart succeed?

    Thanks,
    Raghu.


  • Raghu Angadi at Feb 22, 2008 at 4:50 pm

    André Martin wrote:
    Hi Raghu,
    done: https://issues.apache.org/jira/browse/HADOOP-2873
    Subsequent tries did not succeed - so it looks like I need to re-format
    the cluster :-(

    Please back up the log files and the name node image files if you can
    before re-formatting.

    Raghu.
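
    A minimal backup sketch for the step Raghu suggests, assuming the
    0.16-era default layout where dfs.name.dir is ${hadoop.tmp.dir}/dfs/name
    (i.e. /tmp/hadoop-$USER/dfs/name); the NAME_DIR, LOG_DIR, and BACKUP
    defaults below are assumptions to replace with your configured values:

```shell
# Back up NameNode metadata (fsimage + edits) and logs before re-formatting.
# NAME_DIR / LOG_DIR defaults are assumptions -- use your configured values.
NAME_DIR="${NAME_DIR:-/tmp/hadoop-$USER/dfs/name}"
LOG_DIR="${LOG_DIR:-$HADOOP_HOME/logs}"
BACKUP="${BACKUP:-$HOME/namenode-backup-$(date +%Y%m%d)}"
mkdir -p "$BACKUP"
# The image and edit log live under $NAME_DIR/current
[ -d "$NAME_DIR" ] && cp -a "$NAME_DIR" "$BACKUP/name"
[ -d "$LOG_DIR" ] && cp -a "$LOG_DIR" "$BACKUP/logs"
echo "backup written to $BACKUP"
```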
  • Raghu Angadi at Feb 22, 2008 at 5:23 pm
    From the jira: (for users similarly affected):
    ----
    Andre,

    as a temporary hack, you can just comment out FSImage.java:749 and your
    restart should work, since these are the last entries read from the
    image file.
    ----

    Raghu.

  • Konstantin Shvachko at Feb 22, 2008 at 7:39 pm
    André,
    You can try to rollback.
    You did use upgrade when you switched to the new trunk, right?
    --Konstantin
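
    The rollback Konstantin mentions only works if the cluster was brought up
    with -upgrade, so that a 'previous' checkpoint still exists under
    dfs.name.dir. A sketch of the procedure, with command names per the
    0.16-era HDFS user guide and $HADOOP_HOME assumed to point at the install:

```shell
# Roll HDFS back to the pre-upgrade image (only possible while the upgrade
# has not been finalized). Run on the NameNode host.
cd "$HADOOP_HOME"
bin/stop-dfs.sh              # stop the NameNode and DataNodes first
bin/start-dfs.sh -rollback   # restart the cluster from the 'previous' state
```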

  • Steve Sapovits at Feb 22, 2008 at 9:44 pm
    What are the situations that make reformatting necessary? In testing, we
    seem to hit a lot of cases where we have to reformat. We're wondering how
    much of a real production issue this is.

    --
    Steve Sapovits
    Invite Media - http://www.invitemedia.com
    ssapovits@invitemedia.com
  • dhruba Borthakur at Feb 22, 2008 at 10:04 pm
    Reformatting should never be necessary if you are using a released version
    of Hadoop. HADOOP-2873 refers to a bug that was introduced into trunk
    (not in any released version).

    Thanks,
    Dhruba


  • Steve Sapovits at Feb 22, 2008 at 10:06 pm

    dhruba Borthakur wrote:

    Reformatting should never be necessary if you are using a released version
    of Hadoop. HADOOP-2873 refers to a bug that was introduced into trunk
    (not in any released version).

    Interesting. We're running only released versions. We have cases where the
    name node won't come up unless we reformat. We know in some cases that's
    been due to boxes going down while it's running. We haven't looked too
    deeply, so maybe something else is going on.

    --
    Steve Sapovits
    Invite Media - http://www.invitemedia.com
    ssapovits@invitemedia.com
  • Raghu Angadi at Feb 22, 2008 at 10:18 pm
    Please report such problems if you think it was because of HDFS, as
    opposed to some hardware or disk failures.

    Raghu.

  • Steve Sapovits at Feb 22, 2008 at 10:34 pm

    Raghu Angadi wrote:

    Please report such problems if you think it was because of HDFS, as
    opposed to some hardware or disk failures.

    Will do. I suspect it's something else. I'm testing on a notebook in
    pseudo-distributed mode (per the quick start guide). My IP address changes
    when I take that box between home and work, so that could be it -- even
    though I'm running everything on localhost, I've seen other issues when my
    hostname can't be resolved properly. Also, with everything in /tmp by
    default, shutdowns of that box may be removing files.

    --
    Steve Sapovits
    Invite Media - http://www.invitemedia.com
    ssapovits@invitemedia.com
  • dhruba Borthakur at Feb 22, 2008 at 11:14 pm
    If your file system metadata is in /tmp, then you are likely to see
    these kinds of problems. It would be nice if you could move your
    metadata files out of /tmp. If you still see the problem, can you
    please send us the logs from the log directory?

    Thanks a bunch,
    Dhruba
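
    A hedged sketch of the fix Dhruba suggests: point the metadata directories
    at locations outside /tmp via hadoop-site.xml. The property names
    dfs.name.dir and dfs.data.dir match the 0.16-era configuration; the
    /var/hadoop paths and the $CONF location are illustrative assumptions:

```shell
# Pin HDFS metadata and block storage outside /tmp so OS cleanup of /tmp
# cannot delete them. Paths below are examples, not recommendations.
CONF="${CONF:-conf/hadoop-site.xml}"
mkdir -p "$(dirname "$CONF")"
cat > "$CONF" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>dfs.name.dir</name>
    <value>/var/hadoop/dfs/name</value>  <!-- NameNode image + edit log -->
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/var/hadoop/dfs/data</value>  <!-- DataNode block storage -->
  </property>
</configuration>
EOF
echo "wrote $CONF"
```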


  • André Martin at Feb 25, 2008 at 6:25 pm
    Hi everyone,
    I applied the patch provided at
    https://issues.apache.org/jira/browse/HADOOP-2873 and my
    namenode/cluster is up again without formatting it :-) Thanks for the
    quick help!
    Reading and writing to the cluster seem to work fine except for MapRed
    jobs: all counters say 0 even though I can access the files (mentioned
    in the MapTask splits) properly through the WebUI.
    I will try to dig through the logs and let you know if I can figure
    something out...
    Also, the namenode still says: "Upgrade for version -13 has been
    completed. Upgrade is not finalized." even 15 hours after launching it :-/

    Cu on the 'net,
    Bye - bye,

    <<<<< André <<<< >>>> èrbnA >>>>>

    Konstantin Shvachko wrote:
    André,
    You can try to rollback.
    You did use upgrade when you switched to the new trunk, right?
    --Konstantin
  • Konstantin Shvachko at Mar 3, 2008 at 7:31 pm

    Also, the namenode still says: "Upgrade for version -13 has been
    completed. Upgrade is not finalized." even 15 hours after launching it :-/
    You can -finalizeUpgrade if you don't need the previous version anymore.

    http://hadoop.apache.org/core/docs/current/hdfs_user_guide.html#Upgrade+and+Rollback
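
    For reference, a sketch of the finalization step per the linked user
    guide; note that finalizing deletes the pre-upgrade 'previous' checkpoint,
    so rollback is no longer possible afterwards ($HADOOP_HOME assumed):

```shell
# Finalize the upgrade once you're sure you won't need to roll back.
cd "$HADOOP_HOME"
bin/hadoop dfsadmin -upgradeProgress status   # confirm the upgrade state
bin/hadoop dfsadmin -finalizeUpgrade          # discard the 'previous' version
```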
  • André Martin at Mar 4, 2008 at 1:31 pm
    OK, that makes sense - thanks!

    Cu on the 'net,
    Bye - bye,

    <<<<< André <<<< >>>> èrbnA >>>>>



Discussion Overview
group: common-user @
categories: hadoop
posted: Feb 21, '08 at 1:29p
active: Mar 4, '08 at 1:31p
posts: 16
users: 6
website: hadoop.apache.org...
irc: #hadoop

People

Translate

site design / logo © 2022 Grokbase