FAQ
Our Hadoop cluster went down last night when the namenode ran out of hard
drive space. Trying to restart it fails with the exception below.

Since I don't really care that much about losing a day's worth of data or so,
I'm fine with blowing away the edits file if that's what it takes (we don't
have a secondary namenode to restore from). I tried removing the edits file
from the namenode directory, but then it complained about not finding an
edits file. I touched a blank edits file and got the exact same exception.

Any thoughts? I googled around a bit, but to no avail.

-mike


2010-03-04 10:50:44,768 INFO org.apache.hadoop.ipc.metrics.RpcMetrics:
Initializing RPC Metrics with hostName=NameNode, port=54310
2010-03-04 10:50:44,772 INFO
org.apache.hadoop.hdfs.server.namenode.NameNode: Namenode up at:
carr.projectlounge.com/10.0.16.91:54310
2010-03-04 10:50:44,773 INFO org.apache.hadoop.metrics.jvm.JvmMetrics:
Initializing JVM Metrics with processName=NameNode, sessionId=null
2010-03-04 10:50:44,774 INFO
org.apache.hadoop.hdfs.server.namenode.metrics.NameNodeMetrics: Initializing
NameNodeMeterics using context
object:org.apache.hadoop.metrics.spi.NullContext
2010-03-04 10:50:44,816 INFO
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: fsOwner=pubget,pubget
2010-03-04 10:50:44,817 INFO
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: supergroup=supergroup
2010-03-04 10:50:44,817 INFO
org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
isPermissionEnabled=true
2010-03-04 10:50:44,823 INFO
org.apache.hadoop.hdfs.server.namenode.metrics.FSNamesystemMetrics:
Initializing FSNamesystemMetrics using context
object:org.apache.hadoop.metrics.spi.NullContext
2010-03-04 10:50:44,825 INFO
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Registered
FSNamesystemStatusMBean
2010-03-04 10:50:44,849 INFO org.apache.hadoop.hdfs.server.common.Storage:
Number of files = 2687
2010-03-04 10:50:45,092 INFO org.apache.hadoop.hdfs.server.common.Storage:
Number of files under construction = 7
2010-03-04 10:50:45,095 INFO org.apache.hadoop.hdfs.server.common.Storage:
Image file of size 347821 loaded in 0 seconds.
2010-03-04 10:50:45,104 INFO org.apache.hadoop.hdfs.server.common.Storage:
Edits file /mnt/hadoop/name/current/edits of size 4653 edits # 39 loaded in
0 seconds.
2010-03-04 10:50:45,114 ERROR
org.apache.hadoop.hdfs.server.namenode.NameNode:
java.lang.NumberFormatException: For input string: ""
at
java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
at java.lang.Long.parseLong(Long.java:424)
at java.lang.Long.parseLong(Long.java:461)
at
org.apache.hadoop.hdfs.server.namenode.FSEditLog.readLong(FSEditLog.java:1273)
at
org.apache.hadoop.hdfs.server.namenode.FSEditLog.loadFSEdits(FSEditLog.java:670)
at
org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSEdits(FSImage.java:997)
at
org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:812)
at
org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:364)
at
org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:87)
at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:311)
at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.&lt;init&gt;(FSNamesystem.java:292)
at
org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:201)
at org.apache.hadoop.hdfs.server.namenode.NameNode.&lt;init&gt;(NameNode.java:279)
at
org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:956)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:965)

2010-03-04 10:50:45,115 INFO
org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at carr.projectlounge.com/10.0.16.91
************************************************************/


  • Todd Lipcon at Mar 4, 2010 at 5:01 pm
    Hi Mike,

    Was your namenode configured with multiple dfs.name.dir settings?

    If so, can you please reply with "ls -l" from each dfs.name.dir?

    Thanks
    -Todd
    On Thu, Mar 4, 2010 at 8:57 AM, mike anderson wrote:

  • Mike anderson at Mar 4, 2010 at 5:06 pm
    We have a single dfs.name.dir directory, in case it's useful the contents
    are:

    [mike@carr name]$ ls -l
    total 8
    drwxrwxr-x 2 mike mike 4096 Mar 4 11:18 current
    drwxrwxr-x 2 mike mike 4096 Oct 8 16:38 image



    On Thu, Mar 4, 2010 at 12:00 PM, Todd Lipcon wrote:

  • Konstantin Shvachko at Mar 4, 2010 at 6:40 pm
    You've probably got current/edits.new corrupted.
    If it is an empty file you can simply delete it and start the NN.
    There shouldn't be any data loss.
    If it is not empty then you will lose data back to the start of
    the latest checkpoint, which should be recent, I believe.

    --Konstantin
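    Konstantin's check can be sketched as a small shell guard. The
    `clean_edits_new` helper and the `NAME_DIR` default are mine (the path
    comes from the log above), not from the thread; stop the NN and back up
    the whole name directory before touching anything.

```shell
#!/bin/sh
# Sketch of Konstantin's suggestion: delete edits.new only when it is
# empty, so no transactions can be lost. The default path is taken from
# the log in this thread; adjust it for your own cluster.
clean_edits_new() {
    edits_new="$1/current/edits.new"
    if [ ! -e "$edits_new" ]; then
        echo "no edits.new present; nothing to do"
    elif [ ! -s "$edits_new" ]; then
        # Empty file: removing it cannot lose any edits.
        rm "$edits_new"
        echo "removed empty edits.new"
    else
        # Non-empty: deleting it loses edits made since the last checkpoint.
        echo "edits.new has $(wc -c < "$edits_new") bytes; review before deleting"
    fi
}

clean_edits_new "${NAME_DIR:-/mnt/hadoop/name}"
```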

    On 3/4/2010 9:05 AM, mike anderson wrote:
  • Todd Lipcon at Mar 4, 2010 at 6:55 pm
    Sorry, I actually meant ls -l from name.dir/current/

    Having only one dfs.name.dir isn't recommended - after you get your system
    back up and running I would strongly suggest running with at least two,
    preferably with one on a separate server via NFS.

    Thanks
    -Todd
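    Todd's suggestion corresponds to listing multiple comma-separated paths
    for dfs.name.dir in hdfs-site.xml (the 0.20-era property name); the NFS
    mount point below is illustrative, not from the thread.

```xml
<!-- hdfs-site.xml sketch: the NameNode writes its image and edit log to
     every directory listed, so losing one disk (or filling it) no longer
     destroys the only copy of the metadata. -->
<property>
  <name>dfs.name.dir</name>
  <value>/mnt/hadoop/name,/mnt/nfs/hadoop/name</value>
</property>
```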
    On Thu, Mar 4, 2010 at 9:05 AM, mike anderson wrote:



    --
    Todd Lipcon
    Software Engineer, Cloudera
  • Mike anderson at Mar 4, 2010 at 7:38 pm
    Removing edits.new and starting worked, though it didn't seem that
    happy about it. It started up nonetheless, in safe mode, saying that
    "The ratio of reported blocks 0.9948 has not reached the threshold
    0.9990. Safe mode will be turned off automatically." Unfortunately
    this is holding up the restart of HBase.

    About how long does it take to exit safe mode? Is there anything I can
    do to expedite the process?


    On Thu, Mar 4, 2010 at 1:54 PM, Todd Lipcon wrote:

  • Todd Lipcon at Mar 4, 2010 at 7:53 pm
    Hi Mike,

    Since you removed the edits, you restored an earlier version of the
    namesystem. Thus, any files that were deleted since the last checkpoint
    will have come back, but their blocks will have been removed from the
    datanodes. So the NN is complaining because some files have missing
    blocks. That is to say, some of your files are corrupt (i.e. unreadable,
    because the data is gone but the metadata is still there).

    To force it out of safe mode, you can run "hadoop dfsadmin -safemode
    leave". You should also run "hadoop fsck" to determine which files are
    broken, and then probably use its -delete option to remove their
    metadata.

    Thanks
    -Todd
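    Todd's two commands, wrapped in a small sketch. The helper name and the
    missing-CLI guard are my additions; the hadoop invocations are the ones
    from the reply above. Run it on the namenode host, and review the fsck
    report before reaching for -delete.

```shell
#!/bin/sh
# Sketch of Todd's recovery steps on a 0.20-era cluster.
leave_safemode_and_fsck() {
    if ! command -v hadoop >/dev/null 2>&1; then
        echo "hadoop CLI not found; skipping"
        return 1
    fi
    hadoop dfsadmin -safemode leave   # force the NN out of safe mode
    hadoop fsck /                     # report files with missing blocks
    # After reviewing the report, remove the dangling metadata with:
    #   hadoop fsck / -delete
}

leave_safemode_and_fsck || true
```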
    On Thu, Mar 4, 2010 at 11:37 AM, mike anderson wrote:

    Removing edits.new and starting worked, though it didn't seem that
    happy about it. It started up nonetheless, in safe mode. Saying that
    "The ratio of reported blocks 0.9948 has not reached the threshold
    0.9990. Safe mode will be turned off automatically." Unfortunately
    this is holding up the restart of hbase.

    About how long does it take to exit safe mode? is there anything I can
    do to expedite the process?


    On Thu, Mar 4, 2010 at 1:54 PM, Todd Lipcon wrote:

    Sorry, I actually meant ls -l from name.dir/current/

    Having only one dfs.name.dir isn't recommended - after you get your system
    back up and running I would strongly suggest running with at least two,
    preferably with one on a separate server via NFS.

    Thanks
    -Todd

    On Thu, Mar 4, 2010 at 9:05 AM, mike anderson <saidtherobot@gmail.com
    wrote:
    We have a single dfs.name.dir directory, in case it's useful the
    contents
    are:

    [mike@carr name]$ ls -l
    total 8
    drwxrwxr-x 2 mike mike 4096 Mar 4 11:18 current
    drwxrwxr-x 2 mike mike 4096 Oct 8 16:38 image



    On Thu, Mar 4, 2010 at 12:00 PM, Todd Lipcon wrote:

    Hi Mike,

    Was your namenode configured with multiple dfs.name.dir settings?

    If so, can you please reply with "ls -l" from each dfs.name.dir?

    Thanks
    -Todd

    On Thu, Mar 4, 2010 at 8:57 AM, mike anderson <
    saidtherobot@gmail.com
    wrote:
Our hadoop cluster went down last night when the namenode ran out of hard drive space. Trying to restart fails with this exception (see below).

Since I don't really care that much about losing a day's worth of data or so, I'm fine with blowing away the edits file if that's what it takes (we don't have a secondary namenode to restore from). I tried removing the edits file from the namenode directory, but then it complained about not finding an edits file. I touched a blank edits file and got the exact same exception.

    Any thoughts? I googled around a bit, but to no avail.

    -mike


2010-03-04 10:50:44,768 INFO org.apache.hadoop.ipc.metrics.RpcMetrics: Initializing RPC Metrics with hostName=NameNode, port=54310
2010-03-04 10:50:44,772 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: Namenode up at: carr.projectlounge.com/10.0.16.91:54310
2010-03-04 10:50:44,773 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=NameNode, sessionId=null
2010-03-04 10:50:44,774 INFO org.apache.hadoop.hdfs.server.namenode.metrics.NameNodeMetrics: Initializing NameNodeMeterics using context object:org.apache.hadoop.metrics.spi.NullContext
2010-03-04 10:50:44,816 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: fsOwner=pubget,pubget
2010-03-04 10:50:44,817 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: supergroup=supergroup
2010-03-04 10:50:44,817 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: isPermissionEnabled=true
2010-03-04 10:50:44,823 INFO org.apache.hadoop.hdfs.server.namenode.metrics.FSNamesystemMetrics: Initializing FSNamesystemMetrics using context object:org.apache.hadoop.metrics.spi.NullContext
2010-03-04 10:50:44,825 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Registered FSNamesystemStatusMBean
2010-03-04 10:50:44,849 INFO org.apache.hadoop.hdfs.server.common.Storage: Number of files = 2687
2010-03-04 10:50:45,092 INFO org.apache.hadoop.hdfs.server.common.Storage: Number of files under construction = 7
2010-03-04 10:50:45,095 INFO org.apache.hadoop.hdfs.server.common.Storage: Image file of size 347821 loaded in 0 seconds.
2010-03-04 10:50:45,104 INFO org.apache.hadoop.hdfs.server.common.Storage: Edits file /mnt/hadoop/name/current/edits of size 4653 edits # 39 loaded in 0 seconds.
2010-03-04 10:50:45,114 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: java.lang.NumberFormatException: For input string: ""
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
at java.lang.Long.parseLong(Long.java:424)
at java.lang.Long.parseLong(Long.java:461)
at org.apache.hadoop.hdfs.server.namenode.FSEditLog.readLong(FSEditLog.java:1273)
at org.apache.hadoop.hdfs.server.namenode.FSEditLog.loadFSEdits(FSEditLog.java:670)
at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSEdits(FSImage.java:997)
at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:812)
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:364)
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:87)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:311)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:292)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:201)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:279)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:956)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:965)
2010-03-04 10:50:45,115 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at carr.projectlounge.com/10.0.16.91
************************************************************/


    --
    Todd Lipcon
    Software Engineer, Cloudera


  • Mike anderson at Mar 4, 2010 at 8:22 pm
Todd, that did the trick. Thanks to everyone for the quick responses and effective suggestions.

    -Mike

    On Thu, Mar 4, 2010 at 2:50 PM, Todd Lipcon wrote:
    Hi Mike,

    Since you removed the edits, you restored to an earlier version of the
    namesystem. Thus, any files that were deleted since the last checkpoint will
    have come back. But, the blocks will have been removed from the datanodes.
    So, the NN is complaining since there are some files that have missing
blocks. That is to say, some of your files are corrupt (i.e. unreadable, because the data is gone but the metadata is still there).

    In order to force it out of safemode, you can run hadoop dfsadmin -safemode
    leave
    You should also run "hadoop fsck" in order to determine which files are
    broken, and then probably use the -delete option to remove their metadata.

    Thanks
    -Todd
  • Ted Yu at May 19, 2010 at 9:33 am
    We encountered a similar issue with hadoop-0.20.2+228 in QA:

2010-05-19 07:12:19,976 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=NameNode, sessionId=null
2010-05-19 07:12:19,978 INFO org.apache.hadoop.hdfs.server.namenode.metrics.NameNodeMetrics: Initializing NameNodeMeterics using context object:org.apache.hadoop.metrics.spi.NullContext
2010-05-19 07:12:20,041 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: fsOwner=hadoop,hadoop
2010-05-19 07:12:20,041 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: supergroup=supergroup
2010-05-19 07:12:20,041 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: isPermissionEnabled=true
2010-05-19 07:12:20,050 INFO org.apache.hadoop.hdfs.server.namenode.metrics.FSNamesystemMetrics: Initializing FSNamesystemMetrics using context object:org.apache.hadoop.metrics.spi.NullContext
2010-05-19 07:12:20,052 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Registered FSNamesystemStatusMBean
2010-05-19 07:12:20,091 INFO org.apache.hadoop.hdfs.server.common.Storage: Number of files = 1874
2010-05-19 07:12:20,503 INFO org.apache.hadoop.hdfs.server.common.Storage: Number of files under construction = 2
2010-05-19 07:12:20,787 INFO org.apache.hadoop.hdfs.server.common.Storage: Image file of size 259450 loaded in 0 seconds.
2010-05-19 07:12:21,176 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: java.lang.NumberFormatException: For input string: ""
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
at java.lang.Long.parseLong(Long.java:431)
at java.lang.Long.parseLong(Long.java:468)
at org.apache.hadoop.hdfs.server.namenode.FSEditLog.readLong(FSEditLog.java:1273)
at org.apache.hadoop.hdfs.server.namenode.FSEditLog.loadFSEdits(FSEditLog.java:656)
at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSEdits(FSImage.java:999)
at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:812)
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:364)
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:88)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:312)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(NameNode.java:224)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:1004)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1013)

2010-05-19 07:12:21,177 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:

    ------------------
    I don't see edits.new under name.dir/current/

    Please advise what to do next.

    Thanks

Discussion Overview
group: common-user @
categories: hadoop
posted: Mar 4, '10 at 4:57p
active: May 19, '10 at 9:33a
posts: 9
users: 4
website: hadoop.apache.org...
irc: #hadoop
