Hi John,

But *what* is the problem exactly? Why are you trying to run the
-recover function? That isn't clear. The -recover tool is for a very
specific case of edit log corruption, and will not solve the problem
of an erased directory.

Anyhow, to answer your question: on the NN, go to
/var/run/cloudera-scm-agent/process/, run "ls -ltr *NAMENODE*" to
find the latest dir, cd into it, export HADOOP_CONF_DIR=$PWD, then run
"hadoop namenode -recover" and it will pick up the proper name directory
configs.
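
In other words, roughly this sequence (a sketch only; the exact process
directory name under /var/run/cloudera-scm-agent/process/ will differ on
your host):

    # on the NameNode host (Cloudera Manager agent layout)
    cd /var/run/cloudera-scm-agent/process/
    ls -ltr | grep NAMENODE          # the last entry is the most recent NAMENODE process dir
    cd <latest-NAMENODE-dir>         # placeholder for whatever that last entry is
    export HADOOP_CONF_DIR=$PWD      # use the real hdfs-site.xml instead of the /tmp defaults
    hadoop namenode -recover         # now reads the configured dfs.name.dir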
On Sat, Jul 20, 2013 at 7:40 AM, John Meza wrote:
Yes, the problem is unchanged.
With hadoop namenode -recover it's returning:

13/07/19 16:07:15 INFO hdfs.StateChange: STATE* Safe mode is ON.
Use "hdfs dfsadmin -safemode leave" to turn safe mode off.
13/07/19 16:07:15 WARN common.Storage: Storage directory /tmp/hadoop-linux/dfs/name does not exist
13/07/19 16:07:15 INFO namenode.MetaRecoveryContext: RECOVERY FAILED: caught exception
org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory /tmp/hadoop-linux/dfs/name is in an inconsistent state: storage directory does not exist or is not accessible.
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverStorageDirs(FSImage.java:295)
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:201)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:592)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:435)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:397)
at org.apache.hadoop.hdfs.server.namenode.NameNode.doRecovery(NameNode.java:1064)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1136)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1205)
13/07/19 16:07:15 FATAL namenode.NameNode: Exception in namenode join
org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory /tmp/hadoop-linux/dfs/name is in an inconsistent state: storage directory does not exist or is not accessible.
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverStorageDirs(FSImage.java:295)
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:201)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:592)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:435)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:397)
at org.apache.hadoop.hdfs.server.namenode.NameNode.doRecovery(NameNode.java:1064)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1136)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1205)
13/07/19 16:07:15 INFO util.ExitUtil: Exiting with status 1
13/07/19 16:07:15 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at zip4.esri.com/10.47.102.147
************************************************************/


I'll look for a copy of the dfs.name.dir directory from the secondary
namenode backup and retry "namenode -recover". Does that sound right?

thanks for the quick reply,
John

On Fri, Jul 19, 2013 at 6:19 PM, Harsh J wrote:

Are you facing an issue with the NameNode startup right now? What is
it logging when you try to start it?

If you've deleted dfs.data.dir, then there usually isn't a problem as
you should have replicas elsewhere. The only loss would be some of the
single-replica blocks, if you had single-replica files.
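
Once the NN is back up, you can check whether anything actually went
missing with fsck, e.g. (just a sketch):

    hdfs fsck / -files -blocks           # per-file block report, flags missing replicas
    hdfs fsck / -list-corruptfileblocks  # only files that have lost all replicas of some block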

If you've deleted dfs.name.dir, you need to place back a current/
directory from the secondary namenode backup and boot up.
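
Roughly like this (a sketch only; the real paths come from dfs.name.dir and
fs.checkpoint.dir in your hdfs-site.xml, the ones below are made up):

    # on the secondary namenode: locate the latest checkpoint
    ls -l /data/dfs/namesecondary/current
    # copy it into the NN's dfs.name.dir on zip4
    scp -r /data/dfs/namesecondary/current zip4:/data/dfs/nn/
    # on the NN: make sure the HDFS user owns it, then start the NameNode (via CM if managed)
    chown -R hdfs:hdfs /data/dfs/nn/current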

The "namenode -recover" is to only be used for editlog corruptions,
which am not sure you're running into, based on your details.
On Sat, Jul 20, 2013 at 5:27 AM, John Meza wrote:
While this is not an scm-specific issue, I thought I'd ask here first
since I installed with scm. I can move the question to
user@hadoop.apache.org if it's appropriate.
--- info ----
- 10-node cluster used for testing
- version = 2.0.0-cdh4.2.0
- namenode is also a datanode (that machine is zip4)
-----
I had IP address issues with the nodes. I removed a problem node from the
cluster, planning to:
- do some cleanup on that node
- then add it back
- then rebalance the cluster
But I deleted the dfs.data.dir on the namenode/datanode.
--> it's been a long day.

CM shows the zip4 datanode as fine!? But the namenode on zip4 is stopped.
I tried:
hadoop namenode -recover
- this generated:
hdfs.StateChange: STATE* Safe mode is ON.
So I tried "hdfs dfsadmin -safemode leave" to turn safe mode off, as
suggested.
no-go:
... to zip4:8020 failed on connection exception
- that failed with:
WARN common.Storage: Storage directory /tmp/hadoop-linux/dfs/name does
not exist
- which caused:
InconsistentFSStateException: Directory /tmp/hadoop-linux/dfs/name is
in an inconsistent state: storage directory does not exist or....

This is a test cluster, but it has a lot of good test data. I would
prefer not to lose the data, but if I do, it's not the end of the world.
Any suggestions?
thanks
John






--
Harsh J


--
Harsh J
