FAQ
Dear all,

I have configured Hadoop clusters of 2, 3, 5, and 8 nodes several times, but
one question always comes to mind.
Why is it necessary to format the Hadoop NameNode with the
*bin/hadoop namenode -format* command?
What is the reason and logic behind this?

Please explain if someone knows.


Thanks & best Regards,

Adarsh Sharma


  • Harsh J at Mar 10, 2011 at 5:02 am
    Formatting the NameNode initializes the FSNameSystem in the
    dfs.name.dir directories, to prepare for use.

    The format command typically writes a VERSION file that specifies what
    the NamespaceID for this FS instance is, what was its ctime, and what
    is the version (of the file's layout) in use.

    This is helpful in making every NameNode instance unique, among other
    things. DataNode blocks carry the namespace-id information that lets
    them relate blocks to a NameNode (and thereby validate, etc.).

    --
    Harsh J
    www.harshj.com
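
To make the VERSION file described above concrete, here is a minimal sketch that reads one. It assumes the file uses the Java-properties layout (key=value lines, with # comment lines) and that it sits at current/VERSION under a dfs.name.dir directory. The sample content and all the numbers in it are made up for illustration, and parse_version_file is a hypothetical helper, not part of Hadoop.

```python
def parse_version_file(text):
    """Parse Java-properties-style VERSION content into a dict of strings."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and '#' comment lines (such as a timestamp header).
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


# Illustrative content only; every value here is invented.
SAMPLE_VERSION = """\
#example timestamp header
namespaceID=123456789
cTime=0
storageType=NAME_NODE
layoutVersion=-18
"""

info = parse_version_file(SAMPLE_VERSION)
# The three fields mentioned in the reply above.
print(info["namespaceID"], info["cTime"], info["layoutVersion"])
```

On a real cluster you would pass in the text of the actual VERSION file instead of SAMPLE_VERSION.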
  • Adarsh Sharma at Mar 10, 2011 at 5:44 am
    Thanks Harsh. So that is why the "Incompatible namespaceIDs" error
    occurs if we format the NameNode again after loading some data.


    Best Regards,

    Adarsh Sharma




  • Edward Capriolo at Mar 10, 2011 at 10:48 pm

    If you do not tell your NN where to store its data, it stores it under
    /tmp, and your operating system cleans up /tmp.

    The reason for the error you see is that DataNodes do not like to
    suddenly connect to a new NameNode. So, as a safety measure, they do not
    start up until their old data is cleared.
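
A minimal sketch of checking for the /tmp situation described above. It assumes a Hadoop-style configuration file made of property elements with name and value children, and it assumes that an unset dfs.name.dir falls back to a location under /tmp (as far as I recall for releases of that era; please verify against your own hdfs-default.xml). The helper names and the /data/hadoop/name path are hypothetical.

```python
import xml.etree.ElementTree as ET


def find_property(config_xml, name):
    """Return the value of a named <property> in the config XML, or None."""
    root = ET.fromstring(config_xml)
    for prop in root.iter("property"):
        if prop.findtext("name", default="").strip() == name:
            value = prop.findtext("value")
            return value.strip() if value is not None else None
    return None


def name_dir_warning(config_xml):
    """Return a warning string if dfs.name.dir is unset or under /tmp, else None."""
    value = find_property(config_xml, "dfs.name.dir")
    if value is None:
        # Assumption: the built-in default location lives under /tmp.
        return "dfs.name.dir is not set; the default location under /tmp applies"
    # dfs.name.dir may hold a comma-separated list of directories.
    for directory in (d.strip() for d in value.split(",")):
        if directory == "/tmp" or directory.startswith("/tmp/"):
            return "dfs.name.dir points under /tmp: " + directory
    return None


# Hypothetical configuration with an explicit, non-temporary directory.
SAMPLE_CONFIG = """\
<configuration>
  <property>
    <name>dfs.name.dir</name>
    <value>/data/hadoop/name</value>
  </property>
</configuration>
"""

print(name_dir_warning(SAMPLE_CONFIG))
print(name_dir_warning("<configuration/>"))
```

Setting dfs.name.dir explicitly to a persistent directory avoids losing the NameNode metadata when the operating system cleans up /tmp.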
  • Boris Shkolnik at Mar 10, 2011 at 11:24 pm
    On the first run you want the NameNode to initialize its directories
    (where it stores the VERSION file, fsimage, and edits).
    On subsequent formats, you are making sure you have a new, EMPTY file
    system. If you don't format, the NameNode will load the existing fsimage
    and edits.
    There is also the matter of generating a new namespace ID, which is
    matched against the DataNodes' IDs. So if you format the NameNode, you
    need to clean up the data on the DataNodes.

    On the other hand, if you just add DataNodes to a running cluster, you
    don't have to format the NN.

    Boris.
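
The namespace ID matching described above can be sketched as a toy model. This is not Hadoop's actual code: register_datanode and IncompatibleNamespaceError are hypothetical names, and all IDs are invented. It only illustrates the logic that a fresh DataNode adopts the NameNode's ID, while a DataNode holding an ID from before a re-format is refused until its data is cleaned up.

```python
class IncompatibleNamespaceError(Exception):
    """Hypothetical stand-in for the mismatch error a DataNode reports."""


def register_datanode(namenode_id, datanode_id):
    """Return the namespace ID the DataNode holds after registering.

    datanode_id is None for a DataNode with an empty data directory.
    """
    if datanode_id is None:
        # Fresh DataNode: adopt the NameNode's namespace ID.
        return namenode_id
    if datanode_id != namenode_id:
        raise IncompatibleNamespaceError(
            "namenode namespaceID = %d; datanode namespaceID = %d"
            % (namenode_id, datanode_id)
        )
    return datanode_id


# All IDs below are made up.
old_nn_id = 1111
dn_id = register_datanode(old_nn_id, None)  # fresh DataNode adopts 1111

new_nn_id = 2222  # the NameNode was formatted again and got a new ID
try:
    register_datanode(new_nn_id, dn_id)
    outcome = "registered"
except IncompatibleNamespaceError as err:
    outcome = "refused: %s" % err
print(outcome)

# After cleaning up the DataNode's data directory it registers as fresh.
dn_id = register_datanode(new_nn_id, None)
print(dn_id)
```

Adding a new, empty DataNode to a running cluster corresponds to the datanode_id is None case, which is why no NameNode format is needed there.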


Discussion Overview
group: common-user @
categories: hadoop
posted: Mar 10, '11 at 4:23a
active: Mar 10, '11 at 11:24p
posts: 5
users: 4
website: hadoop.apache.org...
irc: #hadoop
