FAQ
I am currently stuck with a Hadoop namenode that won't start.

When I type "start-all.sh", everything prints out fine.
But when I type "jps", only JobTracker is activated.

When this happens, I usually format the namenode.

But the problem is that there are 500 GB of data in HDFS.
So I really want to save the data.

Can anyone help, please?


  • Harsh J at Aug 5, 2010 at 4:14 am
    Could you check the NameNode/SecondaryNameNode logs and try to find
    the exact issue? Post the errors (if any) they contain here, so we can try
    to help you better.
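    The logs normally live under ${HADOOP_HOME}/logs (or wherever
    HADOOP_LOG_DIR points on your install), so something along these lines
    should show the last failure:

    tail -n 100 ${HADOOP_HOME}/logs/hadoop-*-namenode-*.log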

    --
    Harsh J
    www.harshj.com
  • Edward choi at Aug 5, 2010 at 6:08 am
    I've fixed the problem.

    The reason the namenode wouldn't start was that I accidentally started the
    cluster with the root account.
    This somehow changed the ownership of some Hadoop-related files (e.g. the log
    files and hadoop.tmp.dir/dfs/name/current/edits) from hadoop:hadoop to
    root:root.
    After I fixed the ownership issue, everything went fine.
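    For anyone who runs into the same thing, the fix in my case was roughly
    the commands below (the exact paths depend on your HADOOP_LOG_DIR and
    hadoop.tmp.dir settings, so adjust them to your own layout):

    chown -R hadoop:hadoop ${HADOOP_HOME}/logs
    chown -R hadoop:hadoop /path/to/hadoop.tmp.dir/dfs

    and then start the cluster again as the hadoop user with start-all.sh.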
    Thanks for the concern.

  • Raj V at Aug 5, 2010 at 3:34 pm
    I run a 512-node Hadoop cluster. Yesterday I moved 30 GB of compressed data
    from an NFS-mounted partition by running the following on the namenode:

    hadoop fs -copyFromLocal /mnt/data/data1 /mnt/data/data2 mnt/data/data3
    hdfs:/data

    When the job completed, the local disk on the namenode was 40% full (most of
    it used by the dfs directories) while the others had 1% disk utilization.

    Just to see if there was an issue, I deleted the hdfs:/data directory and
    restarted the move from a datanode.

    Once again, the disk space on that datanode was substantially over-utilized.

    I would have assumed that the disk space would be more or less uniformly
    consumed on all the data nodes.

    Is there a reason why one disk would be over-utilized?

    Do I have to run the balancer every time I copy data?

    Am I missing something?

    Raj
  • Gang Luo at Aug 5, 2010 at 4:37 pm
    Hi all,
    to create a RecordReader with the new API, we need a TaskAttemptContext
    object, which suggests that a RecordReader should only be created for a split
    that has already been assigned a task ID. However, I want to do centralized
    sampling and create record readers on some splits before the job is
    submitted. What I am doing is creating a dummy TaskAttemptContext and using
    it to create the record readers, but I am not sure whether that has
    side effects. Is there a better way to do this? Why are we not supposed to
    create record readers centrally, as the new API seems to indicate?

    Thanks,
    -Gang
  • Dmitry Pushkarev at Aug 5, 2010 at 6:02 pm
    When you copy files from a machine that runs a local datanode, the first
    replica of every block will end up on that node.
    Just stop the datanode on the node from which you copy files, and the blocks
    will end up on random nodes.

    Also, don't run a datanode on the same machine as the namenode.
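    If a bulk copy has already left the cluster lopsided, you don't need to
    reload the data - the balancer will move blocks around for you, e.g.
    (the threshold is a percentage of disk usage, pick one that suits you):

    bin/start-balancer.sh -threshold 5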

  • Raj V at Aug 5, 2010 at 6:29 pm
    Thank you. I realized that I was running the datanode on the namenode and
    stopped it, but did not know that the first copy went to the local node.

    Raj




  • Steve Loughran at Aug 9, 2010 at 9:38 am

    It's a placement decision that makes sense for code running as MR jobs,
    ensuring that the output of work goes to the local machine and not
    somewhere random, but on big imports like yours you get penalised.

    Some datacentres have one or two IO nodes in the cluster that aren't
    running HDFS or task trackers, but let you get at the data at full
    datacentre rates, just to help with these kinds of problems.
    Otherwise, if you can implement your import as a MapReduce job, Hadoop
    can do the work for you.
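    For example, if the NFS mount is visible on every node, a distcp along
    these lines will spread both the copy work and the resulting blocks
    across the cluster (the namenode host/port and paths here are only
    placeholders for your own fs.default.name and directories):

    hadoop distcp file:///mnt/data/data1 hdfs://namenode:9000/data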

    -steve
