FAQ
How should I create a new Job instance in 0.21? It looks like
Job(Configuration conf, String jobName) has been deprecated, and
Job(Cluster cluster) is the new way, but I'm unsure how to get a
handle to the current cluster. Can someone advise? Thanks!


  • Owen O'Malley at Aug 29, 2010 at 11:52 pm

    On Sun, Aug 29, 2010 at 4:39 PM, Mark wrote:
    How should I create a new Job instance in 0.21? It looks like
    Job(Configuration conf, String jobName) has been deprecated.
    Go ahead and use that method. I have a jira open to undeprecate it.

    -- Owen
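
    For reference, a minimal sketch of both forms under the 0.21 API,
    following Owen's advice that the deprecated constructor is still safe
    to use (the Cluster-based alternative is shown in a comment; verify
    both against your build):

        import org.apache.hadoop.conf.Configuration;
        import org.apache.hadoop.mapreduce.Job;

        public class JobSetup {
          public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();

            // Deprecated in 0.21, but still functional and slated to be
            // un-deprecated per the JIRA Owen mentions.
            @SuppressWarnings("deprecation")
            Job job = new Job(conf, "my-job");

            // The new-style form from the question would look roughly like:
            //   Job job = new Job(new Cluster(conf));
            // (org.apache.hadoop.mapreduce.Cluster takes a Configuration
            // in its constructor in the 0.21 API.)

            job.setJarByClass(JobSetup.class);
            // ...set mapper, reducer, and input/output paths as usual...
            // job.waitForCompletion(true);
          }
        }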
  • Gang Luo at Aug 30, 2010 at 2:50 am
    Hi all,
    I am trying to configure and start a Hadoop cluster on EC2 and have run
    into some problems.

    1. Can I share the Hadoop code and its configuration across nodes? Say I
    have a distributed file system running in the cluster and all the nodes
    can see the Hadoop code and conf there, so every node runs from the same
    copy of the code and conf. Is that possible?

    2. If all the nodes can share the Hadoop code and conf, does that mean I
    can launch Hadoop (bin/start-dfs.sh, bin/start-mapred.sh) from any node
    (even a slave node)?

    3. I think I specified the master and slaves correctly, but when I
    launch Hadoop from the master node, no tasktracker or datanode is
    launched on the slave nodes. The log on the slave nodes says:

    ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException:
    Incompatible namespaceIDs in /mnt/hadoop/dfs/data: namenode namespaceID =
    1048149291; datanode namespaceID = 313740560

    What is the problem?

    Thanks,
    -Gang
  • Xiujin yang at Aug 30, 2010 at 3:44 am

    On Mon, 30 Aug 2010, Gang Luo wrote:

    Hi all,
    I am trying to configure and start a Hadoop cluster on EC2 and have run
    into some problems.

    1. Can I share the Hadoop code and its configuration across nodes? Say I
    have a distributed file system running in the cluster and all the nodes
    can see the Hadoop code and conf there, so every node runs from the same
    copy of the code and conf. Is that possible?

    Use rsync to keep the code and conf in sync across the nodes.

    2. If all the nodes can share the Hadoop code and conf, does that mean I
    can launch Hadoop (bin/start-dfs.sh, bin/start-mapred.sh) from any node
    (even a slave node)?

    Just try it and you will see.

    3. I think I specified the master and slaves correctly, but when I
    launch Hadoop from the master node, no tasktracker or datanode is
    launched on the slave nodes. The log on the slave nodes says:

    ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException:
    Incompatible namespaceIDs in /mnt/hadoop/dfs/data: namenode namespaceID =
    1048149291; datanode namespaceID = 313740560

    What is the problem?

    This usually means the namenode was reformatted after the datanodes were
    first initialized. If the HDFS data is expendable, stop each datanode,
    delete the contents of its data directory (dfs.data.dir, here
    /mnt/hadoop/dfs/data), and restart it.


  • Xiujin yang at Aug 30, 2010 at 5:38 am
    Hadoop Version: 0.20.2
    Scheduler: Fair scheduler

    I am using the fair scheduler to schedule jobs, but it seems the
    scheduler does not support preemption.

    Does 0.20.2 support preemption?

    I know 0.21.0 will support it:
    https://issues.apache.org/jira/browse/MAPREDUCE-551


    Thank you in advance.


    Best,

    Xiujin Yang
  • Matei Zaharia at Aug 30, 2010 at 5:48 am
    The one in 0.20.2 doesn't support it. However, the Cloudera
    Distribution of Hadoop has backported preemption (and the other fair
    scheduler features from 0.21), so you could try that if you want
    preemption on a 0.20 cluster.

    Matei
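
    For anyone on a build that does include it (0.21 or the CDH backport),
    a sketch of the relevant configuration, assuming the property and
    allocation-file element names from the fair scheduler documentation of
    that era:

        <!-- mapred-site.xml on the JobTracker: enable preemption
             (not honored by the stock 0.20.2 fair scheduler) -->
        <property>
          <name>mapred.fairscheduler.preemption</name>
          <value>true</value>
        </property>

        <!-- allocations (pools) file: how long, in seconds, a pool may sit
             below its minimum or fair share before tasks are preempted -->
        <allocations>
          <pool name="production">
            <minSharePreemptionTimeout>60</minSharePreemptionTimeout>
          </pool>
          <fairSharePreemptionTimeout>600</fairSharePreemptionTimeout>
        </allocations>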

  • Hemanth Yamijala at Aug 31, 2010 at 2:18 am
    Hi,
    On Mon, Aug 30, 2010 at 8:19 AM, Gang Luo wrote:
    Hi all,
    I am trying to configure and start a Hadoop cluster on EC2 and have run
    into some problems.

    1. Can I share the Hadoop code and its configuration across nodes? Say I
    have a distributed file system running in the cluster and all the nodes
    can see the Hadoop code and conf there, so every node runs from the same
    copy of the code and conf. Is that possible?

    If they are on the same path, technically it should be possible.
    However, I am not sure it is advisable. We've tried something like this
    using NFS, and it fails in ways that make debugging extremely hard. In
    short, having local copies on all nodes, installed at the same path, is
    the recommended option.


