FAQ
I have a Hadoop 0.19.0 cluster of 3 machines (storm, mystique, batman).
Problems seemed to be occurring on mystique (I was noticing errors with
tasks that executed on it), so I decided to remove mystique from the
cluster. I did so by calling stop-mapred.sh (I'm using S3 Native, not
HDFS) and removing mystique from the $HADOOP_HOME/conf/slaves file on
storm and batman. I then called start-mapred.sh and verified (via its
output) that tasktrackers were started only on batman and storm. But when
I started my MapReduce program and viewed the task tracker machine list
in the web interface, not only was mystique listed as one of the task
trackers, a task had actually been assigned to it. How can I keep a
machine from being included in a cluster?
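
In other words, roughly the following, assuming the standard $HADOOP_HOME
layout:

    # on the JobTracker node (storm)
    $HADOOP_HOME/bin/stop-mapred.sh
    # delete the "mystique" line from $HADOOP_HOME/conf/slaves on storm and batman
    $HADOOP_HOME/bin/start-mapred.sh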

Any help is appreciated.

Thanks,
John

  • Amandeep Khurana at Feb 17, 2009 at 10:21 pm
    You have to decommission the node. Look at
    http://wiki.apache.org/hadoop/FAQ#17
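
    Roughly, that FAQ entry boils down to naming an exclude file in your
    conf and asking the namenode to re-read it; a sketch (the path below
    is illustrative):

        <property>
          <name>dfs.hosts.exclude</name>
          <value>/opt/hadoop/conf/exclude</value>
        </property>

        # list the hosts to decommission in that file (one per line), then:
        bin/hadoop dfsadmin -refreshNodes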

    Amandeep


    Amandeep Khurana
    Computer Science Graduate Student
    University of California, Santa Cruz

  • S D at Feb 18, 2009 at 1:32 am
    Thanks for your response. To clarify, I'm using S3 Native instead of
    HDFS, so I'm not even calling start-dfs.sh since I'm not running a
    distributed filesystem. Given that, is decommissioning nodes applicable?
    When I ran 'hadoop dfsadmin -refreshNodes' I received the following
    response:

    FileSystem is s3n://<bucketname>

    Thanks,
    John
  • Tom White at Feb 18, 2009 at 1:42 am
    The decommission process is for datanodes, which you are not running.
    Have a look at the mapred.hosts.exclude property for how to exclude
    tasktrackers.
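
    A minimal sketch, assuming the usual conf layout (the exclude file name
    is illustrative): point mapred.hosts.exclude at a file listing the hosts
    to ban, then restart the MapReduce daemons so the JobTracker picks it up.

        <!-- hadoop-site.xml -->
        <property>
          <name>mapred.hosts.exclude</name>
          <value>/opt/hadoop/conf/mapred.exclude</value>
        </property>

        # mapred.exclude contains one hostname per line, e.g.
        mystique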

    Tom
  • S D at Feb 18, 2009 at 2:37 am
    Thanks for this. I've set that property in my hadoop-site.xml file.
    Personally, the property seems a bit redundant given the slaves file; a
    better design would use either the file specified by mapred.hosts or the
    contents of the slaves file, but not both. In my case I set mapred.hosts
    to point to the slaves file.
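
    For reference, the relevant hadoop-site.xml entries ended up looking
    something like this (the exclude path is illustrative):

        <property>
          <name>mapred.hosts</name>
          <value>/opt/hadoop/conf/slaves</value>
        </property>
        <property>
          <name>mapred.hosts.exclude</name>
          <value>/opt/hadoop/conf/exclude</value>
        </property>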

    ...but back to the first solution given in this thread: it seems that
    running 'hadoop dfsadmin -refreshNodes' did the trick. After running that
    command and then restarting the Map/Reduce framework (stop-mapred.sh
    followed by start-mapred.sh), the nodes were updated successfully. Prior
    attempts to stop and start the Map/Reduce framework alone hadn't worked.

    Thanks,
    John
  • Arv Mistry at Feb 18, 2009 at 3:06 am
    I am using Hadoop 0.18.3. I have a single datanode and it appears to be up and running fine; I'm able to read/write data to it.

    However, when I try to spawn a map/reduce job it fails with "Could not obtain block: blk_3263745172951227264_1155 file =/opt/kindsight/hadoop/data/mapred/system/job_200902171547_0001/job.xml"

    I noticed the following exception in the logs. I have not specified a mapred.fairscheduler.allocation.file, but I have other machines running the same configuration and they seem to work. If I do have to specify that file, how do I do it and what's the format of the file?

    Any help would be appreciated.

    2009-02-17 15:47:03,304 WARN org.apache.hadoop.mapred.PoolManager: No mapred.fairscheduler.allocation.file given in jobconf - the fair scheduler will not use any queues.
    2009-02-17 15:47:03,319 INFO org.apache.hadoop.mapred.FairScheduler: Successfully configured FairScheduler
    2009-02-17 15:47:17,338 ERROR org.apache.hadoop.mapred.PoolManager: Failed to reload allocations file - will use existing allocations.
    java.lang.NullPointerException
    at java.io.File.<init>(File.java)
    at org.apache.hadoop.mapred.PoolManager.reloadAllocsIfNecessary(PoolManager.java:116)
    at org.apache.hadoop.mapred.FairScheduler.assignTasks(FairScheduler.java:226)
    at org.apache.hadoop.mapred.JobTracker.heartbeat(JobTracker.java:1288)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:452)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:888)
    2009-02-17 15:52:29,611 INFO org.apache.hadoop.dfs.DFSClient: Could not obtain block blk_3263745172951227264_1155 from any node: java.io.IOException: No live nodes contain current block
    2009-02-17 15:52:32,617 INFO org.apache.hadoop.dfs.DFSClient: Could not obtain block blk_3263745172951227264_1155 from any node: java.io.IOException: No live nodes contain current block
    2009-02-17 15:52:35,626 INFO org.apache.hadoop.dfs.DFSClient: Could not obtain block blk_3263745172951227264_1155 from any node: java.io.IOException: No live nodes contain current block
    2009-02-17 15:52:38,631 WARN org.apache.hadoop.dfs.DFSClient: DFS Read: java.io.IOException: Could not obtain block: blk_3263745172951227264_1155 file =/opt/kindsight/hadoop/data/mapred/system/job_200902171547_0001/job.xml
    at org.apache.hadoop.dfs.DFSClient$DFSInputStream.chooseDataNode(DFSClient.java:1470)
    at org.apache.hadoop.dfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:1320)
    at org.apache.hadoop.dfs.DFSClient$DFSInputStream.read(DFSClient.java:1425)
    at java.io.DataInputStream.read(DataInputStream.java:83)
    at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:47)

    Cheers Arv
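
    For what it's worth, the contrib fair scheduler is normally pointed at its
    allocations file via mapred.fairscheduler.allocation.file in hadoop-site.xml,
    and that file is a small XML document describing pools. A minimal sketch --
    the path, pool name, and numbers below are illustrative, not taken from this
    thread.

    In hadoop-site.xml:

        <property>
          <name>mapred.fairscheduler.allocation.file</name>
          <value>/opt/hadoop/conf/fair-scheduler.xml</value>
        </property>

    and the allocations file itself:

        <?xml version="1.0"?>
        <allocations>
          <pool name="default">
            <minMaps>2</minMaps>
            <minReduces>2</minReduces>
          </pool>
          <userMaxJobsDefault>5</userMaxJobsDefault>
        </allocations>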
