All,

I'm setting up my first full Hadoop cluster. I did the single-node Cygwin setup
first and everything works there; my problems are with the cluster.

The cluster is five nodes of matched hardware running Ubuntu 8.04. I
believe I have SSH working properly. The master node is named hbase1, but
I'm not doing anything with HBase.

I run start-dfs.sh; jps shows NameNode running, and the logs are free of
errors. The datanodes, however, keep complaining: "Retrying
connect to server: hbase1"

I run start-mapred.sh; jps shows NameNode and JobTracker running.

The namenode log says, "jobtracker.info could only be replicated to 0 nodes,
instead of 1".

The jobtracker log says two things of significance:

1. "It might be because the JobTracker failed to read/write system files
(hdfs://hbase1:30000/hdfs/mapred/system/jobtracker.info /
hdfs://hbase1:30000/hdfs/mapred/system/jobtracker.info.recover) or the
system file hdfs://hbase1:30000/hdfs/mapred/system/jobtracker.info is
missing!"

How can this be? If I run bin/hadoop fs -lsr / , jobtracker.info shows up.

2. "Problem binding to hbase1/127.0.1.1:30001 : Address already in use"

How can this be in use by anything other than the JobTracker? There is nothing
else on the machine that could be conflicting, and I have also verified that
no previous Hadoop processes are hanging around.
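For reference, here is how I checked that nothing is holding the port (a
sketch; 30001 is the JobTracker port from my mapred-site.xml below, and the
-p flag needs root to show other users' processes):

```shell
# Check whether anything is listening on the JobTracker port (30001).
# -t TCP, -l listening sockets, -n numeric addresses, -p owning PID/program.
netstat -tlnp 2>/dev/null | grep ':30001 ' \
  || echo "nothing listening on 30001"
```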

I have included my config files just in case.

Many thanks for any help.

****************
core-site.xml

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://hbase1:30000</value>
  </property>
</configuration>

*****************
hdfs-site.xml

<configuration>
  <property>
    <name>dfs.name.dir</name>
    <value>/hdfs/name1</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/hdfs/data1</value>
  </property>
  <property>
    <name>fs.checkpoint.dir</name>
    <value>/hdfs/check1</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>5</value>
  </property>
</configuration>

***********************
mapred-site.xml

<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>hbase1:30001</value>
  </property>
  <property>
    <name>mapred.system.dir</name>
    <value>/hdfs/mapred/system</value>
  </property>
  <property>
    <name>mapred.local.dir</name>
    <value>/hdfs/mapred/local</value>
  </property>
  <property>
    <name>mapred.tasktracker.map.tasks</name>
    <value>50</value>
  </property>
  <property>
    <name>mapred.tasktracker.reduce.tasks</name>
    <value>35</value>
  </property>
</configuration>
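I should note I'm not certain the mapred.tasktracker.* property names above
are right; the documented per-TaskTracker slot properties in 0.20 appear to be
the .maximum variants. A sketch of that form (the values here are illustrative,
not a recommendation):

```xml
<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>2</value>
</property>
<property>
  <name>mapred.tasktracker.reduce.tasks.maximum</name>
  <value>2</value>
</property>
```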


--
View this message in context: http://www.nabble.com/hadoop-0.20.0-jobtracker.info-could-only-be-replicated-to-0-nodes-tp25373475p25373475.html
Sent from the Hadoop core-user mailing list archive at Nabble.com.


  • Chandraprakash Bhagtani at Sep 10, 2009 at 2:11 pm
    You can try running the JobTracker on some other port; this port might be
    in use.

    --
    Thanks & Regards,
    Chandra Prakash Bhagtani,
    On Thu, Sep 10, 2009 at 2:58 AM, gcr44 wrote:


  • Gcr44 at Sep 10, 2009 at 2:47 pm
    Thanks for the response.

    I have already tried moving the JobTracker to several different ports,
    always with the same result.


    Chandraprakash Bhagtani wrote:
    You can try running the JobTracker on some other port; this port might be
    in use.

  • Steve Loughran at Sep 10, 2009 at 3:48 pm

    gcr44 wrote:
    Thanks for the response.

    I have already tried moving the JobTracker to several different ports,
    always with the same result.


    Don't worry about the JobTracker until you have HDFS (that is, the namenode
    and datanodes) up and running. Do you have any datanodes up? Complaints
    about not enough replication and missing files mean the filesystem isn't
    live yet.
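For example, `bin/hadoop dfsadmin -report` tells you how many datanodes the
namenode can see. A sketch of pulling the live count out of it (the sample
line below is a stand-in for the real report, whose exact wording varies by
version):

```shell
# Stand-in for one line of the output of: bin/hadoop dfsadmin -report
report="Datanodes available: 0 (5 total, 5 dead)"
# Extract the number of live datanodes; 0 means HDFS is not usable yet,
# and the MapReduce daemons will fail exactly as described above.
live=$(printf '%s\n' "$report" | sed -n 's/^Datanodes available: \([0-9]*\).*/\1/p')
echo "live datanodes: $live"   # prints: live datanodes: 0
```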
  • Brien colwell at Sep 10, 2009 at 4:01 pm
    Just an idea: we've had trouble with Hadoop using internal instead of
    external addresses on Ubuntu. The datanodes can't connect to the
    namenode if it's listening on an internal address. On the namenode, can
    you run 'netstat -na'? What address is the namenode daemon bound to?
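Concretely, something like this (a sketch; hbase1 is the master's hostname
from this thread, and the 192.168.1.10 address in the comment is purely
illustrative). Stock Ubuntu maps the machine's own hostname to 127.0.1.1 in
/etc/hosts, so a daemon that binds by hostname ends up on loopback and remote
nodes can never reach it:

```shell
# What does the master's hostname resolve to on this box?
getent hosts hbase1 || echo "hbase1 does not resolve here"
# The usual Ubuntu culprit: the hostname mapped to loopback in /etc/hosts.
grep '127\.0\.1\.1' /etc/hosts || echo "no 127.0.1.1 entry"
# Fix (by hand): map the hostname to the real NIC address instead, e.g.
#   192.168.1.10   hbase1
# then restart the daemons so they bind the external address.
```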




Discussion Overview
group: common-user
categories: hadoop
posted: Sep 9, '09 at 9:29p
active: Sep 10, '09 at 4:01p
posts: 5
users: 4
website: hadoop.apache.org...
irc: #hadoop
