Unable to start hadoop-0.20.2 but able to start hadoop-0.20.203 cluster
Hi Folks,

We asked this question on the common-user Hadoop mailing list, but it has not been resolved for a week.

We are trying to get HBase and Hadoop running on a cluster, using 2 Solaris servers for now (we also tried 1 Linux and 1 Solaris).

Because of the incompatibility issue between HBase and Hadoop, we have to stick with the hadoop-0.20.2-append release.

It was very straightforward to get hadoop-0.20.203 running, but we have been stuck for several days with hadoop-0.20.2, even with the official release rather than the append version.

1. Whenever we try to run start-mapred.sh (hadoop-daemon.sh --config $HADOOP_CONF_DIR start jobtracker), the following errors show up in the namenode and jobtracker logs:

2011-05-26 12:30:29,169 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Not able to place enough replicas, still in need of 1
2011-05-26 12:30:29,175 INFO org.apache.hadoop.ipc.Server: IPC Server handler 4 on 9000, call addBlock(/tmp/hadoop-cfadm/mapred/system/jobtracker.info, DFSClient_2146408809) from 169.193.181.212:55334: error: java.io.IOException: File /tmp/hadoop-cfadm/mapred/system/jobtracker.info could only be replicated to 0 nodes, instead of 1
java.io.IOException: File /tmp/hadoop-cfadm/mapred/system/jobtracker.info could only be replicated to 0 nodes, instead of 1
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1271)
at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:422)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)


2. Also, Configured Capacity is 0, and we cannot put any file into HDFS.

3. On the datanode server there are no errors in the datanode log, but the tasktracker log has the following suspicious entries:
2011-05-25 23:36:10,839 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
2011-05-25 23:36:10,839 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 41904: starting
2011-05-25 23:36:10,852 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 41904: starting
2011-05-25 23:36:10,853 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 41904: starting
2011-05-25 23:36:10,853 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 41904: starting
2011-05-25 23:36:10,853 INFO org.apache.hadoop.ipc.Server: IPC Server handler 3 on 41904: starting
.....
2011-05-25 23:36:10,855 INFO org.apache.hadoop.ipc.Server: IPC Server handler 63 on 41904: starting
2011-05-25 23:36:10,950 INFO org.apache.hadoop.mapred.TaskTracker: TaskTracker up at: localhost/127.0.0.1:41904
2011-05-25 23:36:10,950 INFO org.apache.hadoop.mapred.TaskTracker: Starting tracker tracker_loanps3d:localhost/127.0.0.1:41904


I have tried all the suggestions found so far (roughly the sequence sketched below), including:
1) removing the hadoop-name and hadoop-data folders and reformatting the namenode;
2) cleaning up all temp files/folders under /tmp.

But nothing works.
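
For reference, the clean/reformat cycle between attempts was roughly the sketch below (hadoop-name is assumed to sit next to hadoop-data under our install dir; substitute whatever your dfs.name.dir/dfs.data.dir point to):

# stop all daemons before wiping any state
bin/stop-all.sh
# remove the name/data directories and this user's default tmp dirs
rm -rf /opt/hadoop-install/hadoop-0.20.2/hadoop-name /opt/hadoop-install/hadoop-0.20.2/hadoop-data
rm -rf /tmp/hadoop-cfadm*
# reformat HDFS and bring the daemons back up
bin/hadoop namenode -format
bin/start-dfs.sh
bin/start-mapred.sh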

Your help is greatly appreciated.

Thanks,

RX


  • Harsh J at May 31, 2011 at 2:23 pm
    Xu,

    Please post the output of `hadoop dfsadmin -report` and attach the
    tail of a started DN's log?
    On Tue, May 31, 2011 at 7:44 PM, Xu, Richard wrote:
    2. Also, Configured Capacity is 0, cannot put any file to HDFS.
    This might easily be the cause. I'm not sure if it's a Solaris thing
    that can lead to this though.
    3. in datanode server, no error in logs, but tasktracker logs has the following suspicious thing:
    I don't see any suspicious log message in what you'd posted. Anyhow,
    the TT does not matter here.

    --
    Harsh J
  • Yaozhen Pan at May 31, 2011 at 2:34 pm
    How many datanodes are in your cluster, and what is the value of
    "dfs.replication" in hdfs-site.xml (if not specified, the default value is 3)?
    From the error log, it seems there are not enough datanodes to replicate the
    files in HDFS.
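
    For a 1-NN/1-DN test cluster, dfs.replication is normally set to 1. A minimal
    hdfs-site.xml entry would look like this (sketch only; keep your other
    properties as they are):

    <property>
      <name>dfs.replication</name>
      <value>1</value>
    </property>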

    On 2011-5-31 22:23, "Harsh J" <harsh@cloudera.com> wrote:
    Xu,

    Please post the output of `hadoop dfsadmin -report` and attach the
    tail of a started DN's log?

    On Tue, May 31, 2011 at 7:44 PM, Xu, Richard wrote:
    2. Also, Configured Cap...
    This might easily be the cause. I'm not sure if its a Solaris thing
    that can lead to this though.

    3. in datanode server, no error in logs, but tasktracker logs has the
    following suspicious thing:...
    I don't see any suspicious log message in what you'd posted. Anyhow,
    the TT does not matter here.

    --
    Harsh J
  • Xu, Richard at May 31, 2011 at 2:37 pm
    1 namenode, 1 datanode. dfs.replication=3. We also tried 0, 1, 2, same result.

    From: Yaozhen Pan
    Sent: Tuesday, May 31, 2011 10:34 AM
    To: hdfs-user@hadoop.apache.org
    Subject: Re: Unable to start hadoop-0.20.2 but able to start hadoop-0.20.203 cluster


    How many datanodes are in your cluster? and what is the value of "dfs.replication" in hdfs-site.xml (if not specified, default value is 3)?

    From the error log, it seems there are not enough datanodes to replicate the files in hdfs.
    On 2011-5-31 22:23, "Harsh J" <harsh@cloudera.com> wrote:
    Xu,

    Please post the output of `hadoop dfsadmin -report` and attach the
    tail of a started DN's log?
    On Tue, May 31, 2011 at 7:44 PM, Xu, Richard wrote:
    2. Also, Configured Cap...
    This might easily be the cause. I'm not sure if its a Solaris thing
    that can lead to this though.
    3. in datanode server, no error in logs, but tasktracker logs has the following suspicious thing:...
    I don't see any suspicious log message in what you'd posted. Anyhow,
    the TT does not matter here.

    --
    Harsh J
  • Marcos Ortiz at May 31, 2011 at 3:14 pm

    On 05/31/2011 10:06 AM, Xu, Richard wrote:
    1 namenode, 1 datanode. Dfs.replication=3. We also tried 0, 1, 2, same
    result.

    *From:*Yaozhen Pan
    *Sent:* Tuesday, May 31, 2011 10:34 AM
    *To:* hdfs-user@hadoop.apache.org
    *Subject:* Re: Unable to start hadoop-0.20.2 but able to start
    hadoop-0.20.203 cluster

    How many datanodes are in your cluster? and what is the value of
    "dfs.replication" in hdfs-site.xml (if not specified, default value is
    3)?

    From the error log, it seems there are not enough datanodes to
    replicate the files in hdfs.

    On 2011-5-31 22:23, "Harsh J" <harsh@cloudera.com> wrote:
    Xu,

    Please post the output of `hadoop dfsadmin -report` and attach the
    tail of a started DN's log?

    On Tue, May 31, 2011 at 7:44 PM, Xu, Richard wrote:
    2. Also, Configured Cap...
    This might easily be the cause. I'm not sure if its a Solaris thing
    that can lead to this though.

    3. in datanode server, no error in logs, but tasktracker logs has
    the following suspicious thing:...

    I don't see any suspicious log message in what you'd posted. Anyhow,
    the TT does not matter here.

    --
    Harsh J
    Regards, Xu
    When you installed on Solaris:
    - Did you synchronize the NTP server on all nodes:
    echo "server youservernetp.com" > /etc/inet/ntp.conf
    svcadm enable svc:/network/ntp:default

    - Are you using the same Java version on both systems (Ubuntu and Solaris)?

    - Can you test with one NN and two DN?



    --
    Marcos Luis Ortiz Valmaseda
    Software Engineer (Distributed Systems)
    http://uncubanitolinuxero.blogspot.com
  • Xu, Richard at May 31, 2011 at 3:21 pm
    “Are you using the same Java version on both systems”
    ---Yes.

    “Can you test with one NN and two DN?”
    ---We tested with 1 namenode and 4 datanodes and encountered this problem. We tried to narrow it down, which is why we are now testing with 1 NN and 1 DN.


    From: Marcos Ortiz
    Sent: Tuesday, May 31, 2011 11:46 AM
    To: hdfs-user@hadoop.apache.org
    Cc: Xu, Richard [ICG-IT]
    Subject: Re: Unable to start hadoop-0.20.2 but able to start hadoop-0.20.203 cluster

    On 05/31/2011 10:06 AM, Xu, Richard wrote:
    1 namenode, 1 datanode. Dfs.replication=3. We also tried 0, 1, 2, same result.

    From: Yaozhen Pan
    Sent: Tuesday, May 31, 2011 10:34 AM
    To: hdfs-user@hadoop.apache.org
    Subject: Re: Unable to start hadoop-0.20.2 but able to start hadoop-0.20.203 cluster


    How many datanodes are in your cluster? and what is the value of "dfs.replication" in hdfs-site.xml (if not specified, default value is 3)?

    From the error log, it seems there are not enough datanodes to replicate the files in hdfs.
    On 2011-5-31 22:23, "Harsh J" <harsh@cloudera.com> wrote:
    Xu,

    Please post the output of `hadoop dfsadmin -report` and attach the
    tail of a started DN's log?
    On Tue, May 31, 2011 at 7:44 PM, Xu, Richard wrote:
    2. Also, Configured Cap...
    This might easily be the cause. I'm not sure if its a Solaris thing
    that can lead to this though.
    3. in datanode server, no error in logs, but tasktracker logs has the following suspicious thing:...
    I don't see any suspicious log message in what you'd posted. Anyhow,
    the TT does not matter here.

    --
    Harsh J
    Regards, Xu
    When you installed on Solaris:
    - Did you syncronize the ntp server on all nodes:
    echo "server youservernetp.com" > /etc/inet/ntp.conf
    svcadm enable svc:/network/ntp:default

    - Are you using the same Java version on both systems (Ubuntu and Solaris)?

    - Can you test with one NN and two DN?





    --

    Marcos Luis Ortiz Valmaseda

    Software Engineer (Distributed Systems)

    http://uncubanitolinuxero.blogspot.com
  • Xu, Richard at May 31, 2011 at 3:35 pm
    Running on namenode(hostname: loanps4d):
    :/opt/hadoop-install/hadoop-0.20.2/bin:59 > hadoop dfsadmin -report
    Configured Capacity: 0 (0 KB)
    Present Capacity: 3072 (3 KB)
    DFS Remaining: 0 (0 KB)
    DFS Used: 3072 (3 KB)
    DFS Used%: 100%
    Under replicated blocks: 0
    Blocks with corrupt replicas: 0
    Missing blocks: 0

    -------------------------------------------------
    Datanodes available: 1 (1 total, 0 dead)

    Name: 169.193.181.213:50010
    Decommission Status : Normal
    Configured Capacity: 0 (0 KB)
    DFS Used: 3072 (3 KB)
    Non DFS Used: 0 (0 KB)
    DFS Remaining: 0(0 KB)
    DFS Used%: 100%
    DFS Remaining%: 0%
    Last contact: Tue May 31 11:30:37 EDT 2011

    Datanode(hostname: loanps3d) log:
    :/opt/hadoop-install/hadoop-0.20.2/logs:60 > tail hadoop-cfadm-datanode-loanps3d.log
    2011-05-31 11:29:04,076 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 50020: starting
    2011-05-31 11:29:04,086 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 50020: starting
    2011-05-31 11:29:04,086 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 50020: starting
    2011-05-31 11:29:04,087 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 50020: starting
    2011-05-31 11:29:04,087 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: dnRegistration = DatanodeRegistration(loanps3d:50010, storageID=, infoPort=50075, ipcPort=50020)
    2011-05-31 11:29:04,142 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: New storage id DS-1373813798-169.193.181.213-50010-1306855744095 is assigned to data-node 169.193.181.213:50010
    2011-05-31 11:29:04,145 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(169.193.181.213:50010, storageID=DS-1373813798-169.193.181.213-50010-1306855744095, infoPort=50075, ipcPort=50020)In DataNode.run, data = FSDataset{dirpath='/opt/hadoop-install/hadoop-0.20.2/hadoop-data/current'}
    2011-05-31 11:29:04,146 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: using BLOCKREPORT_INTERVAL of 3600000msec Initial delay: 0msec
    2011-05-31 11:29:04,679 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: BlockReport of 0 blocks got processed in 28 msecs
    2011-05-31 11:29:04,683 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Starting Periodic block scanner.


    -----Original Message-----
    From: Harsh J
    Sent: Tuesday, May 31, 2011 10:23 AM
    To: hdfs-user@hadoop.apache.org
    Subject: Re: Unable to start hadoop-0.20.2 but able to start hadoop-0.20.203 cluster

    Xu,

    Please post the output of `hadoop dfsadmin -report` and attach the
    tail of a started DN's log?
    On Tue, May 31, 2011 at 7:44 PM, Xu, Richard wrote:
    2. Also, Configured Capacity is 0, cannot put any file to HDFS.
    This might easily be the cause. I'm not sure if its a Solaris thing
    that can lead to this though.
    3. in datanode server, no error in logs, but tasktracker logs has the following suspicious thing:
    I don't see any suspicious log message in what you'd posted. Anyhow,
    the TT does not matter here.

    --
    Harsh J
  • Harsh J at May 31, 2011 at 5:31 pm
    Hello RX,
    On Tue, May 31, 2011 at 9:05 PM, Xu, Richard wrote:
    Running on namenode(hostname: loanps4d):
    :/opt/hadoop-install/hadoop-0.20.2/bin:59 > hadoop dfsadmin -report
    Configured Capacity: 0 (0 KB)
    Present Capacity: 3072 (3 KB)
    DFS Remaining: 0 (0 KB)
    DFS Used: 3072 (3 KB)
    DFS Used%: 100%
    Under replicated blocks: 0
    Blocks with corrupt replicas: 0
    Missing blocks: 0

    -------------------------------------------------
    Datanodes available: 1 (1 total, 0 dead)

    Name: 169.193.181.213:50010
    Decommission Status : Normal
    Configured Capacity: 0 (0 KB)
    DFS Used: 3072 (3 KB)
    Non DFS Used: 0 (0 KB)
    DFS Remaining: 0(0 KB)
    DFS Used%: 100%
    DFS Remaining%: 0%
    Last contact: Tue May 31 11:30:37 EDT 2011
    Yup, for some reason the DN's not picking up any space stats on your platform.

    Could you give me the local command outputs of the following from both
    your Solaris and Linux systems?

    $ df -k /opt/hadoop-install/hadoop-0.20.2/hadoop-data
    $ du -sk /opt/hadoop-install/hadoop-0.20.2/hadoop-data

    FWIW, the code I am reading says that the DU and DF util classes have
    only been tested on Cygwin, Linux and FreeBSD. I think Solaris may
    need a bit of tweaking, but I am not aware of a resource for this off
    the top of my head.
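
    If you want to see exactly what the DN-side helper computes on that box, the
    DF class can usually be invoked directly through the hadoop script (assuming
    your 0.20.2 build carries the small main() the class has in the branches I've
    read; if it doesn't, the plain df/du outputs above give the same information):

    # sketch: run Hadoop's own df wrapper against the DN data dir
    bin/hadoop org.apache.hadoop.fs.DF /opt/hadoop-install/hadoop-0.20.2/hadoop-data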

    --
    Harsh J
