Unable to start hadoop-0.20.2 but able to start hadoop-0.20.203 cluster
Hi Folks,

We are trying to get HBase and Hadoop running on a cluster, using 2 Solaris servers for now.

Because of the compatibility issues between HBase and Hadoop, we have to stick with the hadoop-0.20.2-append release.

It is very straightforward to get hadoop-0.20.203 running, but we have been stuck for several days with hadoop-0.20.2, even with the official release rather than the append version.

1. Once we try to run start-mapred.sh (hadoop-daemon.sh --config $HADOOP_CONF_DIR start jobtracker), the following errors show up in the namenode and jobtracker logs:

2011-05-26 12:30:29,169 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Not able to place enough replicas, still in need of 1
2011-05-26 12:30:29,175 INFO org.apache.hadoop.ipc.Server: IPC Server handler 4 on 9000, call addBlock(/tmp/hadoop-cfadm/mapred/system/jobtracker.info, DFSClient_2146408809) from 169.193.181.212:55334: error: java.io.IOException: File /tmp/hadoop-cfadm/mapred/system/jobtracker.info could only be replicated to 0 nodes, instead of 1
java.io.IOException: File /tmp/hadoop-cfadm/mapred/system/jobtracker.info could only be replicated to 0 nodes, instead of 1
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1271)
at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:422)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)


2. Also, Configured Capacity is 0, and we cannot put any file into HDFS.

3. On the datanode server there are no errors in the logs, but the tasktracker log has the following suspicious entries:
2011-05-25 23:36:10,839 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
2011-05-25 23:36:10,839 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 41904: starting
2011-05-25 23:36:10,852 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 41904: starting
2011-05-25 23:36:10,853 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 41904: starting
2011-05-25 23:36:10,853 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 41904: starting
2011-05-25 23:36:10,853 INFO org.apache.hadoop.ipc.Server: IPC Server handler 3 on 41904: starting
.....
2011-05-25 23:36:10,855 INFO org.apache.hadoop.ipc.Server: IPC Server handler 63 on 41904: starting
2011-05-25 23:36:10,950 INFO org.apache.hadoop.mapred.TaskTracker: TaskTracker up at: localhost/127.0.0.1:41904
2011-05-25 23:36:10,950 INFO org.apache.hadoop.mapred.TaskTracker: Starting tracker tracker_loanps3d:localhost/127.0.0.1:41904


I have tried all the suggestions found so far, including (roughly the sequence sketched below):
1) removing the hadoop-name and hadoop-data folders and reformatting the namenode;
2) cleaning up all temp files/folders under /tmp.
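
A rough sketch of that reset sequence (the storage directory paths here are placeholders for the ones in our config):

    # stop all daemons first
    bin/stop-all.sh
    # wipe the old namenode and datanode storage directories (paths assumed)
    rm -rf /path/to/hadoop-name /path/to/hadoop-data /tmp/hadoop-*
    # reformat HDFS and bring it back up
    bin/hadoop namenode -format
    bin/start-dfs.sh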

But nothing works.

Your help is greatly appreciated.

Thanks,

RX


  • DAN at May 27, 2011 at 2:23 am
    Hi, Richard

    Pay attention to "Not able to place enough replicas, still in need of 1". Please confirm the
    right setting of "dfs.replication" in hdfs-site.xml.
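
    For example, something like this (the value is only an example; with a 2-node cluster it should be no more than 2):

        <property>
          <name>dfs.replication</name>
          <value>1</value>
        </property>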

    Good luck!
    Dan

  • Xu, Richard at May 27, 2011 at 11:34 am
    That setting is 3.

  • Simon at May 27, 2011 at 12:31 pm
    First, you need to make sure that your DFS daemons are running.
    You can start your namenode and datanode separately on the master and slave
    nodes, and see what happens, with the following commands:

    hadoop namenode
    hadoop datanode

    The chances are that your datanode cannot be started correctly.
    Let us know your error logs if there are errors.
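
    Once they are up, a quick check along these lines should help (jps is a standard JDK tool; dfsadmin -report is a stock HDFS command):

        jps                      # NameNode and DataNode JVMs should be listed
        hadoop dfsadmin -report  # Configured Capacity should be non-zero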

    HTH~

    Thanks
    Simon


    --
    Regards,
    Simon
  • DAN at May 27, 2011 at 2:26 pm
    Hi, Richard

    You have "2 Solaris servers for now", yet dfs.replication is set to 3.
    These don't match.

    Good Luck
    Dan

  • Allen Wittenauer at May 27, 2011 at 3:42 pm

    On May 27, 2011, at 7:26 AM, DAN wrote:
    You see you have "2 Solaris servers for now", and dfs.replication is set to 3.
    These don't match.

    That doesn't matter. HDFS will basically flag any files written with a warning that they are under-replicated.

    The problem is that the datanode processes aren't running and/or aren't communicating to the namenode. That's what the "java.io.IOException: File /tmp/hadoop-cfadm/mapred/system/jobtracker.info could only be replicated to 0 nodes, instead of 1" means.

    It should also be pointed out that writing to /tmp (the default) is a bad idea. This should get changed.
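
    For example, something along these lines (paths are illustrative only), followed by a reformat:

        <!-- core-site.xml: keep Hadoop state out of /tmp -->
        <property>
          <name>hadoop.tmp.dir</name>
          <value>/var/hadoop/tmp</value>
        </property>

        <!-- hdfs-site.xml: explicit namenode/datanode storage -->
        <property>
          <name>dfs.name.dir</name>
          <value>/var/hadoop/name</value>
        </property>
        <property>
          <name>dfs.data.dir</name>
          <value>/var/hadoop/data</value>
        </property>

    Note that after changing dfs.name.dir you will need to reformat the namenode again.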

    Also, since you are running Solaris, check the FAQ for some settings you'll need in order to make Hadoop's broken username detection work properly, amongst other things.
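
    The commonly cited workaround amounts to something like this in conf/hadoop-env.sh (a sketch only, assuming the stock Solaris userland; see the FAQ entry for the authoritative version):

        # Solaris ships whoami in /usr/ucb, which Hadoop's shell scripts expect on PATH
        export PATH=/usr/ucb:$PATH
        # or pin the identity string rather than relying on whoami detection
        export HADOOP_IDENT_STRING=$USER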
  • Xu, Richard at May 27, 2011 at 8:22 pm
    Hi Allen,

    Thanks a lot for your response.

    I agree with you that the replication setting does not matter.

    What really bothers me is that with the same environment and the same configuration, hadoop-0.20.203 took us 3 minutes, while 0.20.2 has taken 3 days.

    Can you please shed more light on how to "make Hadoop's broken username detection work properly"?

  • Allen Wittenauer at May 27, 2011 at 10:52 pm

    On May 27, 2011, at 1:18 PM, Xu, Richard wrote:

    Can you please shed more light on how to "make Hadoop's broken username detection work properly"?
    It's in the FAQ so that I don't have to do that.

    http://wiki.apache.org/hadoop/FAQ


    Also, check your logs. All your logs. Not just the namenode log.
  • Xu, Richard at May 27, 2011 at 9:33 pm
    Add more to that:

    I also tried starting 0.20.2 on a Linux machine in distributed mode; same error.

    I had successfully started 0.20.203 on this Linux machine with the same config.

    So it seems it is not related to Solaris.

    Could it be caused by a port issue? I checked a few ports and did not find any blocked.
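
    For the record, this is roughly how I checked (9000 being the namenode port from our logs; the host name is a placeholder):

        # is anything listening on the namenode port?
        netstat -an | grep 9000
        # can the datanode host actually reach it?
        telnet <namenode-host> 9000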



  • Konstantin Boudnik at May 27, 2011 at 10:30 am

    On Thu, May 26, 2011 at 07:01PM, Xu, Richard wrote:
    2011-05-26 12:30:29,175 INFO org.apache.hadoop.ipc.Server: IPC Server handler 4 on 9000, call addBlock(/tmp/hadoop-cfadm/mapred/system/jobtracker.info, DFSClient_2146408809) from 169.193.181.212:55334: error: java.io.IOException: File /tmp/hadoop-cfadm/mapred/system/jobtracker.info could only be replicated to 0 nodes, instead of 1
    java.io.IOException: File /tmp/hadoop-cfadm/mapred/system/jobtracker.info could only be replicated to 0 nodes, instead of 1
    Is your DFS up and running, by any chance?

    Cos
  • Harsh J at May 27, 2011 at 1:21 pm
    Hello RX,

    Could you paste your DFS configuration and the DN end-to-end log into
    a mail/pastebin-link?


    --
    Harsh J
  • Srinivas at Jan 25, 2012 at 7:01 am

    Hi,


    I am able to start the namenode and datanode, but while starting the
    jobtracker, it's throwing an error like:

    FATAL mapred.JobTracker: java.net.BindException: Problem binding to
    localhost/127.0.0.1:5102 : Address already in use

    Kindly help me ASAP.


    regards,
    Srinivas
  • Harsh J at Jan 25, 2012 at 7:18 am
    Hey Srinivas,

    It's best to start your own new thread for questions instead of digging up
    an older one, but it seems you already have a JobTracker running, or
    something else bound to port 5102 on your machine. I'd first check
    whether a JobTracker is already running; that is most likely it.
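
    Something along these lines should confirm it (5102 taken from your error message):

        jps                      # is a JobTracker process already listed?
        netstat -an | grep 5102  # what is currently bound to that port?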

    --
    Harsh J
    Customer Ops. Engineer, Cloudera
