Why I cannot see live nodes in a LAN-based cluster setup?
Hi Everyone:

I am quite new to Hadoop. I am attempting to set up Hadoop locally on two
machines connected by a LAN. Both of them pass the single-node test.
However, the two-node cluster setup fails in both of the cases below:

1) set one as dedicated namenode and the other as dedicated datanode
2) set one as both name- and data-node, and the other as just datanode

I launch *start-dfs.sh* on the namenode. Since I have all the *ssh* issues
cleared, I can always observe the daemons start up on every datanode.
However, the web UI at *http://(URI of namenode):50070* shows only 0 live
nodes for (1) and 1 live node for (2), which matches the output of the
command-line *hadoop dfsadmin -report*.

In general, it appears that the namenode never sees the remote datanode as
alive, let alone runs a normal cross-node MapReduce job.

Could anyone give some hints / instructions at this point? I really
appreciate it!

Thanks.

Best Regards
Yours Sincerely

Jingwei Lu


  • GOEKE, MATTHEW (AG/1000) at Jun 27, 2011 at 8:28 pm
    Did you make sure to define the datanode/tasktracker in the slaves file in your conf directory and push that to both machines? Also have you checked the logs on either to see if there are any errors?
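    For a two-node 0.20.x setup the files would look roughly like this (the hostnames here are placeholders for your real ones):

```
# conf/masters -- host that runs the secondary namenode
master

# conf/slaves -- hosts that run a datanode/tasktracker
# (for your case 2, the master itself is also listed here)
master
slave1
```

    Both files live in the conf directory, so push the same copies to both machines.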

    Matt

    This e-mail message may contain privileged and/or confidential information, and is intended to be received only by persons entitled
    to receive such information. If you have received this e-mail in error, please notify the sender immediately. Please delete it and
    all attachments from any servers, hard drives or any other media. Other use of this e-mail by you is strictly prohibited.

    All e-mails and attachments sent and received are subject to monitoring, reading and archival by Monsanto, including its
    subsidiaries. The recipient of this e-mail is solely responsible for checking for the presence of "Viruses" or other "Malware".
    Monsanto, along with its subsidiaries, accepts no liability for any damage caused by any such code transmitted by or accompanying
    this e-mail or any attachment.


    The information contained in this email may be subject to the export control laws and regulations of the United States, potentially
    including but not limited to the Export Administration Regulations (EAR) and sanctions regulations issued by the U.S. Department of
    Treasury, Office of Foreign Asset Controls (OFAC). As a recipient of this information you are obligated to comply with all
    applicable U.S. export laws and regulations.
  • Jingwei Lu at Jun 27, 2011 at 8:58 pm
    Hi,

    I just manually modified the masters & slaves files on both machines.

    I found something wrong in the log files, as shown below:

    -- Master:
    namenode.log:

    ****************************************
    2011-06-27 13:44:47,055 INFO org.mortbay.log: jetty-6.1.14
    2011-06-27 13:44:47,394 INFO org.mortbay.log: Started SelectChannelConnector@0.0.0.0:50070
    2011-06-27 13:44:47,395 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: Web-server up at: 0.0.0.0:50070
    2011-06-27 13:44:47,395 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
    2011-06-27 13:44:47,395 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 54310: starting
    2011-06-27 13:44:47,396 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 54310: starting
    2011-06-27 13:44:47,397 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 54310: starting
    2011-06-27 13:44:47,397 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 54310: starting
    2011-06-27 13:44:47,397 INFO org.apache.hadoop.ipc.Server: IPC Server handler 3 on 54310: starting
    2011-06-27 13:44:47,402 INFO org.apache.hadoop.ipc.Server: IPC Server handler 4 on 54310: starting
    2011-06-27 13:44:47,404 INFO org.apache.hadoop.ipc.Server: IPC Server handler 5 on 54310: starting
    2011-06-27 13:44:47,406 INFO org.apache.hadoop.ipc.Server: IPC Server handler 6 on 54310: starting
    2011-06-27 13:44:47,406 INFO org.apache.hadoop.ipc.Server: IPC Server handler 7 on 54310: starting
    2011-06-27 13:44:47,406 INFO org.apache.hadoop.ipc.Server: IPC Server handler 8 on 54310: starting
    2011-06-27 13:44:47,408 INFO org.apache.hadoop.ipc.Server: IPC Server handler 9 on 54310: starting
    2011-06-27 13:44:47,500 INFO org.apache.hadoop.ipc.Server: Error register getProtocolVersion
    java.lang.IllegalArgumentException: Duplicate metricsName:getProtocolVersion
        at org.apache.hadoop.metrics.util.MetricsRegistry.add(MetricsRegistry.java:53)
        at org.apache.hadoop.metrics.util.MetricsTimeVaryingRate.<init>(MetricsTimeVaryingRate.java:99)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:523)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:416)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)
    2011-06-27 13:45:02,572 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.registerDatanode: node registration from 127.0.0.1:50010 storage DS-87816363-127.0.0.1-50010-1309207502566
    ****************************************


    -- Slave:
    datanode.log:

    ****************************************
    2011-06-27 13:45:00,335 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
    /************************************************************
    STARTUP_MSG: Starting DataNode
    STARTUP_MSG: host = hdl.ucsd.edu/127.0.0.1
    STARTUP_MSG: args = []
    STARTUP_MSG: version = 0.20.2
    STARTUP_MSG: build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r 911707; compiled by 'chrisdo' on Fri Feb 19 08:07:34 UTC 2010
    ************************************************************/
    2011-06-27 13:45:02,476 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: hdl.ucsd.edu/127.0.0.1:54310. Already tried 0 time(s).
    2011-06-27 13:45:03,549 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: hdl.ucsd.edu/127.0.0.1:54310. Already tried 1 time(s).
    2011-06-27 13:45:04,552 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: hdl.ucsd.edu/127.0.0.1:54310. Already tried 2 time(s).
    2011-06-27 13:45:05,609 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: hdl.ucsd.edu/127.0.0.1:54310. Already tried 3 time(s).
    2011-06-27 13:45:06,640 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: hdl.ucsd.edu/127.0.0.1:54310. Already tried 4 time(s).
    2011-06-27 13:45:07,643 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: hdl.ucsd.edu/127.0.0.1:54310. Already tried 5 time(s).
    2011-06-27 13:45:08,646 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: hdl.ucsd.edu/127.0.0.1:54310. Already tried 6 time(s).
    2011-06-27 13:45:09,661 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: hdl.ucsd.edu/127.0.0.1:54310. Already tried 7 time(s).
    2011-06-27 13:45:10,664 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: hdl.ucsd.edu/127.0.0.1:54310. Already tried 8 time(s).
    2011-06-27 13:45:11,678 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: hdl.ucsd.edu/127.0.0.1:54310. Already tried 9 time(s).
    2011-06-27 13:45:11,679 INFO org.apache.hadoop.ipc.RPC: Server at hdl.ucsd.edu/127.0.0.1:54310 not available yet, Zzzzz...
    ****************************************

    (Just a guess: is this due to some port configuration problem?)

    Any comments will be greatly appreciated!

    Best Regards
    Yours Sincerely

    Jingwei Lu


  • Jeff Schmitz at Jun 27, 2011 at 9:08 pm
    http://www.mentby.com/tim-robertson/error-register-getprotocolversion.html



  • GOEKE, MATTHEW (AG/1000) at Jun 27, 2011 at 9:23 pm
    As a follow-up to what Jeff posted: go ahead and ignore the message you got on the NN for now.

    If you look at the address in the DN log, the host shows as 127.0.0.1 and the ip:port it is trying to connect to for the NN is 127.0.0.1:54310 ---> it is trying to connect to itself as if it were still in single-machine mode. Make sure that you have correctly pushed the URI for the NN into the config files on both machines and then bounce DFS.
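    Concretely, the NN URI lives in conf/core-site.xml on 0.20.x, and it must use the LAN hostname of the NN (not localhost) on both machines. A sketch, with "namenode-host" as a placeholder for your real hostname:

```
<!-- conf/core-site.xml -- identical on both machines -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://namenode-host:54310</value>
  </property>
</configuration>
```

    After editing, run stop-dfs.sh and start-dfs.sh on the NN so all daemons pick up the change.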

    Matt

  • Jingwei Lu at Jun 27, 2011 at 10:38 pm
    Hi Matt and Jeff:

    Thanks a lot for your instructions. I corrected the mistakes in the conf
    files of the DN, and now the DN log shows:

    2011-06-27 15:32:36,025 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: clock.ucsd.edu/132.239.95.91:54310. Already tried 0 time(s).
    2011-06-27 15:32:37,028 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: clock.ucsd.edu/132.239.95.91:54310. Already tried 1 time(s).
    2011-06-27 15:32:38,031 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: clock.ucsd.edu/132.239.95.91:54310. Already tried 2 time(s).
    2011-06-27 15:32:39,034 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: clock.ucsd.edu/132.239.95.91:54310. Already tried 3 time(s).
    2011-06-27 15:32:40,037 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: clock.ucsd.edu/132.239.95.91:54310. Already tried 4 time(s).
    2011-06-27 15:32:41,040 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: clock.ucsd.edu/132.239.95.91:54310. Already tried 5 time(s).
    2011-06-27 15:32:42,043 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: clock.ucsd.edu/132.239.95.91:54310. Already tried 6 time(s).
    2011-06-27 15:32:43,046 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: clock.ucsd.edu/132.239.95.91:54310. Already tried 7 time(s).
    2011-06-27 15:32:44,049 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: clock.ucsd.edu/132.239.95.91:54310. Already tried 8 time(s).
    2011-06-27 15:32:45,052 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: clock.ucsd.edu/132.239.95.91:54310. Already tried 9 time(s).
    2011-06-27 15:32:45,053 INFO org.apache.hadoop.ipc.RPC: Server at clock.ucsd.edu/132.239.95.91:54310 not available yet, Zzzzz...

    The DN now seems to be trying to connect to the NN at the right address, but it always fails...
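    One thing I can still check (just a guess: the earlier NN log showed the datanode registering from 127.0.0.1, so name resolution may be the culprit): if /etc/hosts on the NN maps its own hostname to 127.0.0.1, the NN binds port 54310 on loopback only and remote DNs can never reach it. A small sketch to test how a hostname resolves:

```python
# Check whether a hostname resolves to a loopback address. If the NN's own
# hostname resolves to 127.x.x.x when run on the NN, the NN listens on
# loopback only and remote datanodes cannot connect (fix /etc/hosts then).
import socket

def resolves_to_loopback(hostname: str) -> bool:
    """Return True if hostname resolves to a 127.x.x.x (loopback) address."""
    return socket.gethostbyname(hostname).startswith("127.")

# "localhost" is loopback by definition; a correctly configured NN hostname
# (e.g. clock.ucsd.edu) should yield False when checked on the cluster boxes.
print(resolves_to_loopback("localhost"))
```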



    Best Regards
    Yours Sincerely

    Jingwei Lu


    On Mon, Jun 27, 2011 at 2:22 PM, GOEKE, MATTHEW (AG/1000) wrote:

    As a follow-up to what Jeff posted: go ahead and ignore the message you got
    on the NN for now.

    If you look at the DN log, the address it shows is 127.0.0.1, and the
    ip:port it is trying to connect to for the NN is 127.0.0.1:54310 ---> it
    is trying to connect to itself as if it were still in single-machine mode.
    Make sure that you have correctly pushed the URI for the NN into the config
    files on both machines and then bounce DFS.
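
[Editor's note] Matt's point about pushing the NN URI into the config files comes down to `fs.default.name`. A minimal sketch of what that could look like on Hadoop 0.20 (host and port taken from the logs in this thread, not from the poster's actual file):

```xml
<!-- conf/core-site.xml, identical on BOTH machines (Hadoop 0.20 layout;
     hdfs://clock.ucsd.edu:54310 is taken from the NN log in this thread) -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://clock.ucsd.edu:54310</value>
  </property>
</configuration>
```

After changing it, restart DFS so both daemons pick up the new URI.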

    Matt

    -----Original Message-----
    From: Jeff.Schmitz@shell.com
    Sent: Monday, June 27, 2011 4:08 PM
    To: common-user@hadoop.apache.org
    Subject: RE: Why I cannot see live nodes in a LAN-based cluster setup?

    http://www.mentby.com/tim-robertson/error-register-getprotocolversion.html



    -----Original Message-----
    From: Jingwei Lu
    Sent: Monday, June 27, 2011 3:58 PM
    To: common-user@hadoop.apache.org
    Subject: Re: Why I cannot see live nodes in a LAN-based cluster setup?

    Hi,

    I just manually modified the masters & slaves files on both machines.
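
[Editor's note] As a hedged sketch of what those two files could contain for case (2) above (the master also running a datanode); the hostnames are taken from the logs in this thread, and the conf/ layout assumes Hadoop 0.20. Identical copies should be pushed to both machines:

```shell
# Sketch: conf/masters and conf/slaves for the two-node cluster in this
# thread (clock.ucsd.edu = NN, hdl.ucsd.edu = DN). Run from the Hadoop
# install directory; copy the resulting files to both machines.
mkdir -p conf
echo "clock.ucsd.edu" > conf/masters
printf "clock.ucsd.edu\nhdl.ucsd.edu\n" > conf/slaves
```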

    I found something wrong in the log files, as shown below:

    -- Master:
    namenode.log:

    ****************************************
    2011-06-27 13:44:47,055 INFO org.mortbay.log: jetty-6.1.14
    2011-06-27 13:44:47,394 INFO org.mortbay.log: Started
    SelectChannelConnector@0.0.0.0:50070
    2011-06-27 13:44:47,395 INFO
    org.apache.hadoop.hdfs.server.namenode.NameNode: Web-server up at:
    0.0.0.0:50070
    2011-06-27 13:44:47,395 INFO org.apache.hadoop.ipc.Server: IPC Server
    Responder: starting
    2011-06-27 13:44:47,395 INFO org.apache.hadoop.ipc.Server: IPC Server
    listener on 54310: starting
    2011-06-27 13:44:47,396 INFO org.apache.hadoop.ipc.Server: IPC Server
    handler 0 on 54310: starting
    2011-06-27 13:44:47,397 INFO org.apache.hadoop.ipc.Server: IPC Server
    handler 1 on 54310: starting
    2011-06-27 13:44:47,397 INFO org.apache.hadoop.ipc.Server: IPC Server
    handler 2 on 54310: starting
    2011-06-27 13:44:47,397 INFO org.apache.hadoop.ipc.Server: IPC Server
    handler 3 on 54310: starting
    2011-06-27 13:44:47,402 INFO org.apache.hadoop.ipc.Server: IPC Server
    handler 4 on 54310: starting
    2011-06-27 13:44:47,404 INFO org.apache.hadoop.ipc.Server: IPC Server
    handler 5 on 54310: starting
    2011-06-27 13:44:47,406 INFO org.apache.hadoop.ipc.Server: IPC Server
    handler 6 on 54310: starting
    2011-06-27 13:44:47,406 INFO org.apache.hadoop.ipc.Server: IPC Server
    handler 7 on 54310: starting
    2011-06-27 13:44:47,406 INFO org.apache.hadoop.ipc.Server: IPC Server
    handler 8 on 54310: starting
    2011-06-27 13:44:47,408 INFO org.apache.hadoop.ipc.Server: IPC Server
    handler 9 on 54310: starting
    2011-06-27 13:44:47,500 INFO org.apache.hadoop.ipc.Server: Error register
    getProtocolVersion
    java.lang.IllegalArgumentException: Duplicate
    metricsName:getProtocolVersion
    at
    org.apache.hadoop.metrics.util.MetricsRegistry.add(MetricsRegistry.java:53)
    at

    org.apache.hadoop.metrics.util.MetricsTimeVaryingRate.<init>(MetricsTimeVaryingRate.java:89)
    at

    org.apache.hadoop.metrics.util.MetricsTimeVaryingRate.<init>(MetricsTimeVaryingRate.java:99)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:523)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:416)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)
    2011-06-27 13:45:02,572 INFO org.apache.hadoop.hdfs.StateChange: BLOCK*
    NameSystem.registerDatanode: node registration from 127.0.0.1:50010 storage
    DS-87816363-127.0.0.1-50010-1309207502566
    ****************************************


    -- slave:
    datanode.log:

    ****************************************
    2011-06-27 13:45:00,335 INFO
    org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
    /************************************************************
    STARTUP_MSG: Starting DataNode
    STARTUP_MSG: host = hdl.ucsd.edu/127.0.0.1
    STARTUP_MSG: args = []
    STARTUP_MSG: version = 0.20.2
    STARTUP_MSG: build =
    https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r
    911707; compiled by 'chrisdo' on Fri Feb 19 08:07:34 UTC 2010
    ************************************************************/
    2011-06-27 13:45:02,476 INFO org.apache.hadoop.ipc.Client: Retrying
    connect to server: hdl.ucsd.edu/127.0.0.1:54310. Already tried 0 time(s).
    2011-06-27 13:45:03,549 INFO org.apache.hadoop.ipc.Client: Retrying
    connect to server: hdl.ucsd.edu/127.0.0.1:54310. Already tried 1 time(s).
    2011-06-27 13:45:04,552 INFO org.apache.hadoop.ipc.Client: Retrying
    connect to server: hdl.ucsd.edu/127.0.0.1:54310. Already tried 2 time(s).
    2011-06-27 13:45:05,609 INFO org.apache.hadoop.ipc.Client: Retrying
    connect to server: hdl.ucsd.edu/127.0.0.1:54310. Already tried 3 time(s).
    2011-06-27 13:45:06,640 INFO org.apache.hadoop.ipc.Client: Retrying
    connect to server: hdl.ucsd.edu/127.0.0.1:54310. Already tried 4 time(s).
    2011-06-27 13:45:07,643 INFO org.apache.hadoop.ipc.Client: Retrying
    connect to server: hdl.ucsd.edu/127.0.0.1:54310. Already tried 5 time(s).
    2011-06-27 13:45:08,646 INFO org.apache.hadoop.ipc.Client: Retrying
    connect to server: hdl.ucsd.edu/127.0.0.1:54310. Already tried 6 time(s).
    2011-06-27 13:45:09,661 INFO org.apache.hadoop.ipc.Client: Retrying
    connect to server: hdl.ucsd.edu/127.0.0.1:54310. Already tried 7 time(s).
    2011-06-27 13:45:10,664 INFO org.apache.hadoop.ipc.Client: Retrying
    connect to server: hdl.ucsd.edu/127.0.0.1:54310. Already tried 8 time(s).
    2011-06-27 13:45:11,678 INFO org.apache.hadoop.ipc.Client: Retrying
    connect to server: hdl.ucsd.edu/127.0.0.1:54310. Already tried 9 time(s).
    2011-06-27 13:45:11,679 INFO org.apache.hadoop.ipc.RPC: Server at
    hdl.ucsd.edu/127.0.0.1:54310 not available yet, Zzzzz...
    ****************************************
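
[Editor's note] The `host = hdl.ucsd.edu/127.0.0.1` line in the startup banner above suggests the DN's own hostname resolves to the loopback address. A common fix (an assumption here, not confirmed in this thread) is to map each hostname to its LAN IP in /etc/hosts on both machines:

```
# /etc/hosts (sketch): hostnames must resolve to LAN IPs, not 127.0.0.1.
# 132.239.95.91 is clock.ucsd.edu per the logs in this thread; the IP
# shown for hdl.ucsd.edu is a placeholder.
127.0.0.1      localhost
132.239.95.91  clock.ucsd.edu
132.239.95.xx  hdl.ucsd.edu
```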

    (Just a guess: could this be due to some port configuration problem?)

    Any comments will be greatly appreciated!

    Best Regards
    Yours Sincerely

    Jingwei Lu



    On Mon, Jun 27, 2011 at 1:28 PM, GOEKE, MATTHEW (AG/1000) <
    matthew.goeke@monsanto.com> wrote:
    Did you make sure to define the datanode/tasktracker in the slaves file in
    your conf directory and push that to both machines? Also have you checked
    the logs on either to see if there are any errors?

    Matt

    -----Original Message-----
    From: Jingwei Lu
    Sent: Monday, June 27, 2011 3:24 PM
    To: HADOOP MLIST
    Subject: Why I cannot see live nodes in a LAN-based cluster setup?

    Hi Everyone:

    I am quite new to Hadoop. I am attempting to set up Hadoop locally on
    two machines connected by LAN. Both of them pass the single-node test.
    However, I failed in the two-node cluster setup in both cases below:
    1) set one as dedicated namenode and the other as dedicated datanode
    2) set one as both name- and data-node, and the other as just datanode

    I launch *start-dfs.sh* on the namenode. Since I have all the *ssh* issues
    cleared, I can always observe the daemon startup on every datanode.
    However, the web UI at *http://(URI of namenode):50070* shows only 0 live
    nodes for (1) and 1 live node for (2), which matches the output of the
    command-line *hadoop dfsadmin -report*.

    In general, the namenode cannot see the remote datanode as alive, let
    alone run a normal cross-node MapReduce job.

    Could anyone give some hints / instructions at this point? I really
    appreciate it!

    Thanks.

    Best Regards
    Yours Sincerely

    Jingwei Lu
    This e-mail message may contain privileged and/or confidential
    information,
    and is intended to be received only by persons entitled
    to receive such information. If you have received this e-mail in error,
    please notify the sender immediately. Please delete it and
    all attachments from any servers, hard drives or any other media. Other use
    of this e-mail by you is strictly prohibited.

    All e-mails and attachments sent and received are subject to monitoring,
    reading and archival by Monsanto, including its
    subsidiaries. The recipient of this e-mail is solely responsible for
    checking for the presence of "Viruses" or other "Malware".
    Monsanto, along with its subsidiaries, accepts no liability for any damage
    caused by any such code transmitted by or accompanying
    this e-mail or any attachment.


    The information contained in this email may be subject to the export
    control laws and regulations of the United States, potentially
    including but not limited to the Export Administration Regulations (EAR)
    and sanctions regulations issued by the U.S. Department of
    Treasury, Office of Foreign Asset Controls (OFAC). As a recipient of this
    information you are obligated to comply with all
    applicable U.S. export laws and regulations.
  • GOEKE, MATTHEW (AG/1000) at Jun 28, 2011 at 3:57 am
    At this point, if that is the correct IP, I would see if you can actually ssh from the DN to the NN to make sure it can connect to the other box. If you can successfully connect through ssh, then it's just a matter of figuring out why that port is having issues (netstat is your friend in this case). If you see it listening on 54310, then just power-cycle the box and try again.
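
[Editor's note] A quick generic check for the advice above, runnable from the DN (a bash /dev/tcp sketch, not Hadoop-specific; the host and port in the usage comment come from the retry log in this thread):

```shell
# port_reachable HOST PORT -> exit status 0 if a TCP connection succeeds.
# Complements "netstat -tln | grep 54310" run on the NN side.
port_reachable() {
  (exec 3<>"/dev/tcp/$1/$2") 2>/dev/null
}
# On the datanode:
#   port_reachable clock.ucsd.edu 54310 && echo "NN RPC port reachable"
```

If ssh works but this fails, a firewall or a daemon bound only to 127.0.0.1 is the usual suspect.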

    Matt

  • Jeff Schmitz at Jun 28, 2011 at 2:31 pm
    You may also try removing the hadoop-"yourname" directory from /tmp and reformatting HDFS; it may be corrupted.
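
[Editor's note] A hedged sketch of that suggestion as a command sequence. This destroys all HDFS data, so it is only appropriate for a scratch cluster; the path assumes the 0.20 default hadoop.tmp.dir of /tmp/hadoop-${USER}:

```
stop-dfs.sh                  # on the namenode
rm -rf /tmp/hadoop-"$USER"   # on EACH node
hadoop namenode -format      # on the namenode only
start-dfs.sh                 # on the namenode
```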

Discussion Overview
group: common-user
categories: hadoop
posted: Jun 27, '11 at 8:24p
active: Jun 28, '11 at 2:31p
posts: 8
users: 3
website: hadoop.apache.org...
irc: #hadoop
