Hi,

I'm using CM 4.1 and want to set up a CDH 4 cluster with 1 master and 4
slaves. The machines are in the gogrid.com cloud and run CentOS 6.0 64-bit.

I've installed CM on one instance (173.1.1.51), and I'm trying to get it to
deploy CDH to the cluster of IPs, 173.1.1.51, 173.1.1.52, 173.1.1.53,
173.1.1.58, 173.1.1.60.

I had to point the reverse DNS entries for the IPs at a new domain (the
default was gogrid.com) so that the installer would stop failing with the
error "could not contact scm server, giving up".

Now, on host 173.1.1.51, the installer fails with this:
BEGIN /sbin/service cloudera-scm-agent status | grep running
END (1)
BEGIN /sbin/service cloudera-scm-agent start
Starting cloudera-scm-agent: [FAILED]
END (1)
scm agent could not be started, giving up
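
The installer output is a sequence of BEGIN/END blocks, where the number in parentheses after END looks like the exit status of the command (1 here, and 127 for a missing command). As a rough aid for reading longer logs, the sketch below pulls out each command with its exit code. The name parse_steps and the regular expressions are illustrative assumptions based only on the lines quoted in this post, not a documented log format.

```python
import re

# Patterns inferred from the installer output quoted in this thread.
# The optional asterisks cover the "*BEGIN ... *" form some mail clients add.
BEGIN_RE = re.compile(r"^\*?\s*BEGIN\s+(.*?)\s*\*?\s*$")
END_RE = re.compile(r"^\*?\s*END\s+\((\d+)\)\s*\*?\s*$")


def parse_steps(log_text):
    """Return a list of (command, exit_code) pairs found in the log text."""
    steps = []
    current = None
    for line in log_text.splitlines():
        begin = BEGIN_RE.match(line)
        end = END_RE.match(line)
        if begin:
            current = begin.group(1)
        elif end and current is not None:
            steps.append((current, int(end.group(1))))
            current = None
    return steps


def failed_steps(log_text):
    """Keep only the steps whose exit code is non-zero."""
    return [(cmd, code) for cmd, code in parse_steps(log_text) if code != 0]
```

Calling failed_steps() on a saved copy of the installer output lists the commands worth re-running by hand on the affected node.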

And for the other hosts, it fails with:
BEGIN host -t PTR 173.1.1.51
/tmp/scm_prepare_node.4HMODE4Y/scm_prepare_node.sh: line 91: host: command not found
END (127)
BEGIN ping -c 1 173.1.1.51
PING 173.1.1.51 (173.1.1.51) 56(84) bytes of data.

--- 173.1.1.51 ping statistics ---
1 packets transmitted, 0 received, 100% packet loss, time 10000ms

END (1)
could not contact scm server, giving up
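
The output above shows two steps: a PTR lookup for the server IP, then a ping. The name-resolution half can be repeated by hand without the `host` binary, for example with the Python sketch below. It is only an illustration and not what scm_prepare_node.sh actually runs; check_forward_reverse and its injectable resolver parameters are names made up for this example.

```python
import socket


def check_forward_reverse(ip, reverse=socket.gethostbyaddr,
                          forward=socket.gethostbyname):
    """Look up the PTR name for ip, then resolve that name back.

    Returns (hostname, ok). ok is True only when the name found by the
    reverse lookup resolves forward to the same IP address.
    """
    try:
        hostname = reverse(ip)[0]
    except OSError:  # socket.herror and socket.gaierror are OSError subclasses
        return None, False
    try:
        resolved = forward(hostname)
    except OSError:
        return hostname, False
    return hostname, resolved == ip


if __name__ == "__main__":
    # The IPs from this thread; run on each node to compare results.
    for address in ["173.1.1.51", "173.1.1.52", "173.1.1.53",
                    "173.1.1.58", "173.1.1.60"]:
        print(address, check_forward_reverse(address))
```

A result with ok set to False means the PTR name and the forward record disagree, so a ping to that name would go to a different machine than the one intended.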

I have set my hostnames to match the domain name that the rDNS entries point
to. /etc/hosts contains "127.0.0.1 localhost" and "127.0.1.1 localhost", and
both hostname and hostname -f return the hostname that I have set.
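
One thing that can be checked mechanically is whether a given hostname appears in /etc/hosts against a non-loopback address. Whether CM needs such an entry is an assumption here, not something the logs confirm. The Python sketch below parses hosts-file text and reports the non-loopback addresses for a name; parse_hosts, non_loopback_addresses and the sample name node1.example.com are hypothetical.

```python
import ipaddress


def parse_hosts(text):
    """Map each name in hosts-file text to the list of IPs it appears under."""
    entries = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = line.split()
        ip, names = fields[0], fields[1:]
        for name in names:
            entries.setdefault(name, []).append(ip)
    return entries


def non_loopback_addresses(entries, hostname):
    """Return the addresses for hostname that are not in a loopback range."""
    return [ip for ip in entries.get(hostname, [])
            if not ipaddress.ip_address(ip).is_loopback]


if __name__ == "__main__":
    with open("/etc/hosts") as handle:
        hosts = parse_hosts(handle.read())
    # Replace node1.example.com with the output of `hostname -f`.
    print(non_loopback_addresses(hosts, "node1.example.com"))
```

With only the two localhost lines described above, the lookup for the machine's own name comes back empty, so name resolution for it depends entirely on DNS.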

I have unblocked ports 7180 and 7182 and allowed ICMP echo-request and
echo-reply in my iptables file.
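
Having the rules in the iptables file does not by itself show that the ports are reachable from the other nodes. A minimal Python sketch for testing TCP reachability of 7180 and 7182 from another host follows. port_open is a hypothetical helper name, a True result only means that something accepted the connection, and ICMP is not covered by this check.

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, unreachable, timed out, or name not resolved
        return False


if __name__ == "__main__":
    # Run from one of the other nodes against the CM server address.
    for port in (7180, 7182):
        print(port, port_open("173.1.1.51", port))
```

If either port reports False from a slave while reporting True on the server itself, the firewall rules or the cloud provider's network settings are the next place to look.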

This is what /var/log/cloudera-scm-server/cloudera-scm-server.log contains:

2012-11-10 06:59:17,745 INFO [NodeConfiguratorThread-2-3:direct.SessionChannel@372] Sending channel request for `exec`
2012-11-10 06:59:17,748 INFO [reader:direct.SessionChannel@326] Received window adjustment for 2097152 bytes
2012-11-10 06:59:17,763 INFO [reader:direct.SessionChannel@314] Got chan request for `exit-status`
2012-11-10 06:59:17,763 INFO [reader:direct.SessionChannel@408] Got EOF
2012-11-10 06:59:17,763 INFO [reader:direct.SessionChannel@223] Got close
2012-11-10 06:59:17,763 INFO [reader:direct.SessionChannel@425] Sending EOF
2012-11-10 06:59:17,763 INFO [reader:direct.SessionChannel@287] Sending close
2012-11-10 06:59:17,763 INFO [reader:connection.ConnectionImpl@84] Forgetting `session` channel (#0)
2012-11-10 06:59:17,764 INFO [NodeConfiguratorThread-2-3:node.NodeConfiguratorProgress@464] 173.1.1.52: Transitioning from MAKE_TEMP_DIR (PT0.023S) to COPY_FILES
2012-11-10 06:59:17,764 INFO [NodeConfiguratorThread-2-3:connection.ConnectionImpl@68] Attaching `session` channel (#1)
2012-11-10 06:59:17,804 INFO [reader:direct.SessionChannel@125] Initialized - < session channel: id=1, recipient=1, localWin=[winSize=2097152], remoteWin=[winSize=0] >
2012-11-10 06:59:17,805 INFO [NodeConfiguratorThread-2-3:direct.SessionChannel@120] Will request to exec `scp -t -r '/tmp/scm_prepare_node.umgmT011'`
2012-11-10 06:59:17,805 INFO [NodeConfiguratorThread-2-3:direct.SessionChannel@

And this is what /var/log/cloudera-scm-agent/cloudera-scm-agent.log contains:

[09/Nov/2012 08:52:15 +0000] 7158 MainThread agent ERROR Heartbeating to 173.1.1.51:7182 failed.
Traceback (most recent call last):
  File "/usr/lib64/cmf/agent/src/cmf/agent.py", line 574, in send_heartbeat
    response = self.requestor.request('heartbeat', dict(request=heartbeat))
  File "/usr/lib64/cmf/agent/build/env/lib/python2.6/site-packages/avro-1.6.3-py2.6.egg/avro/ipc.py", line 145, in request
    return self.issue_request(call_request, message_name, request_datum)
  File "/usr/lib64/cmf/agent/build/env/lib/python2.6/site-packages/avro-1.6.3-py2.6.egg/avro/ipc.py", line 262, in issue_request
    return self.read_call_response(message_name, buffer_decoder)
  File "/usr/lib64/cmf/agent/build/env/lib/python2.6/site-packages/avro-1.6.3-py2.6.egg/avro/ipc.py", line 242, in read_call_response
    raise self.read_error(writers_schema, readers_schema, decoder)
  File "/usr/lib64/cmf/agent/build/env/lib/python2.6/site-packages/avro-1.6.3-py2.6.egg/avro/ipc.py", line 251, in read_error
    return AvroRemoteException(datum_reader.read(decoder))
  File "/usr/lib64/cmf/agent/build/env/lib/python2.6/site-packages/avro-1.6.3-py2.6.egg/avro/io.py", line 444, in read
    return self.read_data(self.writers_schema, self.readers_schema, decoder)
  File "/usr/lib64/cmf/agent/build/env/lib/python2.6/site-packages/avro-1.6.3-py2.6.egg/avro/io.py", line 448, in read_data
    if not DatumReader.match_schemas(writers_schema, readers_schema):

I'm new to Cloudera and lost. Could someone please point me in the right
direction? Thanks.


  • Dhruv Jalota at Nov 11, 2012 at 1:04 am
    EDIT: The original error before changing rDNS was:

    BEGIN host -t PTR 173.1.1.51
    51.1.1.173.in-addr.arpa domain name pointer 173.1.1.51.reverse.gogrid.com.
    END (0)
    using 173.1.1.51.reverse.gogrid.com as scm server hostname
    BEGIN ping -c 1 173.1.1.51.reverse.gogrid.com
    PING www.gogrid.com (69.59.136.181) 56(84) bytes of data.

    --- www.gogrid.com ping statistics ---
    1 packets transmitted, 0 received, 100% packet loss, time 10000ms

    END (1)
    could not contact scm server, giving up
