FAQ
Hi,

Problem:

It looks like the agent config file at /etc/cloudera-scm-agent/config uses
a server hostname that it finds via reverse DNS lookup or something
similar. Reverse DNS is a bit esoteric for the average user. It's often
set incorrectly, or set by the service provider to something strange, and
it's outside the control of the server admin doing the project.

With EC2, the IP of the server might change. If that happens, the
reverse-lookup DNS will change too, and all the agents will then have the
wrong server name.

Solution:

If the agent is being installed from a central server, and the central
server knows its own hostname, then this value should be populated from
the server's hostname, and not from a reverse lookup. A defined
hostname (myserver1) will stay constant, even in the face of IP address
changes. And the Cloudera software GUI usually shows forward-lookup
FQDNs.

I apologize in advance if I have misread this whole situation.

Best Regards,
Sam


  • Philip Langdale at Aug 28, 2012 at 4:54 pm
    Hi Sam,

    This implementation is intended to address another potential problem -
    which is that if the server is multi-homed and/or multi-addressed,
    there's no way to know which addresses/hostnames are routable from the
    cluster nodes. To that end, what we do is look at the source IP address
    of the ssh connection we make to the node, and then, because we don't
    want to have to use an IP address, we do a reverse lookup to get a
    hostname.
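
A minimal Python sketch of the lookup described above, for illustration only. It is not the Cloudera Manager code: the function name is hypothetical, and falling back to the bare IP when no reverse record exists is an assumption.

```python
import socket


def server_name_from_source_ip(source_ip, reverse_lookup=socket.gethostbyaddr):
    """Return a hostname for source_ip via reverse DNS.

    source_ip is the address the node saw as the origin of the ssh
    connection. The reverse_lookup parameter exists only so the function
    can be exercised without real DNS.
    """
    try:
        hostname, _aliases, _addresses = reverse_lookup(source_ip)
        return hostname
    except OSError:
        # socket.herror and socket.gaierror are subclasses of OSError.
        # Assumption: fall back to the IP when no PTR record resolves.
        return source_ip
```

Whatever the PTR record says is what ends up in the agent config, which is why a provider-controlled reverse zone matters here.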

    Yes, with EC2, this doesn't work if the server is restarted and the IP
    and DNS change, but it's not clear what a clean automatic solution for
    EC2 looks like.

    Asking the server what its IP and hostname are will just yield the same
    IP and hostname that the node discovered using the ssh/reverse-DNS
    method, so that doesn't help.

    You can assign an elastic IP to the instance, but even then, basic DNS
    operations will not lead you to this IP or the associated hostname. You
    need to do an EC2-specific metadata query to discover the elastic
    IP/hostname, which we're not really in a position to do - and then
    there's the race condition due to not being able to assign the elastic
    IP until after the instance is started, and so on.
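
For reference, the kind of EC2-specific metadata query mentioned above could look like the following Python sketch. It assumes the instance metadata service answers plain GET requests at 169.254.169.254 (instances that enforce IMDSv2 also require a session token, which is omitted here). The function names are hypothetical, and the race condition described above still applies.

```python
from urllib.request import urlopen

# Link-local address of the EC2 instance metadata service.
METADATA_BASE = "http://169.254.169.254/latest/meta-data/"


def _http_get(url, timeout=2):
    """Fetch a metadata value as stripped text."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8").strip()


def ec2_public_identity(fetch=_http_get):
    """Return (public_hostname, public_ipv4) as reported by the instance.

    The fetch parameter exists only so the function can be tried outside
    EC2 with a stub.
    """
    hostname = fetch(METADATA_BASE + "public-hostname")
    address = fetch(METADATA_BASE + "public-ipv4")
    return hostname, address
```

This only works when run on the instance itself, so the server would have to report the result to the nodes rather than have the nodes discover it.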

    Ultimately, it would come down to requiring the user to manually specify
    what hostname the server should claim to have. At that point you could
    type in the elastic hostname and it would work, but we've been trying to
    avoid a manual solution. This is something we can consider, although I
    can't promise if/when it would show up.

    Have I missed something in how you're configuring your EC2 nodes that
    would allow an automated mechanism to find the 'right' hostname?

    --phil


  • Sam Darwin at Aug 29, 2012 at 7:25 am


    Hi Philip,
    This whole problem is a bit of a conundrum. I see your point.

    I am trying to imagine a more sophisticated algorithm: ingest the FQDN
    and the incoming IP address from the server connection. If the name
    can be forward-resolved to that IP, then use the name. If it can't be,
    then follow your current method, or just put an IP.
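
The algorithm proposed above can be sketched in Python roughly as follows. This is a sketch under assumptions, not an existing Cloudera feature: the function and parameter names are hypothetical, and the forward check covers IPv4 only because it uses socket.gethostbyname_ex.

```python
import socket


def choose_server_name(claimed_fqdn, source_ip,
                       forward_lookup=socket.gethostbyname_ex,
                       reverse_lookup=socket.gethostbyaddr):
    """Pick the name an agent should use for the server.

    claimed_fqdn is the name the server reports for itself; source_ip is
    the address the connection came from. The lookup parameters exist
    only so the logic can be tested without real DNS.
    """
    # Step 1: use the claimed name if it forward-resolves to the source IP.
    try:
        _name, _aliases, addresses = forward_lookup(claimed_fqdn)
        if source_ip in addresses:
            return claimed_fqdn
    except OSError:
        pass  # the name does not resolve from this node

    # Step 2: otherwise follow the current method (reverse lookup).
    try:
        return reverse_lookup(source_ip)[0]
    except OSError:
        # Step 3: last resort, just put the IP.
        return source_ip
```

The forward check runs on the node, so a name is only accepted when the node itself can resolve it, whether through /etc/hosts or DNS.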

    Not only does Amazon not let you control the reverse lookups, but
    probably a lot of server hosting situations are going to be that way.

    As an aside, how are the IP addresses resolved? First, you name the
    servers (server1, server2, etc.). Then there are static /etc/hosts files
    on every host with the name-to-IP mapping. Alternatively, and even more
    correct: you have internal and external DNS, and update the appropriate
    DNS entries during an IP change.

    The current setup requires me to manually change those Cloudera config
    files, which is more obscure than changing the "well known" items of DNS or
    /etc/hosts, which is what I expect to change during an IP address change.
    Well, it's an additional step - because all the steps have to be done.
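
For anyone doing that manual change on many nodes, here is a small Python sketch of the edit. It assumes the agent config is an INI-style file whose server entry is a line of the form `server_host=...`. That key name is an assumption to check against your own file, and the function name is hypothetical.

```python
import re

# Matches a line like "server_host=old-name", keeping the "server_host=" part.
_SERVER_HOST_LINE = re.compile(r"^([ \t]*server_host[ \t]*=[ \t]*).*$", re.MULTILINE)


def set_server_host(config_text, new_host):
    """Return config_text with the server_host line pointing at new_host."""
    updated, count = _SERVER_HOST_LINE.subn(
        lambda match: match.group(1) + new_host, config_text)
    if count == 0:
        raise ValueError("no server_host line found in config")
    return updated
```

The function works on the file contents as a string, so it can be tried on a copy of /etc/cloudera-scm-agent/config before touching the real one.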

    Sam
  • Sam Darwin at Aug 29, 2012 at 7:43 am
    On the other hand... you are getting some "stability" by using hardcoded
    IPs, and it prevents unexpected outages. By forcing the admin to
    change those IPs manually (which is the current situation, really),
    there will be more well-defined results and fewer surprises, perhaps.


Discussion Overview
group: scm-users @
categories: hadoop
posted: Aug 28, '12 at 1:38p
active: Aug 29, '12 at 7:43a
posts: 4
users: 2
website: cloudera.com
irc: #hadoop
