This implementation is intended to address another potential problem:
if the server is multi-homed and/or multi-addressed, there's no way to know
which addresses/hostnames are routable from the cluster nodes. To that end,
what we do is look at the source IP address of the ssh connection we make to
the node and then, because we don't want to have to use an IP address, we do
a reverse lookup to get a hostname.
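As a rough illustration of that approach (not the actual implementation), here
is a minimal Python sketch. It assumes the source address is read on the node
from the SSH_CONNECTION environment variable that OpenSSH sets; the function
names are hypothetical, and falling back to the raw IP when no reverse record
exists is my assumption.

```python
import os
import socket


def ssh_source_ip(environ=None):
    """Return the client (source) IP of the current ssh session, or None.

    OpenSSH sets SSH_CONNECTION to
    "<client_ip> <client_port> <server_ip> <server_port>".
    """
    environ = os.environ if environ is None else environ
    fields = environ.get("SSH_CONNECTION", "").split()
    return fields[0] if len(fields) == 4 else None


def server_hostname_from_ssh(environ=None, resolver=socket.gethostbyaddr):
    """Reverse-resolve the ssh source IP to a hostname.

    Falls back to the raw IP if the reverse lookup fails (an assumption),
    and returns None if there is no ssh session to inspect.
    """
    ip = ssh_source_ip(environ)
    if ip is None:
        return None
    try:
        # gethostbyaddr returns (hostname, aliases, addresses).
        return resolver(ip)[0]
    except OSError:  # socket.herror and socket.gaierror are both OSError
        return ip
```

The resolver is passed in as a parameter only so the logic can be exercised
without real DNS.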
Yes, with EC2, this doesn't work if the server is restarted and the IP and
hostname change, but it's not clear what a clean automatic solution for EC2
looks like. Asking the server what its IP and hostname are will just yield
the same IP that the node discovered using the ssh/reverse-DNS method, so
that doesn't help.
You can assign an elastic IP to the instance, but even then, basic DNS
lookups will not lead you to this IP or the associated hostname. You need to
do an EC2-specific metadata query to discover the elastic IP/hostname, which
we're not really in a position to do - and then there's the race condition
due to not being able to assign the elastic IP until after the instance is
started, and so on.
Ultimately, it would come down to requiring the user to manually specify the
hostname the server should claim to have. At that point you could type in
the elastic hostname and it would work, but we've been trying to avoid a
manual solution. This is something we can consider, although I can't promise
when it would show up.
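To make the manual option concrete, here is a hedged sketch of what such an
override could look like: use a user-supplied hostname if one is given, and
otherwise fall back to the reverse lookup we do today. The function and
parameter names are hypothetical; nothing like this exists in the product yet.

```python
import socket


def choose_server_hostname(configured_hostname, source_ip,
                           resolver=socket.gethostbyaddr):
    """Pick the hostname that agents should use to reach the server.

    A non-empty, user-specified hostname always wins; otherwise the ssh
    source IP is reverse-resolved, with the raw IP as a last resort.
    """
    if configured_hostname and configured_hostname.strip():
        return configured_hostname.strip()
    try:
        # gethostbyaddr returns (hostname, aliases, addresses).
        return resolver(source_ip)[0]
    except OSError:
        return source_ip
```

With an explicit value such as an elastic hostname, the result no longer
depends on whatever reverse DNS record the provider happens to publish.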
Have I missed something in how you're configuring your ec2 nodes that would
allow an automated mechanism to find the 'right' hostname?
On 28 August 2012 06:38, Sam Darwin wrote:
It looks like the agent config file at /etc/cloudera-scm-agent/config is
using the server host name that it finds via reverse DNS lookup or
something. Now, reverse DNS is a bit esoteric for the average user, and
it's often set incorrectly or set by the service provider to be
something strange, and it's out of the control of the server admin doing the
install.
With EC2, the IP of the server might change; if that happens, the
reverse DNS lookup will change, and all the agents will then be wrong.
If the agent is being installed from a central server, and the central
server knows its own hostname, then this value should be populated from
the server's hostname, and not from a reverse lookup. A defined
hostname (myserver1) will stay constant, even in the face of IP address
changes. And usually the cloudera software GUI is showing
I apologize in advance if I have misread this whole situation.