Here is a patch for host name support in pg_hba.conf. I have reviewed
various past threads about this, and there appeared to have been a 50/50
split of for and against reverse lookup. I went with the reverse
lookup, because

0) I like it.

1) It is more secure.

2) It allows extending it to wildcards in the future.

3) Apache (Allow from) does it that way.

To clarify how it works: The client's IP address (known from the
kernel) is reverse looked up, which results in a host name. That host
name is compared with the line in pg_hba.conf. If it matches, a forward
lookup is performed on the host name to check if any of the resulting IP
addresses match the client's IP address. If yes, the line is considered
to match and the authentication method is selected.

Anyway, assuming we will go with this, you will also notice that in the
patch I changed the default pg_hba.conf to match against "localhost"
instead of numeric addresses. Initially thought of as a temporary
change for testing this patch, I think this might actually have some
permanent value because it saves you from having to change the IPv4 and
IPv6 lines in tandem most of the times, which is a moderately common
mistake. We already rely on localhost being (forward) resolvable for
the stats collector.

Something to think about: Maybe we need a quoting mechanism in case
someone names their hosts "samenet".

Search Discussions

  • Kevin Grittner at Aug 9, 2010 at 6:56 pm

    Peter Eisentraut wrote:

    The client's IP address (known from the kernel)
    Some machines have several IP addresses; how is that handled?
    is reverse looked up, which results in a host name.
    Some IP addresses have several host names, including in reverse
    lookup; how is that handled?

    -Kevin
  • Stephen Frost at Aug 9, 2010 at 7:29 pm

    * Kevin Grittner (Kevin.Grittner@wicourts.gov) wrote:
    The client's IP address (known from the kernel)
    Some machines have several IP addresses; how is that handled?
    Sounds like he already described it, or I read it wrong. The fact that
    some machines have several IP addresses hardly matters- whatever IP is
    used to connect to PG is what gets the reverse DNS lookup. That then
    returns a host. That host is then looked up, and as long as *one* of
    the IPs associated with that host matches the IP of the connector, it's
    good to go.
    is reverse looked up, which results in a host name.
    Some IP addresses have several host names, including in reverse
    lookup; how is that handled?
    Yeahhhh... That's just busted, imnsho. But then, that's probably
    because it breaks Kerberos too. :)

    Thanks,

    Stephen
  • Peter Eisentraut at Aug 10, 2010 at 5:26 am

    On mån, 2010-08-09 at 13:56 -0500, Kevin Grittner wrote:
    Peter Eisentraut wrote:
    The client's IP address (known from the kernel)
    Some machines have several IP addresses; how is that handled?
    A connection comes in over exactly one address.
    is reverse looked up, which results in a host name.
    Some IP addresses have several host names, including in reverse
    lookup; how is that handled?
    This is not possible, or at least the C library APIs don't expose it.
    Compare the getnameinfo() and getaddrinfo() man pages, for example.
  • Stephen Frost at Aug 10, 2010 at 3:23 pm

    * Peter Eisentraut (peter_e@gmx.net) wrote:
    On mån, 2010-08-09 at 13:56 -0500, Kevin Grittner wrote:
    Some IP addresses have several host names, including in reverse
    lookup; how is that handled?
    This is not possible, or at least the C library APIs don't expose it.
    Compare the getnameinfo() and getaddrinfo() man pages, for example.
    Don't know how it happens at a technical level, but I've definitely seen
    it happen before.. Particularly with Windows domains where they don't
    have "clean-up reverse DNS" enabled. Manifests itself by having
    different host names show up on successive requests... Evil in any
    case.

    Stephen
  • Steve Atkins at Aug 10, 2010 at 3:52 pm

    On Aug 10, 2010, at 8:23 AM, Stephen Frost wrote:

    * Peter Eisentraut (peter_e@gmx.net) wrote:
    On mån, 2010-08-09 at 13:56 -0500, Kevin Grittner wrote:
    Some IP addresses have several host names, including in reverse
    lookup; how is that handled?
    This is not possible, or at least the C library APIs don't expose it.
    Compare the getnameinfo() and getaddrinfo() man pages, for example.
    Don't know how it happens at a technical level, but I've definitely seen
    it happen before.. Particularly with Windows domains where they don't
    have "clean-up reverse DNS" enabled. Manifests itself by having
    different host names show up on successive requests... Evil in any
    case.

    Multiple hostnames for a given IP address are supported just fine
    by the DNS. Some C library APIs support this just fine, others
    (such as getnameinfo) have been simplified to make them more
    pleasant to use for the common case of displaying a text representation
    of an IP address in a friendly manner with simple code, at the expense
    of actually returning correct data.

    So getnameinfo() is not suitable for this particular usage. If an
    IP address has multiple hostnames then what getnameinfo() will
    return isn't well-defined (and I believe there's been a trickle of
    bugs in implementations such that sometimes they won't return
    any hostname if there are multiple ones configured in the DNS).

    Any approach to restrict based on hostnames will either need to
    just work with forward DNS resolution of hostnames configured
    in pg_hba.conf to create a list of IP addresses to compare against
    an incoming connection, or it'll need to use a more general
    interface to get the reverse DNS of an incoming connection (e.g.
    gethostbyaddr(), less elegant as that is) before checking forward
    DNS.

    The former approach won't work if we want to support wildcard
    hostnames ("accept connections from *.example.com") - and
    that's the only useful functionality that adding hostname based
    ACLs provides, I think. If we want to do that, we need to use
    gethostbyaddr() to get all the claimed hostnames via reverse
    DNS, and for each of those that matches our ACL do a
    getaddrinfo() to check it resolves to the connecting IP.

    This is something that's pretty common to do in the email world,
    so stealing some robust code from there might be an idea.

    Cheers,
    Steve
  • Thom Brown at Aug 9, 2010 at 6:57 pm

    On 9 August 2010 19:47, Peter Eisentraut wrote:
    Here is a patch for host name support in pg_hba.conf.  I have reviewed
    various past threads about this, and there appeared to have been a 50/50
    split of for and against reverse lookup.  I went with the reverse
    lookup, because

    0) I like it.

    1) It is more secure.

    2) It allows extending it to wildcards in the future.

    3) Apache (Allow from) does it that way.

    To clarify how it works:  The client's IP address (known from the
    kernel) is reverse looked up, which results in a host name.  That host
    name is compared with the line in pg_hba.conf.  If it matches, a forward
    lookup is performed on the host name to check if any of the resulting IP
    addresses match the client's IP address.  If yes, the line is considered
    to match and the authentication method is selected.

    Anyway, assuming we will go with this, you will also notice that in the
    patch I changed the default pg_hba.conf to match against "localhost"
    instead of numeric addresses.  Initially thought of as a temporary
    change for testing this patch, I think this might actually have some
    permanent value because it saves you from having to change the IPv4 and
    IPv6 lines in tandem most of the times, which is a moderately common
    mistake.  We already rely on localhost being (forward) resolvable for
    the stats collector.

    Something to think about: Maybe we need a quoting mechanism in case
    someone names their hosts "samenet".


    --
    A couple things:

    + matches. This field can contain either a host name, an IP
    + address range, one of the special key words mentioned below.
    + </para>

    s/, one/, or one/

    + If a host name is specified (anything that is not an IP address
    + or a special key word is processed as a potential host name), a
    + reverse DNS lookup is performed on the client's IP address,
    + then a forward DNS lookup on the resulting name to check if it
    + matches the original IP address (that is, at least one of the
    + potentially many IP addresses matches the original one), and
    + the name found in the reverse lookup is compared with the
    + specified host name.

    That's one loooong sentence!

    --
    Thom Brown
    Registered Linux user: #516935
  • Robert Haas at Aug 9, 2010 at 7:05 pm

    On Mon, Aug 9, 2010 at 2:47 PM, Peter Eisentraut wrote:
    Here is a patch for host name support in pg_hba.conf.  I have reviewed
    various past threads about this, and there appeared to have been a 50/50
    split of for and against reverse lookup.  I went with the reverse
    lookup, because

    0) I like it.

    1) It is more secure.

    2) It allows extending it to wildcards in the future.

    3) Apache (Allow from) does it that way.

    To clarify how it works:  The client's IP address (known from the
    kernel) is reverse looked up, which results in a host name.  That host
    name is compared with the line in pg_hba.conf.  If it matches, a forward
    lookup is performed on the host name to check if any of the resulting IP
    addresses match the client's IP address.  If yes, the line is considered
    to match and the authentication method is selected.
    Seems reasonable.
    Anyway, assuming we will go with this, you will also notice that in the
    patch I changed the default pg_hba.conf to match against "localhost"
    instead of numeric addresses.  Initially thought of as a temporary
    change for testing this patch, I think this might actually have some
    permanent value because it saves you from having to change the IPv4 and
    IPv6 lines in tandem most of the times, which is a moderately common
    mistake.  We already rely on localhost being (forward) resolvable for
    the stats collector.
    -1 from me, on this part. I think we should be trying to eliminate
    any dependencies we have on how localhost resolves, and we certainly
    should not add more. I have spent way too much time fighting with
    problems caused by localhost being present/absent in /etc/hosts; and
    more caused by whether it maps to 127.0.0.1 or some IP assigned to one
    of the boxes' Ethernet cards. What's really special is when you have
    two parts of the system, one of which will only work with one version
    of /etc/hosts and the other of which will only work with a different
    version. I humbly, but fervently, suggest that we try to avoid being
    polemical about this.

    --
    Robert Haas
    EnterpriseDB: http://www.enterprisedb.com
    The Enterprise Postgres Company
  • Peter Eisentraut at Aug 10, 2010 at 5:24 am

    On mån, 2010-08-09 at 15:05 -0400, Robert Haas wrote:
    -1 from me, on this part. I think we should be trying to eliminate
    any dependencies we have on how localhost resolves, and we certainly
    should not add more.
    Maybe this is something that distributors could add if they have more
    knowledge about the "localhost" policies on their hosts.
  • Tom Lane at Aug 9, 2010 at 7:25 pm

    Peter Eisentraut writes:
    Here is a patch for host name support in pg_hba.conf.
    My recollection is that the previous discussions got stuck on the cost
    of doing DNS lookups for every connect; and the obvious solution of
    trying to cache the names was shot down on the basis of not knowing when
    to flush the cache. Have you decided that people who want this feature
    will just have to pay that cost? If so, I think the documentation needs
    to be a bit more explicit about names being more expensive than IP
    addresses in pg_hba.conf.

    regards, tom lane
  • Kristian Larsson at Aug 9, 2010 at 9:31 pm

    On Mon, Aug 09, 2010 at 03:25:23PM -0400, Tom Lane wrote:
    Peter Eisentraut <peter_e@gmx.net> writes:
    Here is a patch for host name support in pg_hba.conf.
    My recollection is that the previous discussions got stuck on the cost
    of doing DNS lookups for every connect; and the obvious solution of
    trying to cache the names was shot down on the basis of not knowing when
    to flush the cache. Have you decided that people who want this feature
    will just have to pay that cost? If so, I think the documentation needs
    to be a bit more explicit about names being more expensive than IP
    addresses in pg_hba.conf.
    I am also a bit scared of the startup cost for each connection.

    In which order are things evaluated? What if I only include IP
    addresses in my pg_hba, will it still cost me a DNS lookup or
    two?

    Kind regards,
    Kristian.

    --
    Kristian Larsson KLL-RIPE
    +46 704 264511 kll@spritelink.net
  • Peter Eisentraut at Aug 10, 2010 at 5:27 am

    On mån, 2010-08-09 at 23:01 +0200, Kristian Larsson wrote:
    In which order are things evaluated? What if I only include IP
    addresses in my pg_hba, will it still cost me a DNS lookup or
    two?
    No, if you don't use this feature, it won't cost you anything.
  • Peter Eisentraut at Aug 10, 2010 at 5:29 am

    On mån, 2010-08-09 at 15:25 -0400, Tom Lane wrote:
    Peter Eisentraut <peter_e@gmx.net> writes:
    Here is a patch for host name support in pg_hba.conf.
    My recollection is that the previous discussions got stuck on the cost
    of doing DNS lookups for every connect; and the obvious solution of
    trying to cache the names was shot down on the basis of not knowing when
    to flush the cache. Have you decided that people who want this feature
    will just have to pay that cost?
    Either that or install nscd.

    While the speed of establishing a connection is critical to some setups,
    there are also plenty of setups where the connection setup speed is
    pretty much unimportant because of connection pooling.
    If so, I think the documentation needs
    to be a bit more explicit about names being more expensive than IP
    addresses in pg_hba.conf.
    Fair enough.
  • Kevin Grittner at Aug 10, 2010 at 12:32 pm

    Peter Eisentraut wrote:
    On mån, 2010-08-09 at 13:56 -0500, Kevin Grittner wrote:
    Peter Eisentraut wrote:
    is reverse looked up, which results in a host name.
    Some IP addresses have several host names, including in reverse
    lookup; how is that handled?
    This is not possible,
    http://en.wikipedia.org/wiki/Reverse_DNS_lookup#Multiple_pointer_records
    or at least the C library APIs don't expose it.
    That may explain the prevalence of bugs in code dealing with it.

    -Kevin
  • Peter Eisentraut at Aug 10, 2010 at 2:06 pm

    On tis, 2010-08-10 at 07:32 -0500, Kevin Grittner wrote:
    http://en.wikipedia.org/wiki/Reverse_DNS_lookup#Multiple_pointer_records
    Yeah, you can configure all kinds of nonsense and sometimes even get
    away with it, but the basic assumption throughout is that a system has
    one host name and between 1 and many IP addresses. We must make our
    implementation robust again other setups, but we don't have to (or
    rather cannot) support them.
  • Robert Haas at Aug 10, 2010 at 2:11 pm

    On Tue, Aug 10, 2010 at 10:05 AM, Peter Eisentraut wrote:
    On tis, 2010-08-10 at 07:32 -0500, Kevin Grittner wrote:
    http://en.wikipedia.org/wiki/Reverse_DNS_lookup#Multiple_pointer_records
    Yeah, you can configure all kinds of nonsense and sometimes even get
    away with it, but the basic assumption throughout is that a system has
    one host name and between 1 and many IP addresses.
    These days, I think it's more common the other way around: one IP
    address, and many host names.
    We must make our
    implementation robust again other setups, but we don't have to (or
    rather cannot) support them.
    "Cannot" is a good argument for not supporting just about anything.

    --
    Robert Haas
    EnterpriseDB: http://www.enterprisedb.com
    The Enterprise Postgres Company
  • Peter Eisentraut at Aug 10, 2010 at 2:29 pm

    On tis, 2010-08-10 at 10:11 -0400, Robert Haas wrote:
    These days, I think it's more common the other way around: one IP
    address, and many host names.
    Yes, that setup is very common, but it's actually only an illusion that
    DNS creates. The actual machine still has only one host name and some
    IP addresses, as far as the kernel is concerned. As you are surely
    aware, this situation creates all kinds of problems in practice.
  • Kevin Grittner at Aug 10, 2010 at 2:19 pm

    Peter Eisentraut wrote:

    Yeah, you can configure all kinds of nonsense and sometimes even
    get away with it, but the basic assumption throughout is that a
    system has one host name and between 1 and many IP addresses.
    It's hardly nonsense to have multiple names on a machine. While we
    usually avoid having multiple reverse lookup names, we have many
    in-house web applications and we neither want users to access them
    by IP address or have to worry about which web server is hosting
    which applications at the moment. So it's not unusual for one of
    our web servers to have 10 or 15 DNS names for forward lookup. If
    one machine becomes overloaded, we can move an application, change
    the DNS, and everyone's bookmark still works. This is precisely the
    sort of situation where using a hostname in pg_hba.conf would be
    most useful.
    We must make our implementation robust again other setups, but we
    don't have to (or rather cannot) support them.
    Without the logic to ensure that the hostname matches the reverse
    lookup, this might be useful for us. With that logic it is useless
    for us. I'm wondering how much you gain by having it in there. Why
    can't a forward lookup which matches the requesting IP be considered
    sufficient?

    -Kevin
  • Peter Eisentraut at Aug 10, 2010 at 2:32 pm

    On tis, 2010-08-10 at 09:18 -0500, Kevin Grittner wrote:
    Without the logic to ensure that the hostname matches the reverse
    lookup, this might be useful for us. With that logic it is useless
    for us. I'm wondering how much you gain by having it in there. Why
    can't a forward lookup which matches the requesting IP be considered
    sufficient?
    For one thing, because people might like to add wildcard support. So I
    might be able to say

    host all all appserver*.example.com md5
  • Tom Lane at Aug 10, 2010 at 2:59 pm

    Peter Eisentraut writes:
    On tis, 2010-08-10 at 09:18 -0500, Kevin Grittner wrote:
    Why can't a forward lookup which matches the requesting IP be considered
    sufficient?
    For one thing, because people might like to add wildcard support. So I
    might be able to say
    host all all appserver*.example.com md5
    I don't think that the possibility that we might support that in future
    can justify using a slow and not-too-reliable method for ordinary
    non-wildcard names.

    regards, tom lane
  • Tom Lane at Aug 10, 2010 at 2:39 pm

    "Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
    Without the logic to ensure that the hostname matches the reverse
    lookup, this might be useful for us. With that logic it is useless
    for us. I'm wondering how much you gain by having it in there. Why
    can't a forward lookup which matches the requesting IP be considered
    sufficient?
    I was about to complain about that same thing. ISTM the logic ought
    to be that you do a forward DNS lookup on the name presented in
    pg_hba.conf, and if any of the returned IP addresses match the
    connection's remote IP address, then you have a match. This business
    with doing a reverse lookup is at least twice as expensive, far more
    fragile, and it seems completely bogus from a security viewpoint.
    Why should I trust the RDNS server for an attacker's IP address?

    regards, tom lane
  • Aidan Van Dyk at Aug 10, 2010 at 2:50 pm

    * Tom Lane [100810 10:39]:

    I was about to complain about that same thing. ISTM the logic ought
    to be that you do a forward DNS lookup on the name presented in
    pg_hba.conf, and if any of the returned IP addresses match the
    connection's remote IP address, then you have a match. This business
    with doing a reverse lookup is at least twice as expensive, far more
    fragile, and it seems completely bogus from a security viewpoint.
    Why should I trust the RDNS server for an attacker's IP address?
    Well, you don't trust the RDNS of the IP, you trust the normal lookup of
    the hostname returned by the RDNS. So if some other ip network is
    trying to give hostnames that should be authorized, you see that
    immediately when you resolve the "authorized" hostname and it doesn't
    give you that IP.

    The PTR query is a means to get the "hostname" to check against, so you
    d'nt have to pre-cache all thos possible results of all the hostnames.
    Pre-caching all the hostnames in pg_hba.conf is madness. How long do
    you cache them for? or do send out 1000 queries every connection? You
    can't support wildcards, or anythign usefull...

    AFAIK, every software I've used which allows hostnames as some
    connection control all do PTR->A/AAAA lookups as Peter proposed.

    a.


    --
    Aidan Van Dyk Create like a god,
    aidan@highrise.ca command like a king,
    http://www.highrise.ca/ work like a slave.
  • Stephen Frost at Aug 10, 2010 at 3:39 pm

    * Aidan Van Dyk (aidan@highrise.ca) wrote:
    The PTR query is a means to get the "hostname" to check against, so you
    d'nt have to pre-cache all thos possible results of all the hostnames.
    Pre-caching all the hostnames in pg_hba.conf is madness. How long do
    you cache them for? or do send out 1000 queries every connection? You
    can't support wildcards, or anythign usefull...

    AFAIK, every software I've used which allows hostnames as some
    connection control all do PTR->A/AAAA lookups as Peter proposed.
    Completely agreed. It's madness to precache all thse hostnames, but we
    need to figure out the hostname, thus, rDNS is used. The forward lookup
    is then to double-check that it matches. This is exactly how Kerberos
    works also. You certainly don't want to be repeatedly doing rDNS
    lookups to see if maybe that IP has other hosts. I also don't buy that
    there's an issue with setting up your rDNS to go to what you put in the
    pg_hba and then having the forward of that include the IP; again, it's
    how Kerberos works, and even if you don't believe in Kerberos, I hope
    you realize it's kind of popular.

    Thanks,

    Stephen
  • Peter Eisentraut at Aug 10, 2010 at 2:56 pm

    On tis, 2010-08-10 at 10:39 -0400, Tom Lane wrote:
    I was about to complain about that same thing. ISTM the logic ought
    to be that you do a forward DNS lookup on the name presented in
    pg_hba.conf, and if any of the returned IP addresses match the
    connection's remote IP address, then you have a match. This business
    with doing a reverse lookup is at least twice as expensive, far more
    fragile, and it seems completely bogus from a security viewpoint.
    If you have hundreds on lines in pg_hba.conf, then you need to do
    hundreds of DNS lookups per connection attempt (*), and each of those
    lookups could result in even more IP addresses, or could time out. So
    if some unrelated part of the system breaks (DNS down), it could take
    you hours to establish a connection. On the other hand, with the
    reverse DNS lookup, you would normally do about two DNS queries per
    successful connection attempt, and those would only be in relation to
    the machines actually involved in the connection. Also, if you are in a
    names-only environment, you might also like to turn on log_hostnames, in
    which case the reverse lookup is free (well, shared).

    (*) That could of course be addressed by your earlier idea of caching
    the resolved names when pg_hba.conf is read, but I don't think many
    people were on board with that idea.
  • Tom Lane at Aug 10, 2010 at 3:08 pm

    Peter Eisentraut writes:
    On tis, 2010-08-10 at 10:39 -0400, Tom Lane wrote:
    I was about to complain about that same thing. ISTM the logic ought
    to be that you do a forward DNS lookup on the name presented in
    pg_hba.conf, and if any of the returned IP addresses match the
    connection's remote IP address, then you have a match. This business
    with doing a reverse lookup is at least twice as expensive, far more
    fragile, and it seems completely bogus from a security viewpoint.
    If you have hundreds on lines in pg_hba.conf, then you need to do
    hundreds of DNS lookups per connection attempt (*), and each of those
    lookups could result in even more IP addresses, or could time out.
    If you have a configuration that would actually require that, then you
    would have a case for using a wildcard. My complaint is that you're
    trying to force everyone to pay for that feature whether it's of use to
    them or not. I think it's at least as likely that typical setups would
    need exactly *one*, non wildcard, entry, to wit appserver.mycompany.com
    (which'd necessarily yield IPs for all the machines running your app
    server code).

    regards, tom lane
  • Stephen Frost at Aug 10, 2010 at 3:45 pm

    * Kevin Grittner (Kevin.Grittner@wicourts.gov) wrote:
    It's hardly nonsense to have multiple names on a machine. While we
    usually avoid having multiple reverse lookup names, we have many
    in-house web applications and we neither want users to access them
    by IP address or have to worry about which web server is hosting
    which applications at the moment. So it's not unusual for one of
    our web servers to have 10 or 15 DNS names for forward lookup. If
    one machine becomes overloaded, we can move an application, change
    the DNS, and everyone's bookmark still works. This is precisely the
    sort of situation where using a hostname in pg_hba.conf would be
    most useful.
    We're talking about client systems connecting to PG here. Are you
    authenticating your web users by looking at their client addresses..?
    That strikes me as pretty unlikely. Your web servers will be connecting
    to your PG server from *one* address (whatever the main one is for that
    pariticular server), and that address just needs to have an rDNS entry
    that goes to a host whose forward DNS includes that IP.

    If you have multiple web servers that are connecting to the same PG
    database, then have multiple pg_hba entries, or make them all have the
    same hostname per reverse DNS (though I don't really see why you'd want
    to).
    We must make our implementation robust again other setups, but we
    don't have to (or rather cannot) support them.
    Without the logic to ensure that the hostname matches the reverse
    lookup, this might be useful for us. With that logic it is useless
    for us. I'm wondering how much you gain by having it in there. Why
    can't a forward lookup which matches the requesting IP be considered
    sufficient?
    Because "you can't get there from here". You'd either have to cache all
    the entries in pg_hba (which is horrible), or do a look-up on each one
    on every connection (which is going to be a hell of alot slower than
    doing one more DNS lookup here). This isn't magic. What we have is a
    bunch of host names and a single IP (the connecting one). Figuring out
    which one goes with which is the issue.

    Stephen
  • KaiGai Kohei at Oct 5, 2010 at 3:36 am
    It seems to me this patch has been left for a long time, although it tries
    to provide a useful functionality.

    In the previous discussion, several issues were pointed out.

    * Case handling when an IP-address has multiple hostnames.

    Steve Atkins noticed getaddrinfo() does not have capability to handle
    this case, because it returns just one hostname from the candidates.
    So, he suggested to use gethostbyaddr() instead, because it can return
    all the candidates of hostnames.

    I also think gethostbyaddr() is more prefereble here rather than
    getaddrinfo() with the primary hostname as a fallback.

    How about the following idea?

    1. Add a list of hostname on Port structure to store all the candidates
    of hostnames for the current remote host.
    2. Revise the pg_getnameinfo_all() to call the gethostbyaddr(), then
    it constructs the list from the returned hostent->h_aliases.
    3. Revise Port->remote_host to indicate contents of the list head.
    4. Revise the new check_hostname() to compare the supplied hostname
    from pg_hba.conf and each hostnames in the above list.

    If pg_getnameinfo_all() would have all the candidates of hostnames,
    we don't need to lookup dns twice on fallbacks.


    * "localhost" in the default pg_hba.conf

    Robert Haas disagreed to switch "localhost" in the default pg_hba.conf
    into numeric expression, because /etc/host should determine where the
    localhost shall be mapped.

    I agree with the opinion from Robert. We should not have any assumptions
    of /ets/hosts settings.


    * Cost for reverse lookup of DNS

    Tom Lane concerned about the cost to communicate with DNS for each
    connections. You answered either caching mechanism in PG or nscd is
    a solution, and agreed to add an explicit description about names
    being more expensive than IP addresses in pg_hba.conf.

    I also think it is fair enough. How about adding a recommendation
    to install nscd when people uses hostname in pg_hba.conf.

    Thanks,

    (2010/08/10 3:47), Peter Eisentraut wrote:
    Here is a patch for host name support in pg_hba.conf. I have reviewed
    various past threads about this, and there appeared to have been a 50/50
    split of for and against reverse lookup. I went with the reverse
    lookup, because

    0) I like it.

    1) It is more secure.

    2) It allows extending it to wildcards in the future.

    3) Apache (Allow from) does it that way.

    To clarify how it works: The client's IP address (known from the
    kernel) is reverse looked up, which results in a host name. That host
    name is compared with the line in pg_hba.conf. If it matches, a forward
    lookup is performed on the host name to check if any of the resulting IP
    addresses match the client's IP address. If yes, the line is considered
    to match and the authentication method is selected.

    Anyway, assuming we will go with this, you will also notice that in the
    patch I changed the default pg_hba.conf to match against "localhost"
    instead of numeric addresses. Initially thought of as a temporary
    change for testing this patch, I think this might actually have some
    permanent value because it saves you from having to change the IPv4 and
    IPv6 lines in tandem most of the times, which is a moderately common
    mistake. We already rely on localhost being (forward) resolvable for
    the stats collector.

    Something to think about: Maybe we need a quoting mechanism in case
    someone names their hosts "samenet".



    --
    KaiGai Kohei <kaigai@ak.jp.nec.com>
  • Peter Eisentraut at Oct 5, 2010 at 7:41 pm

    On tis, 2010-10-05 at 12:35 +0900, KaiGai Kohei wrote:
    It seems to me this patch has been left for a long time, although it tries
    to provide a useful functionality.

    In the previous discussion, several issues were pointed out.

    * Case handling when an IP-address has multiple hostnames.

    Steve Atkins noticed getaddrinfo() does not have capability to handle
    this case, because it returns just one hostname from the candidates.
    So, he suggested to use gethostbyaddr() instead, because it can return
    all the candidates of hostnames.

    I also think gethostbyaddr() is more prefereble here rather than
    getaddrinfo() with the primary hostname as a fallback.
    I have several concerns about this approach.

    The first is conceptual: Assume that an IP address resolves as host
    names "foo" and "bar", perhaps even by accident. You have log_hostnames
    on, which would be typical if you have an all-names pg_hba.conf.
    log_hostnames only logs the first host name, say, "foo". And then
    assume that pg_hba.conf is set up to only allow "bar" in. There you
    have a debugging and auditing nightmare, because what pg_hba.conf says
    and what you are logging doesn't match. To address this you would have
    to change the log_hostnames facility to log *all* host names everywhere
    the host name is mentioned, which could make this whole thing quite
    silly.

    Secondly, long-standing and presumably reputable implementations of a
    similar feature, namely Apache's mod_authz_host and tcp-wrappers use
    getnameinfo() in preference of gethostbyaddr(), and don't support
    multiple host names per IP address. In fact, reading through that
    source code indicates that gethostbyaddr() has all kinds of bugs and
    issues, including apparently lack of IPv6 support (on some platforms?).

    Thirdly, gethostbyname() and gethostbyaddr() are deprecated by POSIX in
    favor of getaddrinfo() and getnameinfo(), so we shouldn't build new
    features that depend on them.
    * "localhost" in the default pg_hba.conf

    Robert Haas disagreed to switch "localhost" in the default pg_hba.conf
    into numeric expression, because /etc/host should determine where the
    localhost shall be mapped.

    I agree with the opinion from Robert. We should not have any assumptions
    of /ets/hosts settings.
    Note that we already default listen_addresses to 'localhost', so it
    would only make sense to have pg_hba.conf behave the same by default.
    To pick up on your argument, we effectively *do* make assumptions
    about /etc/hosts now, and this change would remove them.

    Note that this would only be a default. But if you think that it's the
    wrong default, then we should consider changing the listen_addresses
    default as well.

    The reason why I think this is semi-important and not just cosmetic is
    that (for some reason that is not entirely clear to me) clients
    connecting to localhost end up appearing to the server as ::1 on a lot
    of machines I use which are not otherwise keen on IPv6, and it is a
    common mistake to forget to keep the pg_hba.conf entries for 127.0.0.1
    and ::1 in sync.
  • KaiGai Kohei at Oct 6, 2010 at 1:22 am

    (2010/10/06 4:41), Peter Eisentraut wrote:
    On tis, 2010-10-05 at 12:35 +0900, KaiGai Kohei wrote:
    It seems to me this patch has been left for a long time, although it tries
    to provide a useful functionality.

    In the previous discussion, several issues were pointed out.

    * Case handling when an IP-address has multiple hostnames.

    Steve Atkins noticed getaddrinfo() does not have capability to handle
    this case, because it returns just one hostname from the candidates.
    So, he suggested to use gethostbyaddr() instead, because it can return
    all the candidates of hostnames.

    I also think gethostbyaddr() is more prefereble here rather than
    getaddrinfo() with the primary hostname as a fallback.
    I have several concerns about this approach.

    The first is conceptual: Assume that an IP address resolves as host
    names "foo" and "bar", perhaps even by accident. You have log_hostnames
    on, which would be typical if you have an all-names pg_hba.conf.
    log_hostnames only logs the first host name, say, "foo". And then
    assume that pg_hba.conf is set up to only allow "bar" in. There you
    have a debugging and auditing nightmare, because what pg_hba.conf says
    and what you are logging doesn't match. To address this you would have
    to change the log_hostnames facility to log *all* host names everywhere
    the host name is mentioned, which could make this whole thing quite
    silly.

    Secondly, long-standing and presumably reputable implementations of a
    similar feature, namely Apache's mod_authz_host and tcp-wrappers use
    getnameinfo() in preference of gethostbyaddr(), and don't support
    multiple host names per IP address. In fact, reading through that
    source code indicates that gethostbyaddr() has all kinds of bugs and
    issues, including apparently lack of IPv6 support (on some platforms?).

    Thirdly, gethostbyname() and gethostbyaddr() are deprecated by POSIX in
    favor of getaddrinfo() and getnameinfo(), so we shouldn't build new
    features that depend on them.
    OK, it seems to me fair enough. I consented with your explanation.

    I'll check the patch for more details, please wait for a few days.
    * "localhost" in the default pg_hba.conf

    Robert Haas disagreed to switch "localhost" in the default pg_hba.conf
    into numeric expression, because /etc/host should determine where the
    localhost shall be mapped.

    I agree with the opinion from Robert. We should not have any assumptions
    of /ets/hosts settings.
    Note that we already default listen_addresses to 'localhost', so it
    would only make sense to have pg_hba.conf behave the same by default.
    To pick up on your argument, we effectively *do* make assumptions
    about /etc/hosts now, and this change would remove them.
    Sorry, I misread something.
    I read the previous discussions again, then I know I misread the reason
    why Robert disagreed with this replacement. He said we should not assume
    resolve of localhost being enough fast because of local /etc/hosts, not
    saying we should not assume localhost is "127.0.0.1" or "::1". Right?

    Well, in my personal opinion, we should not assume the way to resolve
    localhost, but we can expect more than 99.9% of hosts resolve localhost
    using local /etc/hosts. Even if here is a curious setting, it will pay
    a bit more cost on connection. It is a responsibility of DBA.

    I agree with replacement "127.0.0.1" and "::1" by "localhost".
    It enables to eliminate an assumption that localhost have either of
    their addresses.

    Thanks,
    --
    KaiGai Kohei <kaigai@ak.jp.nec.com>
  • Tom Lane at Oct 6, 2010 at 2:18 am

    KaiGai Kohei writes:
    Note that we already default listen_addresses to 'localhost', so it
    would only make sense to have pg_hba.conf behave the same by default.
    To pick up on your argument, we effectively *do* make assumptions
    about /etc/hosts now, and this change would remove them.
    Sorry, I misread something.
    I read the previous discussions again, then I know I misread the reason
    why Robert disagreed with this replacement. He said we should not assume
    resolve of localhost being enough fast because of local /etc/hosts, not
    saying we should not assume localhost is "127.0.0.1" or "::1". Right?
    Well, in my personal opinion, we should not assume the way to resolve
    localhost, but we can expect more than 99.9% of hosts resolve localhost
    using local /etc/hosts. Even if here is a curious setting, it will pay
    a bit more cost on connection. It is a responsibility of DBA.
    I agree with replacement "127.0.0.1" and "::1" by "localhost".
    It enables to eliminate an assumption that localhost have either of
    their addresses.
    This argument is completely unfounded in reality. Please go read the
    relevant RFCs. 127.0.0.1 is standardized as the IPv4 loopback address
    (see RFC3330), and ::1 is standardized as the IPv6 loopback address (see
    RFC1884 section 2.4.3). So far as I can find, there is *no* standard
    mandating that localhost means the loopback address. RFC1537 suggests
    that DNS domains "should" resolve localhost.anything as 127.0.0.1; but
    that is a lot weaker than the other specifications, and there's nothing
    whatever to promise that it will work in a DNS-less environment.

    In fact, we have seen cases where it didn't work even with publicly
    available DNS servers, eg
    http://archives.postgresql.org/pgsql-admin/2010-05/msg00073.php

    That example leads me to think that using localhost in the default
    pg_hba.conf file would actually be a security hazard: you would be
    placing it in the hands of your DNS provider as to which addresses
    Postgres will believe are "local" connections. That's an OK decision
    for individual admins to make, but it's not a good idea for us to ship
    it as a universal default. (Note that relying on the DNS provider to
    interpret listen_addresses is not nearly as dangerous, since in any case
    the kernel isn't going to let us bind() to nonlocal addresses.)

    On top of that, there's no way for your DNS server to know whether your
    local kernel speaks IPv6 or not, so you might not get a resolution of
    the name that includes the appropriate loopback addresses. This may or
    may not have anything to do with the reports we occasionally get of
    people having to change listen_addresses to "*" to get things to work.

    Between these issues and the speed concern, I don't think that we should
    change this.

    regards, tom lane
  • Stephen Frost at Oct 6, 2010 at 2:26 am

    * Tom Lane (tgl@sss.pgh.pa.us) wrote:
    Between these issues and the speed concern, I don't think that we should
    change this.
    +1.

    Stephen
  • Peter Eisentraut at Oct 6, 2010 at 8:05 am

    On tis, 2010-10-05 at 22:17 -0400, Tom Lane wrote:
    So far as I can find, there is *no* standard
    mandating that localhost means the loopback address.
    Should we then change pgstat.c to use IP addresses instead of hardcoding
    "localhost"?
  • Andrew Dunstan at Oct 6, 2010 at 11:53 am

    On 10/06/2010 04:05 AM, Peter Eisentraut wrote:
    On tis, 2010-10-05 at 22:17 -0400, Tom Lane wrote:
    So far as I can find, there is *no* standard
    mandating that localhost means the loopback address.
    Should we then change pgstat.c to use IP addresses instead of hardcoding
    "localhost"?
    I understood Tom to be saying we should not rely on "localhost" for
    authentication, not that we shouldn't use it at all.

    cheers

    andrew
  • Tom Lane at Oct 6, 2010 at 1:18 pm

    Andrew Dunstan writes:
    On 10/06/2010 04:05 AM, Peter Eisentraut wrote:
    On tis, 2010-10-05 at 22:17 -0400, Tom Lane wrote:
    So far as I can find, there is *no* standard
    mandating that localhost means the loopback address.
    Should we then change pgstat.c to use IP addresses instead of hardcoding
    "localhost"?
    I understood Tom to be saying we should not rely on "localhost" for
    authentication, not that we shouldn't use it at all.
    I think it's all right to use it as the default value for
    listen_addresses, because (1) it's an understandable default,
    and (2) users can change the setting if it doesn't work.

    However, the usage in pgstat.c is hard-wired, meaning that if you
    have a configuration where "localhost" doesn't resolve correctly
    for whatever reason, there's no simple recourse to get the stats
    collector working. So ISTM there is an argument for changing that.

    regards, tom lane
  • Magnus Hagander at Oct 6, 2010 at 1:21 pm

    On Wed, Oct 6, 2010 at 15:16, Tom Lane wrote:
    Andrew Dunstan <andrew@dunslane.net> writes:
    On 10/06/2010 04:05 AM, Peter Eisentraut wrote:
    On tis, 2010-10-05 at 22:17 -0400, Tom Lane wrote:
    So far as I can find, there is *no* standard
    mandating that localhost means the loopback address.
    Should we then change pgstat.c to use IP addresses instead of hardcoding
    "localhost"?
    I understood Tom to be saying we should not rely on "localhost" for
    authentication, not that we shouldn't use it at all.
    I think it's all right to use it as the default value for
    listen_addresses, because (1) it's an understandable default,
    and (2) users can change the setting if it doesn't work.

    However, the usage in pgstat.c is hard-wired, meaning that if you
    have a configuration where "localhost" doesn't resolve correctly
    for whatever reason, there's no simple recourse to get the stats
    collector working.  So ISTM there is an argument for changing that.
    Well, hardcoding it will break the (unusual) case when localhost isn't
    127.0.0.1 / ::1. (You'd obviously have to have it try both ipv4 and
    ipv6).

    It's not common, but i've certainly come across a number of virtual
    machines where localhost resolves (through /etc/hosts) to the machines
    "real" IP rather than 127.0.01, because 127.0.0.1 simply doesn't
    exist.
  • Tom Lane at Oct 6, 2010 at 1:34 pm

    Magnus Hagander writes:
    On Wed, Oct 6, 2010 at 15:16, Tom Lane wrote:
    However, the usage in pgstat.c is hard-wired, meaning that if you
    have a configuration where "localhost" doesn't resolve correctly
    for whatever reason, there's no simple recourse to get the stats
    collector working. So ISTM there is an argument for changing that.
    Well, hardcoding it will break the (unusual) case when localhost isn't
    127.0.0.1 / ::1. (You'd obviously have to have it try both ipv4 and
    ipv6).
    You didn't read what I wrote before. Those numeric addresses define the
    loopback address, *not* "localhost". When localhost fails to resolve
    as those address(es), it's localhost that is wrong. We have actually
    seen this in the field with bogus DNS providers.
    It's not common, but i've certainly come across a number of virtual
    machines where localhost resolves (through /etc/hosts) to the machines
    "real" IP rather than 127.0.01, because 127.0.0.1 simply doesn't
    exist.
    That appears to me to be a broken (non RFC compliant) VM setup.
    However, maybe what this is telling us is we need to expose the setting?
    Or perhaps better, try 127.0.0.1, ::1, localhost, in that order.

    regards, tom lane
  • Magnus Hagander at Oct 6, 2010 at 1:44 pm

    On Wed, Oct 6, 2010 at 15:34, Tom Lane wrote:
    Magnus Hagander <magnus@hagander.net> writes:
    On Wed, Oct 6, 2010 at 15:16, Tom Lane wrote:
    However, the usage in pgstat.c is hard-wired, meaning that if you
    have a configuration where "localhost" doesn't resolve correctly
    for whatever reason, there's no simple recourse to get the stats
    collector working. So ISTM there is an argument for changing that.
    Well, hardcoding it will break the (unusual) case when localhost isn't
    127.0.0.1 / ::1. (You'd obviously have to have it try both ipv4 and
    ipv6).
    You didn't read what I wrote before.  Those numeric addresses define the
    loopback address, *not* "localhost".  When localhost fails to resolve
    as those address(es), it's localhost that is wrong.  We have actually
    seen this in the field with bogus DNS providers.
    It's not common, but i've certainly come across a number of virtual
    machines where localhost resolves (through /etc/hosts) to the machines
    "real" IP rather than 127.0.01, because 127.0.0.1 simply doesn't
    exist.
    That appears to me to be a broken (non RFC compliant) VM setup.
    Can't argue with that. But it exists.

    However, maybe what this is telling us is we need to expose the setting?
    Or perhaps better, try 127.0.0.1, ::1, localhost, in that order.
    That was kind of my point, that yes, we probably need to do one of
    those at least. Today it is "kind of exposed", because you can edit
    /etc/hosts - you don't need to rely on DNS for it. I just don't want
    to lose that ability.
  • Stephen Frost at Oct 6, 2010 at 1:49 pm

    * Tom Lane (tgl@sss.pgh.pa.us) wrote:
    That appears to me to be a broken (non RFC compliant) VM setup.
    However, maybe what this is telling us is we need to expose the setting?
    Or perhaps better, try 127.0.0.1, ::1, localhost, in that order.
    Yeah, I'd be happier if we exposed it, to be honest. Either that, or
    figure out a way to get rid of it entirely by using a different method,
    but that's a much bigger issue.

    Thanks,

    Stephen
  • Andrew Dunstan at Oct 6, 2010 at 2:11 pm

    On 10/06/2010 09:49 AM, Stephen Frost wrote:
    * Tom Lane (tgl@sss.pgh.pa.us) wrote:
    That appears to me to be a broken (non RFC compliant) VM setup.
    However, maybe what this is telling us is we need to expose the setting?
    Or perhaps better, try 127.0.0.1, ::1, localhost, in that order.
    Yeah, I'd be happier if we exposed it, to be honest. Either that, or
    figure out a way to get rid of it entirely by using a different method,
    but that's a much bigger issue.
    Please don't expose it. It will a source of yet more confusion. People
    already get confused by the difference between listening addresses and
    pg_hba.conf addresses. It's one of the most frequent points of confusion
    seen on IRC. Adding another address to configure will just compound the
    confusion badly. I much prefer Tom's last suggestion.

    cheers

    andrew
  • Greg Stark at Oct 6, 2010 at 8:10 pm

    On Wed, Oct 6, 2010 at 6:21 AM, Magnus Hagander wrote:
    It's not common, but i've certainly come across a number of virtual
    machines where localhost resolves (through /etc/hosts) to the machines
    "real" IP rather than 127.0.01, because 127.0.0.1 simply doesn't
    exist.
    It's perfectly fine for localhost to resolve to the machine's external
    ip address. It would be weird for it to resolve to some other host's
    ip address like the vm's host machine. But having 127.0.0.1 not exist
    would be positively broken. All kinds of things wouldn't work. Are you
    sure about that part?


    --
    greg
  • Tom Lane at Oct 6, 2010 at 1:12 pm

    Peter Eisentraut writes:
    On tis, 2010-10-05 at 22:17 -0400, Tom Lane wrote:
    So far as I can find, there is *no* standard
    mandating that localhost means the loopback address.
    Should we then change pgstat.c to use IP addresses instead of hardcoding
    "localhost"?
    Hm, perhaps so.

    regards, tom lane
  • KaiGai Kohei at Oct 7, 2010 at 3:46 am

    (2010/10/06 10:21), KaiGai Kohei wrote:
    I'll check the patch for more details, please wait for a few days.
    I could find out some matters in this patch, independent from the
    discussion of localhost. (About pg_hba.conf.sample, I'm sorry for
    the missuggestion. Please fix up it according to Tom's comment.)

    * The logic is still unclear for me.

    The check_hostname() immediately returns with false, if the resolved
    remote hostname is NOT matched with the hostname described in pg_hba.conf.
    +static bool
    +check_hostname(hbaPort *port, const char *hostname)
    +{
    + struct addrinfo *gai_result, *gai;
    + int ret;
    + bool found;
    +
    + /* Lookup remote host name if not already done */
    + if (!port->remote_hostname)
    + {
    + char remote_hostname[NI_MAXHOST];
    +
    + if (pg_getnameinfo_all(&port->raddr.addr, port->raddr.salen,
    + remote_hostname, sizeof(remote_hostname),
    + NULL, 0,
    + 0))
    + return false;
    +
    + port->remote_hostname = strdup(remote_hostname);
    + }
    +
    + if (strcmp(port->remote_hostname, hostname) != 0)
    + return false;
    +
    + /* Lookup IP from host name and check against original IP */
    However, it seems to me you expected an opposite behavior.

    If the resolved hostname is matched with the hostname described
    in pg_hba.conf, we can consider this HbaLine to be a suitable
    configuration without any fallbacks. Right?
    It so, it should be as follows:

    if (strcmp(port->remote_hostname, hostname) == 0)
    return true;

    In addition, we should go the rest of fallback code on mismatch
    cases only, don't we?

    * Why getnameinfo() in the fallback loop?

    At the fallback code when the hostname was matched (I believe this code
    is intended to handle the case when hostname was NOT matched.) calls
    getnameinfo() for each candidate of remote addresses.
    But its result is not referenced by anybody. Is it really necessary?
    + found = false;
    + for (gai = gai_result; gai; gai = gai->ai_next)
    + {
    + char hostinfo[NI_MAXHOST];
    +
    + getnameinfo(gai->ai_addr, gai->ai_addrlen,
    + hostinfo, sizeof(hostinfo),
    + NULL, 0,
    + NI_NUMERICHOST);
    +
    + if (gai->ai_addr->sa_family == port->raddr.addr.ss_family)
    + {
    + if (gai->ai_addr->sa_family == AF_INET)
    + {
    + if (ipv4eq((struct sockaddr_in *) gai->ai_addr,
    + (struct sockaddr_in *) &port->raddr.addr))
    + {
    + found = true;
    + break;
    + }
    + }
    + else if (gai->ai_addr->sa_family == AF_INET6)
    + {
    + if (ipv6eq((struct sockaddr_in6 *) gai->ai_addr,
    + (struct sockaddr_in6 *) &port->raddr.addr))
    + {
    + found = true;
    + break;
    + }
    + }
    + }
    + }
    +
    + if (gai_result)
    + freeaddrinfo(gai_result);
    +
    + return found;
    +}
    * Slash ('/') after the hostname

    At the parse_hba_line(), the parsed token which contains either
    hostname or cidr address is sliced into two parts on the first '/'
    character, if exist.
    Then, even if cidr_slash is not NULL, it shall be ignored when
    top-half of the token is hostname, not numeric address.
    else
    {
    /* IP and netmask are specified */
    parsedline->ip_cmp_method = ipCmpMask;

    /* need a modifiable copy of token */
    token = pstrdup(token);

    /* Check if it has a CIDR suffix and if so isolate it */
    cidr_slash = strchr(token, '/');
    if (cidr_slash)
    *cidr_slash = '\0'; :
    ret = pg_getaddrinfo_all(token, NULL, &hints, &gai_result);
    - if (ret || !gai_result)
    + if (ret == 0 && gai_result)
    + memcpy(&parsedline->addr, gai_result->ai_addr,
    + gai_result->ai_addrlen);
    + else if (ret == EAI_NONAME)
    + parsedline->hostname = token;
    + else
    {
    It allows the following configuration works without any errors.
    (In fact, it works for me.)

    # IPv4/6 local connections:
    host all all kaigai.myhome.cx/today_is_sunny trust

    It seems to me, we should raise an error, if both of cidr_slash and
    parsedline->hostname are not NULL.

    Thanks,
    --
    KaiGai Kohei <kaigai@ak.jp.nec.com>
  • Peter Eisentraut at Oct 11, 2010 at 6:34 pm

    On tor, 2010-10-07 at 12:45 +0900, KaiGai Kohei wrote:
    * The logic is still unclear for me.

    The check_hostname() immediately returns with false, if the resolved
    remote hostname is NOT matched with the hostname described in pg_hba.conf.
    If the resolved hostname is matched with the hostname described
    in pg_hba.conf, we can consider this HbaLine to be a suitable
    configuration without any fallbacks. Right?
    It so, it should be as follows:

    if (strcmp(port->remote_hostname, hostname) == 0)
    return true;

    In addition, we should go the rest of fallback code on mismatch
    cases only, don't we?
    The code below that is not a fallback, it is the second part of the
    double DNS lookup that has been extensively discussed throughout this
    thread. The logic is

    get hostname from client's IP address
    strcmp hostname to pg_hba.conf
    get IP address from hostname
    if that IP address == client's IP address; then pg_hba.conf entry OK
    * Why getnameinfo() in the fallback loop?
    I checked through my git history; this was actually a leftover from some
    debugging code. I'll remove it.
    * Slash ('/') after the hostname

    At the parse_hba_line(), the parsed token which contains either
    hostname or cidr address is sliced into two parts on the first '/'
    character, if exist.
    Then, even if cidr_slash is not NULL, it shall be ignored when
    top-half of the token is hostname, not numeric address.
    OK, I'll fix that.
  • KaiGai Kohei at Oct 12, 2010 at 1:12 am

    (2010/10/12 3:34), Peter Eisentraut wrote:
    On tor, 2010-10-07 at 12:45 +0900, KaiGai Kohei wrote:
    * The logic is still unclear for me.

    The check_hostname() immediately returns with false, if the resolved
    remote hostname is NOT matched with the hostname described in pg_hba.conf.
    If the resolved hostname is matched with the hostname described
    in pg_hba.conf, we can consider this HbaLine to be a suitable
    configuration without any fallbacks. Right?
    It so, it should be as follows:

    if (strcmp(port->remote_hostname, hostname) == 0)
    return true;

    In addition, we should go the rest of fallback code on mismatch
    cases only, don't we?
    The code below that is not a fallback, it is the second part of the
    double DNS lookup that has been extensively discussed throughout this
    thread. The logic is

    get hostname from client's IP address
    strcmp hostname to pg_hba.conf
    get IP address from hostname
    if that IP address == client's IP address; then pg_hba.conf entry OK
    Sorry, I missed what you introduced at the head of this thread.
    When an entry passes the checks, this routine always checks both of
    directions for the supplied entries.

    BTW, I have one other question.
    Is it really necessary to check reverse dns entries?
    If check_hostname() compares remote address of the client and all
    the candidates retrieved using getaddrinfo() for the hostname in
    pg_hba.conf line (not port->remote_hostname), it seems to me we don't
    need to set up reverse dns entries.
    A typical ISP often doesn't assign reverse dns entries for cheap class
    rental server being shared by multiple users, for instance. It seem to
    me this idea allows to apply this new feature more widely.
    # Let's try nslookup to www.kaigai.gr.jp, and its retrieved address. :-)

    Of course, it is just my idea. If dislike it, please ignore it.
    * Why getnameinfo() in the fallback loop?
    I checked through my git history; this was actually a leftover from some
    debugging code. I'll remove it.
    * Slash ('/') after the hostname

    At the parse_hba_line(), the parsed token which contains either
    hostname or cidr address is sliced into two parts on the first '/'
    character, if exist.
    Then, even if cidr_slash is not NULL, it shall be ignored when
    top-half of the token is hostname, not numeric address.
    OK, I'll fix that.


    --
    KaiGai Kohei <kaigai@ak.jp.nec.com>
  • Peter Eisentraut at Oct 12, 2010 at 5:22 pm

    On tis, 2010-10-12 at 10:11 +0900, KaiGai Kohei wrote:
    Is it really necessary to check reverse dns entries?
    This has been extensively discussed in this thread.
  • Peter Eisentraut at Oct 12, 2010 at 5:20 pm

    On mån, 2010-10-11 at 21:34 +0300, Peter Eisentraut wrote:
    * Why getnameinfo() in the fallback loop?
    I checked through my git history; this was actually a leftover from
    some
    debugging code. I'll remove it.
    * Slash ('/') after the hostname

    At the parse_hba_line(), the parsed token which contains either
    hostname or cidr address is sliced into two parts on the first '/'
    character, if exist.
    Then, even if cidr_slash is not NULL, it shall be ignored when
    top-half of the token is hostname, not numeric address.
    OK, I'll fix that.
    Hopefully final patch, which addresses the above issues, adds some
    documentation enhancements, and the possibility to quote host names (in
    case someone wants to have a host named "samehost").
  • Tom Lane at Oct 12, 2010 at 8:33 pm

    Peter Eisentraut writes:
    Hopefully final patch, which addresses the above issues, adds some
    documentation enhancements, and the possibility to quote host names (in
    case someone wants to have a host named "samehost").
    A few minor gripes:
    + If a host name is specified (anything that is not an IP address
    + or a special key word is processed as a potential host name),
    + that name is compared with the result of a reverse name
    + resolution of the client's IP address (e.g., reverse DNS
    + lookup, if DNS is used). If there is a match, then a forward
    + name resolution (e.g., forward DNS lookup) is performed on the
    + host name to check if it resolves to an IP address that is
    + equal to the client's IP address. If both directions match,
    + then the entry is considered to match.
    I think the reason you're getting repeated questions about why the
    reverse DNS lookup is needed is that the documentation fails to explain
    that. It'd be helpful if this part of the docs pointed out why the
    apparently-extra lookup is necessary. If I understand correctly, the
    point is that we do the reverse lookup only once per connection (when
    first finding a name-based pg_hba.conf entry) and that saves us having
    to do forward lookup on all the name-based entries that don't match.

    Which means BTW that the technique loses if the first name-based entry
    matches the connection, and only wins when the match comes at the third
    or later name-based entry. Are we really sure this is a good tradeoff?
    Are we sure there shouldn't be a way to turn it off? I'm a bit
    concerned about the fact that the argument seems to be "better
    performance" versus "makes it completely useless for some use cases",
    as here:
    http://archives.postgresql.org/pgsql-hackers/2010-08/msg00726.php

    Another smaller point is that it might be helpful if the wording didn't
    make it sound like we expect the forward DNS lookup to produce just
    one IP address. Perhaps "If there is a match, then a forward name
    resolution (e.g., forward DNS lookup) is performed on the host name to
    check if any of the addresses it resolves to are equal to the client's
    IP address."

    In check_hostname():
    + port->remote_hostname = strdup(remote_hostname);
    Seems like this had better be pstrdup(), or at least have an error
    check. Dumping core at the next line is not a real acceptable
    substitute for an "out of memory" error.
    + /* Lookup IP from host name and check against original IP */
    + ret = getaddrinfo(port->remote_hostname, NULL, NULL, &gai_result);
    + if (ret != 0)
    + ereport(ERROR,
    + (errcode(ERRCODE_CONFIG_FILE_ERROR),
    + errmsg("getaddrinfo failed on \"%s\": %s",
    + port->remote_hostname, gai_strerror(ret))));
    Is ERRCODE_CONFIG_FILE_ERROR really the most appropriate code here?

    regards, tom lane
  • Tom Lane at Oct 12, 2010 at 9:03 pm

    Peter Eisentraut writes:
    Hopefully final patch, which addresses the above issues, adds some
    documentation enhancements, and the possibility to quote host names (in
    case someone wants to have a host named "samehost").
    Oh, I had an idea for a small improvement to this. It doesn't seem
    unlikely that pg_hba.conf could contain multiple entries with the same
    host name (but, presumably, different user and/or database names). As
    this is coded, you'll do a forward DNS lookup for each one until finding
    the complete match. You could easily prevent that by adding an
    additional cache field to struct Port, along the lines of
    +1 = remote_hostname is known to resolve to client's IP address
    -1 = remote_hostname is known NOT to resolve to client's IP address
    0 = we have not done the forward DNS lookup yet.
    With this additional field we could guarantee to do not more than two
    DNS lookups per connection attempt.

    It also seems worth taking a second look at the order of tests in
    check_hba(). I suspect that on average check_db() and check_role()
    will now be much cheaper than the client IP test; should they be
    done first? Of course, if you assume that "all" is the typical
    entry in those columns, this doesn't win.

    regards, tom lane
  • Peter Eisentraut at Oct 14, 2010 at 5:07 am

    On tis, 2010-10-12 at 17:03 -0400, Tom Lane wrote:
    Oh, I had an idea for a small improvement to this. It doesn't seem
    unlikely that pg_hba.conf could contain multiple entries with the same
    host name (but, presumably, different user and/or database names). As
    this is coded, you'll do a forward DNS lookup for each one until
    finding
    the complete match. You could easily prevent that by adding an
    additional cache field to struct Port, along the lines of
    +1 = remote_hostname is known to resolve to client's IP
    address
    -1 = remote_hostname is known NOT to resolve to client's IP
    address
    0 = we have not done the forward DNS lookup yet.
    With this additional field we could guarantee to do not more than two
    DNS lookups per connection attempt.
    That's a very good idea. I will revise my patch.
  • Peter Eisentraut at Oct 15, 2010 at 8:03 pm
    I have addressed these issues in my final patch. One other change I did
    was change the host name comparison to case insensitive (Apache and TCP
    Wrappers do the same).

    I'm planning to work on a wildcard mechanism soon. After that the issue
    of which way the host resolution should work will have an additional
    argument, so I'm not going to address that any further right now.

    On tis, 2010-10-12 at 16:33 -0400, Tom Lane wrote:
    Peter Eisentraut <peter_e@gmx.net> writes:
    Hopefully final patch, which addresses the above issues, adds some
    documentation enhancements, and the possibility to quote host names (in
    case someone wants to have a host named "samehost").
    A few minor gripes:
    + If a host name is specified (anything that is not an IP address
    + or a special key word is processed as a potential host name),
    + that name is compared with the result of a reverse name
    + resolution of the client's IP address (e.g., reverse DNS
    + lookup, if DNS is used). If there is a match, then a forward
    + name resolution (e.g., forward DNS lookup) is performed on the
    + host name to check if it resolves to an IP address that is
    + equal to the client's IP address. If both directions match,
    + then the entry is considered to match.
    I think the reason you're getting repeated questions about why the
    reverse DNS lookup is needed is that the documentation fails to explain
    that. It'd be helpful if this part of the docs pointed out why the
    apparently-extra lookup is necessary. If I understand correctly, the
    point is that we do the reverse lookup only once per connection (when
    first finding a name-based pg_hba.conf entry) and that saves us having
    to do forward lookup on all the name-based entries that don't match.

    Which means BTW that the technique loses if the first name-based entry
    matches the connection, and only wins when the match comes at the third
    or later name-based entry. Are we really sure this is a good tradeoff?
    Are we sure there shouldn't be a way to turn it off? I'm a bit
    concerned about the fact that the argument seems to be "better
    performance" versus "makes it completely useless for some use cases",
    as here:
    http://archives.postgresql.org/pgsql-hackers/2010-08/msg00726.php

    Another smaller point is that it might be helpful if the wording didn't
    make it sound like we expect the forward DNS lookup to produce just
    one IP address. Perhaps "If there is a match, then a forward name
    resolution (e.g., forward DNS lookup) is performed on the host name to
    check if any of the addresses it resolves to are equal to the client's
    IP address."

    In check_hostname():
    + port->remote_hostname = strdup(remote_hostname);
    Seems like this had better be pstrdup(), or at least have an error
    check. Dumping core at the next line is not a real acceptable
    substitute for an "out of memory" error.
    + /* Lookup IP from host name and check against original IP */
    + ret = getaddrinfo(port->remote_hostname, NULL, NULL, &gai_result);
    + if (ret != 0)
    + ereport(ERROR,
    + (errcode(ERRCODE_CONFIG_FILE_ERROR),
    + errmsg("getaddrinfo failed on \"%s\": %s",
    + port->remote_hostname, gai_strerror(ret))));
    Is ERRCODE_CONFIG_FILE_ERROR really the most appropriate code here?

    regards, tom lane
  • Joachim Wieland at Oct 6, 2010 at 2:36 am

    On Tue, Oct 5, 2010 at 3:41 PM, Peter Eisentraut wrote:

    The reason why I think this is semi-important and not just cosmetic is
    that (for some reason that is not entirely clear to me) clients
    connecting to localhost end up appearing to the server as ::1 on a lot
    of machines I use which are not otherwise keen on IPv6, and it is a
    common mistake to forget to keep the pg_hba.conf entries for 127.0.0.1
    and ::1 in sync.
    This is exactly what I am seeing here. However contrary to your case the
    patch makes it even worse on my side. With the patch compiled in and a
    pg_hba.conf entry of "localhost", I cannot connect anymore to "-h
    localhost", I get "no pg_hba.conf entry for host ::1".

    This is mostly standard Ubuntu setup.


    Joachim

Related Discussions

People

Translate

site design / logo © 2022 Grokbase