Thread: host name support in pg_hba.conf

host name support in pg_hba.conf

From
Peter Eisentraut
Date:
Here is a patch for host name support in pg_hba.conf.  I have reviewed
various past threads about this, and there appeared to have been a 50/50
split of for and against reverse lookup.  I went with the reverse
lookup, because

0) I like it.

1) It is more secure.

2) It allows extending it to wildcards in the future.

3) Apache (Allow from) does it that way.

To clarify how it works:  The client's IP address (known from the
kernel) is reverse looked up, which results in a host name.  That host
name is compared with the line in pg_hba.conf.  If it matches, a forward
lookup is performed on the host name to check if any of the resulting IP
addresses match the client's IP address.  If yes, the line is considered
to match and the authentication method is selected.

Anyway, assuming we will go with this, you will also notice that in the
patch I changed the default pg_hba.conf to match against "localhost"
instead of numeric addresses.  Initially thought of as a temporary
change for testing this patch, I think this might actually have some
permanent value because it saves you from having to change the IPv4 and
IPv6 lines in tandem most of the times, which is a moderately common
mistake.  We already rely on localhost being (forward) resolvable for
the stats collector.

Something to think about: Maybe we need a quoting mechanism in case
someone names their hosts "samenet".

Attachment

Re: host name support in pg_hba.conf

From
"Kevin Grittner"
Date:
Peter Eisentraut <peter_e@gmx.net> wrote:
> The client's IP address (known from the kernel)
Some machines have several IP addresses; how is that handled?
> is reverse looked up, which results in a host name.
Some IP addresses have several host names, including in reverse
lookup; how is that handled?
-Kevin


Re: host name support in pg_hba.conf

From
Thom Brown
Date:
On 9 August 2010 19:47, Peter Eisentraut <peter_e@gmx.net> wrote:
> Here is a patch for host name support in pg_hba.conf.  I have reviewed
> various past threads about this, and there appeared to have been a 50/50
> split of for and against reverse lookup.  I went with the reverse
> lookup, because
>
> 0) I like it.
>
> 1) It is more secure.
>
> 2) It allows extending it to wildcards in the future.
>
> 3) Apache (Allow from) does it that way.
>
> To clarify how it works:  The client's IP address (known from the
> kernel) is reverse looked up, which results in a host name.  That host
> name is compared with the line in pg_hba.conf.  If it matches, a forward
> lookup is performed on the host name to check if any of the resulting IP
> addresses match the client's IP address.  If yes, the line is considered
> to match and the authentication method is selected.
>
> Anyway, assuming we will go with this, you will also notice that in the
> patch I changed the default pg_hba.conf to match against "localhost"
> instead of numeric addresses.  Initially thought of as a temporary
> change for testing this patch, I think this might actually have some
> permanent value because it saves you from having to change the IPv4 and
> IPv6 lines in tandem most of the times, which is a moderately common
> mistake.  We already rely on localhost being (forward) resolvable for
> the stats collector.
>
> Something to think about: Maybe we need a quoting mechanism in case
> someone names their hosts "samenet".
>
>
> --

A couple things:

+       matches.  This field can contain either a host name, an IP
+       address range, one of the special key words mentioned below.
+      </para>

s/, one/, or one/

+       If a host name is specified (anything that is not an IP address
+       or a special key word is processed as a potential host name), a
+       reverse DNS lookup is performed on the client's IP address,
+       then a forward DNS lookup on the resulting name to check if it
+       matches the original IP address (that is, at least one of the
+       potentially many IP addresses matches the original one), and
+       the name found in the reverse lookup is compared with the
+       specified host name.

That's one loooong sentence!

--
Thom Brown
Registered Linux user: #516935


Re: host name support in pg_hba.conf

From
Robert Haas
Date:
On Mon, Aug 9, 2010 at 2:47 PM, Peter Eisentraut <peter_e@gmx.net> wrote:
> Here is a patch for host name support in pg_hba.conf.  I have reviewed
> various past threads about this, and there appeared to have been a 50/50
> split of for and against reverse lookup.  I went with the reverse
> lookup, because
>
> 0) I like it.
>
> 1) It is more secure.
>
> 2) It allows extending it to wildcards in the future.
>
> 3) Apache (Allow from) does it that way.
>
> To clarify how it works:  The client's IP address (known from the
> kernel) is reverse looked up, which results in a host name.  That host
> name is compared with the line in pg_hba.conf.  If it matches, a forward
> lookup is performed on the host name to check if any of the resulting IP
> addresses match the client's IP address.  If yes, the line is considered
> to match and the authentication method is selected.

Seems reasonable.

> Anyway, assuming we will go with this, you will also notice that in the
> patch I changed the default pg_hba.conf to match against "localhost"
> instead of numeric addresses.  Initially thought of as a temporary
> change for testing this patch, I think this might actually have some
> permanent value because it saves you from having to change the IPv4 and
> IPv6 lines in tandem most of the times, which is a moderately common
> mistake.  We already rely on localhost being (forward) resolvable for
> the stats collector.

-1 from me, on this part.  I think we should be trying to eliminate
any dependencies we have on how localhost resolves, and we certainly
should not add more.  I have spent way too much time fighting with
problems caused by localhost being present/absent in /etc/hosts; and
more caused by whether it maps to 127.0.0.1 or some IP assigned to one
of the boxes' Ethernet cards.  What's really special is when you have
two parts of the system, one of which will only work with one version
of /etc/hosts and the other of which will only work with a different
version.  I humbly, but fervently, suggest that we try to avoid being
polemical about this.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


Re: host name support in pg_hba.conf

From
Tom Lane
Date:
Peter Eisentraut <peter_e@gmx.net> writes:
> Here is a patch for host name support in pg_hba.conf.

My recollection is that the previous discussions got stuck on the cost
of doing DNS lookups for every connect; and the obvious solution of
trying to cache the names was shot down on the basis of not knowing when
to flush the cache.  Have you decided that people who want this feature
will just have to pay that cost?  If so, I think the documentation needs
to be a bit more explicit about names being more expensive than IP
addresses in pg_hba.conf.
        regards, tom lane


Re: host name support in pg_hba.conf

From
Stephen Frost
Date:
* Kevin Grittner (Kevin.Grittner@wicourts.gov) wrote:
> > The client's IP address (known from the kernel)
>
> Some machines have several IP addresses; how is that handled?

Sounds like he already described it, or I read it wrong.  The fact that
some machines have several IP addresses hardly matters- whatever IP is
used to connect to PG is what gets the reverse DNS lookup.  That then
returns a host.  That host is then looked up, and as long as *one* of
the IPs associated with that host matches the IP of the connector, it's
good to go.

> > is reverse looked up, which results in a host name.
>
> Some IP addresses have several host names, including in reverse
> lookup; how is that handled?

Yeahhhh...  That's just busted, imnsho.  But then, that's probably
because it breaks Kerberos too. :)
Thanks,
    Stephen

Re: host name support in pg_hba.conf

From
Kristian Larsson
Date:
On Mon, Aug 09, 2010 at 03:25:23PM -0400, Tom Lane wrote:
> Peter Eisentraut <peter_e@gmx.net> writes:
> > Here is a patch for host name support in pg_hba.conf.
> 
> My recollection is that the previous discussions got stuck on the cost
> of doing DNS lookups for every connect; and the obvious solution of
> trying to cache the names was shot down on the basis of not knowing when
> to flush the cache.  Have you decided that people who want this feature
> will just have to pay that cost?  If so, I think the documentation needs
> to be a bit more explicit about names being more expensive than IP
> addresses in pg_hba.conf.

I am also a bit scared of the startup cost for each connection.
In which order are things evaluated? What if I only include IP
addresses in my pg_hba, will it still cost me a DNS lookup or
two? 

Kind regards,  Kristian. 

-- 
Kristian Larsson                                        KLL-RIPE
+46 704 264511                          kll@spritelink.net


Re: host name support in pg_hba.conf

From
Peter Eisentraut
Date:
On mån, 2010-08-09 at 15:05 -0400, Robert Haas wrote:
> -1 from me, on this part.  I think we should be trying to eliminate
> any dependencies we have on how localhost resolves, and we certainly
> should not add more.

Maybe this is something that distributors could add if they have more
knowledge about the "localhost" policies on their hosts.



Re: host name support in pg_hba.conf

From
Peter Eisentraut
Date:
On mån, 2010-08-09 at 13:56 -0500, Kevin Grittner wrote:
> Peter Eisentraut <peter_e@gmx.net> wrote:
>  
> > The client's IP address (known from the kernel)
>  
> Some machines have several IP addresses; how is that handled?

A connection comes in over exactly one address.
> > is reverse looked up, which results in a host name.
>  
> Some IP addresses have several host names, including in reverse
> lookup; how is that handled?

This is not possible, or at least the C library APIs don't expose it.
Compare the getnameinfo() and getaddrinfo() man pages, for example.



Re: host name support in pg_hba.conf

From
Peter Eisentraut
Date:
On mån, 2010-08-09 at 23:01 +0200, Kristian Larsson wrote:
> In which order are things evaluated? What if I only include IP
> addresses in my pg_hba, will it still cost me a DNS lookup or
> two?

No, if you don't use this feature, it won't cost you anything.



Re: host name support in pg_hba.conf

From
Peter Eisentraut
Date:
On mån, 2010-08-09 at 15:25 -0400, Tom Lane wrote:
> Peter Eisentraut <peter_e@gmx.net> writes:
> > Here is a patch for host name support in pg_hba.conf.
> 
> My recollection is that the previous discussions got stuck on the cost
> of doing DNS lookups for every connect; and the obvious solution of
> trying to cache the names was shot down on the basis of not knowing when
> to flush the cache.  Have you decided that people who want this feature
> will just have to pay that cost?

Either that or install nscd.

While the speed of establishing a connection is critical to some setups,
there are also plenty of setups where the connection setup speed is
pretty much unimportant because of connection pooling.

> If so, I think the documentation needs
> to be a bit more explicit about names being more expensive than IP
> addresses in pg_hba.conf.

Fair enough.




Re: host name support in pg_hba.conf

From
"Kevin Grittner"
Date:
Peter Eisentraut  wrote:
> On mån, 2010-08-09 at 13:56 -0500, Kevin Grittner wrote:
>> Peter Eisentraut  wrote:
>>> is reverse looked up, which results in a host name.
>>
>> Some IP addresses have several host names, including in reverse
>> lookup; how is that handled?
> 
> This is not possible,
http://en.wikipedia.org/wiki/Reverse_DNS_lookup#Multiple_pointer_records
> or at least the C library APIs don't expose it.
That may explain the prevalence of bugs in code dealing with it.
-Kevin


Re: host name support in pg_hba.conf

From
Peter Eisentraut
Date:
On tis, 2010-08-10 at 07:32 -0500, Kevin Grittner wrote:
> http://en.wikipedia.org/wiki/Reverse_DNS_lookup#Multiple_pointer_records

Yeah, you can configure all kinds of nonsense and sometimes even get
away with it, but the basic assumption throughout is that a system has
one host name and between 1 and many IP addresses.  We must make our
implementation robust again other setups, but we don't have to (or
rather cannot) support them.



Re: host name support in pg_hba.conf

From
Robert Haas
Date:
On Tue, Aug 10, 2010 at 10:05 AM, Peter Eisentraut <peter_e@gmx.net> wrote:
> On tis, 2010-08-10 at 07:32 -0500, Kevin Grittner wrote:
>> http://en.wikipedia.org/wiki/Reverse_DNS_lookup#Multiple_pointer_records
>
> Yeah, you can configure all kinds of nonsense and sometimes even get
> away with it, but the basic assumption throughout is that a system has
> one host name and between 1 and many IP addresses.

These days, I think it's more common the other way around: one IP
address, and many host names.

> We must make our
> implementation robust again other setups, but we don't have to (or
> rather cannot) support them.

"Cannot" is a good argument for not supporting just about anything.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


Re: host name support in pg_hba.conf

From
"Kevin Grittner"
Date:
Peter Eisentraut <peter_e@gmx.net> wrote:
> Yeah, you can configure all kinds of nonsense and sometimes even
> get away with it, but the basic assumption throughout is that a
> system has one host name and between 1 and many IP addresses.
It's hardly nonsense to have multiple names on a machine.  While we
usually avoid having multiple reverse lookup names, we have many
in-house web applications and we neither want users to access them
by IP address or have to worry about which web server is hosting
which applications at the moment.  So it's not unusual for one of
our web servers to have 10 or 15 DNS names for forward lookup.  If
one machine becomes overloaded, we can move an application, change
the DNS, and everyone's bookmark still works.  This is precisely the
sort of situation where using a hostname in pg_hba.conf would be
most useful.
> We must make our implementation robust again other setups, but we
> don't have to (or rather cannot) support them.
Without the logic to ensure that the hostname matches the reverse
lookup, this might be useful for us.  With that logic it is useless
for us.  I'm wondering how much you gain by having it in there.  Why
can't a forward lookup which matches the requesting IP be considered
sufficient?
-Kevin


Re: host name support in pg_hba.conf

From
Peter Eisentraut
Date:
On tis, 2010-08-10 at 10:11 -0400, Robert Haas wrote:
> These days, I think it's more common the other way around: one IP
> address, and many host names.

Yes, that setup is very common, but it's actually only an illusion that
DNS creates.  The actual machine still has only one host name and some
IP addresses, as far as the kernel is concerned.  As you are surely
aware, this situation creates all kinds of problems in practice.



Re: host name support in pg_hba.conf

From
Peter Eisentraut
Date:
On tis, 2010-08-10 at 09:18 -0500, Kevin Grittner wrote:
> Without the logic to ensure that the hostname matches the reverse
> lookup, this might be useful for us.  With that logic it is useless
> for us.  I'm wondering how much you gain by having it in there.  Why
> can't a forward lookup which matches the requesting IP be considered
> sufficient?

For one thing, because people might like to add wildcard support.  So I
might be able to say

host  all  all  appserver*.example.com  md5



Re: host name support in pg_hba.conf

From
Tom Lane
Date:
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
> Without the logic to ensure that the hostname matches the reverse
> lookup, this might be useful for us.  With that logic it is useless
> for us.  I'm wondering how much you gain by having it in there.  Why
> can't a forward lookup which matches the requesting IP be considered
> sufficient?

I was about to complain about that same thing.  ISTM the logic ought
to be that you do a forward DNS lookup on the name presented in
pg_hba.conf, and if any of the returned IP addresses match the
connection's remote IP address, then you have a match.  This business
with doing a reverse lookup is at least twice as expensive, far more
fragile, and it seems completely bogus from a security viewpoint.
Why should I trust the RDNS server for an attacker's IP address?
        regards, tom lane


Re: host name support in pg_hba.conf

From
Aidan Van Dyk
Date:
* Tom Lane <tgl@sss.pgh.pa.us> [100810 10:39]:

> I was about to complain about that same thing.  ISTM the logic ought
> to be that you do a forward DNS lookup on the name presented in
> pg_hba.conf, and if any of the returned IP addresses match the
> connection's remote IP address, then you have a match.  This business
> with doing a reverse lookup is at least twice as expensive, far more
> fragile, and it seems completely bogus from a security viewpoint.
> Why should I trust the RDNS server for an attacker's IP address?

Well, you don't trust the RDNS of the IP, you trust the normal lookup of
the hostname returned by the RDNS.  So if some other ip network is
trying to give hostnames that should be authorized, you see that
immediately when you resolve the "authorized" hostname and it doesn't
give you that IP.

The PTR query is a means to get the "hostname" to check against, so you
d'nt have to pre-cache all thos possible results of all the hostnames.
Pre-caching all the hostnames in pg_hba.conf is madness.  How long do
you cache them for?  or do send out 1000 queries every connection?   You
can't support wildcards, or anythign usefull...

AFAIK, every software I've used which allows hostnames as some
connection control all do PTR->A/AAAA lookups as Peter proposed.

a.


-- 
Aidan Van Dyk                                             Create like a god,
aidan@highrise.ca                                       command like a king,
http://www.highrise.ca/                                   work like a slave.

Re: host name support in pg_hba.conf

From
Peter Eisentraut
Date:
On tis, 2010-08-10 at 10:39 -0400, Tom Lane wrote:
> I was about to complain about that same thing.  ISTM the logic ought
> to be that you do a forward DNS lookup on the name presented in
> pg_hba.conf, and if any of the returned IP addresses match the
> connection's remote IP address, then you have a match.  This business
> with doing a reverse lookup is at least twice as expensive, far more
> fragile, and it seems completely bogus from a security viewpoint.

If you have hundreds on lines in pg_hba.conf, then you need to do
hundreds of DNS lookups per connection attempt (*), and each of those
lookups could result in even more IP addresses, or could time out.  So
if some unrelated part of the system breaks (DNS down), it could take
you hours to establish a connection.  On the other hand, with the
reverse DNS lookup, you would normally do about two DNS queries per
successful connection attempt, and those would only be in relation to
the machines actually involved in the connection.  Also, if you are in a
names-only environment, you might also like to turn on log_hostnames, in
which case the reverse lookup is free (well, shared).

(*) That could of course be addressed by your earlier idea of caching
the resolved names when pg_hba.conf is read, but I don't think many
people were on board with that idea.



Re: host name support in pg_hba.conf

From
Tom Lane
Date:
Peter Eisentraut <peter_e@gmx.net> writes:
> On tis, 2010-08-10 at 09:18 -0500, Kevin Grittner wrote:
>> Why can't a forward lookup which matches the requesting IP be considered
>> sufficient?

> For one thing, because people might like to add wildcard support.  So I
> might be able to say
> host  all  all  appserver*.example.com  md5

I don't think that the possibility that we might support that in future
can justify using a slow and not-too-reliable method for ordinary
non-wildcard names.
        regards, tom lane


Re: host name support in pg_hba.conf

From
Tom Lane
Date:
Peter Eisentraut <peter_e@gmx.net> writes:
> On tis, 2010-08-10 at 10:39 -0400, Tom Lane wrote:
>> I was about to complain about that same thing.  ISTM the logic ought
>> to be that you do a forward DNS lookup on the name presented in
>> pg_hba.conf, and if any of the returned IP addresses match the
>> connection's remote IP address, then you have a match.  This business
>> with doing a reverse lookup is at least twice as expensive, far more
>> fragile, and it seems completely bogus from a security viewpoint.

> If you have hundreds on lines in pg_hba.conf, then you need to do
> hundreds of DNS lookups per connection attempt (*), and each of those
> lookups could result in even more IP addresses, or could time out.

If you have a configuration that would actually require that, then you
would have a case for using a wildcard.  My complaint is that you're
trying to force everyone to pay for that feature whether it's of use to
them or not.  I think it's at least as likely that typical setups would
need exactly *one*, non wildcard, entry, to wit appserver.mycompany.com
(which'd necessarily yield IPs for all the machines running your app
server code).
        regards, tom lane


Re: host name support in pg_hba.conf

From
Stephen Frost
Date:
* Peter Eisentraut (peter_e@gmx.net) wrote:
> On mån, 2010-08-09 at 13:56 -0500, Kevin Grittner wrote:
> > Some IP addresses have several host names, including in reverse
> > lookup; how is that handled?
>
> This is not possible, or at least the C library APIs don't expose it.
> Compare the getnameinfo() and getaddrinfo() man pages, for example.

Don't know how it happens at a technical level, but I've definitely seen
it happen before..  Particularly with Windows domains where they don't
have "clean-up reverse DNS" enabled.  Manifests itself by having
different host names show up on successive requests...  Evil in any
case.
Stephen

Re: host name support in pg_hba.conf

From
Stephen Frost
Date:
* Aidan Van Dyk (aidan@highrise.ca) wrote:
> The PTR query is a means to get the "hostname" to check against, so you
> d'nt have to pre-cache all thos possible results of all the hostnames.
> Pre-caching all the hostnames in pg_hba.conf is madness.  How long do
> you cache them for?  or do send out 1000 queries every connection?   You
> can't support wildcards, or anythign usefull...
>
> AFAIK, every software I've used which allows hostnames as some
> connection control all do PTR->A/AAAA lookups as Peter proposed.

Completely agreed.  It's madness to precache all thse hostnames, but we
need to figure out the hostname, thus, rDNS is used.  The forward lookup
is then to double-check that it matches.  This is exactly how Kerberos
works also.  You certainly don't want to be repeatedly doing rDNS
lookups to see if maybe that IP has other hosts.  I also don't buy that
there's an issue with setting up your rDNS to go to what you put in the
pg_hba and then having the forward of that include the IP; again, it's
how Kerberos works, and even if you don't believe in Kerberos, I hope
you realize it's kind of popular.
Thanks,
    Stephen

Re: host name support in pg_hba.conf

From
Stephen Frost
Date:
* Kevin Grittner (Kevin.Grittner@wicourts.gov) wrote:
> It's hardly nonsense to have multiple names on a machine.  While we
> usually avoid having multiple reverse lookup names, we have many
> in-house web applications and we neither want users to access them
> by IP address or have to worry about which web server is hosting
> which applications at the moment.  So it's not unusual for one of
> our web servers to have 10 or 15 DNS names for forward lookup.  If
> one machine becomes overloaded, we can move an application, change
> the DNS, and everyone's bookmark still works.  This is precisely the
> sort of situation where using a hostname in pg_hba.conf would be
> most useful.

We're talking about client systems connecting to PG here.  Are you
authenticating your web users by looking at their client addresses..?
That strikes me as pretty unlikely.  Your web servers will be connecting
to your PG server from *one* address (whatever the main one is for that
pariticular server), and that address just needs to have an rDNS entry
that goes to a host whose forward DNS includes that IP.

If you have multiple web servers that are connecting to the same PG
database, then have multiple pg_hba entries, or make them all have the
same hostname per reverse DNS (though I don't really see why you'd want
to).

> > We must make our implementation robust again other setups, but we
> > don't have to (or rather cannot) support them.
>
> Without the logic to ensure that the hostname matches the reverse
> lookup, this might be useful for us.  With that logic it is useless
> for us.  I'm wondering how much you gain by having it in there.  Why
> can't a forward lookup which matches the requesting IP be considered
> sufficient?

Because "you can't get there from here".  You'd either have to cache all
the entries in pg_hba (which is horrible), or do a look-up on each one
on every connection (which is going to be a hell of alot slower than
doing one more DNS lookup here).  This isn't magic.  What we have is a
bunch of host names and a single IP (the connecting one).  Figuring out
which one goes with which is the issue.
Stephen

Re: host name support in pg_hba.conf

From
Steve Atkins
Date:
On Aug 10, 2010, at 8:23 AM, Stephen Frost wrote:

> * Peter Eisentraut (peter_e@gmx.net) wrote:
>> On mån, 2010-08-09 at 13:56 -0500, Kevin Grittner wrote:
>>> Some IP addresses have several host names, including in reverse
>>> lookup; how is that handled?
>>
>> This is not possible, or at least the C library APIs don't expose it.
>> Compare the getnameinfo() and getaddrinfo() man pages, for example.
>
> Don't know how it happens at a technical level, but I've definitely seen
> it happen before..  Particularly with Windows domains where they don't
> have "clean-up reverse DNS" enabled.  Manifests itself by having
> different host names show up on successive requests...  Evil in any
> case.


Multiple hostnames for a given IP address are supported just fine
by the DNS. Some C library APIs support this just fine, others
(such as getnameinfo) have been simplified to make them more
pleasant to use for the common case of displaying a text representation
of an IP address in a friendly manner with simple code, at the expense
of actually returning correct data.

So getnameinfo() is not suitable for this particular usage. If an
IP address has multiple hostnames then what getnameinfo() will
return isn't well-defined (and I believe there's been a trickle of
bugs in implementations such that sometimes they won't return
any hostname if there are multiple ones configured in the DNS).

Any approach to restrict based on hostnames will either need to
just work with forward DNS resolution of hostnames configured
in pg_hba.conf to create a list of IP addresses to compare against
an incoming connection, or it'll need to use a more general
interface to get the reverse DNS of an incoming connection (e.g.
gethostbyaddr(), less elegant as that is) before checking forward
DNS.

The former approach won't work if we want to support wildcard
hostnames ("accept connections from *.example.com") - and
that's the only useful functionality that adding hostname based
ACLs provides, I think. If we want to do that, we need to use
gethostbyaddr() to get all the claimed hostnames via reverse
DNS, and for each of those that matches our ACL do a
getaddrinfo() to check it resolves to the connecting IP.

This is something that's pretty common to do in the email world,
so stealing some robust code from there might be an idea.

Cheers, Steve



Re: host name support in pg_hba.conf

From
KaiGai Kohei
Date:
It seems to me this patch has been left for a long time, although it tries
to provide a useful functionality.

In the previous discussion, several issues were pointed out.

* Case handling when an IP-address has multiple hostnames.

Steve Atkins noticed getaddrinfo() does not have capability to handle
this case, because it returns just one hostname from the candidates.
So, he suggested to use gethostbyaddr() instead, because it can return
all the candidates of hostnames.

I also think gethostbyaddr() is more prefereble here rather than
getaddrinfo() with the primary hostname as a fallback.

How about the following idea?

1. Add a list of hostname on Port structure to store all the candidates  of hostnames for the current remote host.
2. Revise the pg_getnameinfo_all() to call the gethostbyaddr(), then  it constructs the list from the returned
hostent->h_aliases.
3. Revise Port->remote_host to indicate contents of the list head.
4. Revise the new check_hostname() to compare the supplied hostname  from pg_hba.conf and each hostnames in the above
list.

If pg_getnameinfo_all() would have all the candidates of hostnames,
we don't need to lookup dns twice on fallbacks.


* "localhost" in the default pg_hba.conf

Robert Haas disagreed to switch "localhost" in the default pg_hba.conf
into numeric expression, because /etc/host should determine where the
localhost shall be mapped.

I agree with the opinion from Robert. We should not have any assumptions
of /ets/hosts settings.


* Cost for reverse lookup of DNS

Tom Lane concerned about the cost to communicate with DNS for each
connections. You answered either caching mechanism in PG or nscd is
a solution, and agreed to add an explicit description about names
being more expensive than IP addresses in pg_hba.conf.

I also think it is fair enough. How about adding a recommendation
to install nscd when people uses hostname in pg_hba.conf.

Thanks,

(2010/08/10 3:47), Peter Eisentraut wrote:
> Here is a patch for host name support in pg_hba.conf.  I have reviewed
> various past threads about this, and there appeared to have been a 50/50
> split of for and against reverse lookup.  I went with the reverse
> lookup, because
> 
> 0) I like it.
> 
> 1) It is more secure.
> 
> 2) It allows extending it to wildcards in the future.
> 
> 3) Apache (Allow from) does it that way.
> 
> To clarify how it works:  The client's IP address (known from the
> kernel) is reverse looked up, which results in a host name.  That host
> name is compared with the line in pg_hba.conf.  If it matches, a forward
> lookup is performed on the host name to check if any of the resulting IP
> addresses match the client's IP address.  If yes, the line is considered
> to match and the authentication method is selected.
> 
> Anyway, assuming we will go with this, you will also notice that in the
> patch I changed the default pg_hba.conf to match against "localhost"
> instead of numeric addresses.  Initially thought of as a temporary
> change for testing this patch, I think this might actually have some
> permanent value because it saves you from having to change the IPv4 and
> IPv6 lines in tandem most of the times, which is a moderately common
> mistake.  We already rely on localhost being (forward) resolvable for
> the stats collector.
> 
> Something to think about: Maybe we need a quoting mechanism in case
> someone names their hosts "samenet".
> 
> 
> 
> 


-- 
KaiGai Kohei <kaigai@ak.jp.nec.com>


Re: host name support in pg_hba.conf

From
Peter Eisentraut
Date:
On tis, 2010-10-05 at 12:35 +0900, KaiGai Kohei wrote:
> It seems to me this patch has been left for a long time, although it tries
> to provide a useful functionality.
> 
> In the previous discussion, several issues were pointed out.
> 
> * Case handling when an IP-address has multiple hostnames.
> 
> Steve Atkins noticed getaddrinfo() does not have capability to handle
> this case, because it returns just one hostname from the candidates.
> So, he suggested to use gethostbyaddr() instead, because it can return
> all the candidates of hostnames.
> 
> I also think gethostbyaddr() is more prefereble here rather than
> getaddrinfo() with the primary hostname as a fallback.

I have several concerns about this approach.

The first is conceptual: Assume that an IP address resolves as host
names "foo" and "bar", perhaps even by accident.  You have log_hostnames
on, which would be typical if you have an all-names pg_hba.conf.
log_hostnames only logs the first host name, say, "foo".  And then
assume that pg_hba.conf is set up to only allow "bar" in.  There you
have a debugging and auditing nightmare, because what pg_hba.conf says
and what you are logging doesn't match.  To address this you would have
to change the log_hostnames facility to log *all* host names everywhere
the host name is mentioned, which could make this whole thing quite
silly.

Secondly, long-standing and presumably reputable implementations of a
similar feature, namely Apache's mod_authz_host and tcp-wrappers use
getnameinfo() in preference of gethostbyaddr(), and don't support
multiple host names per IP address.  In fact, reading through that
source code indicates that gethostbyaddr() has  all kinds of bugs and
issues, including apparently lack of IPv6 support (on some platforms?).

Thirdly, gethostbyname() and gethostbyaddr() are deprecated by POSIX in
favor of getaddrinfo() and getnameinfo(), so we shouldn't build new
features that depend on them.

> * "localhost" in the default pg_hba.conf
> 
> Robert Haas disagreed to switch "localhost" in the default pg_hba.conf
> into numeric expression, because /etc/host should determine where the
> localhost shall be mapped.
> 
> I agree with the opinion from Robert. We should not have any assumptions
> of /ets/hosts settings.

Note that we already default listen_addresses to 'localhost', so it
would only make sense to have pg_hba.conf behave the same by default.
To pick up on your argument, we effectively *do* make assumptions
about /etc/hosts now, and this change would remove them.

Note that this would only be a default.  But if you think that it's the
wrong default, then we should consider changing the listen_addresses
default as well.

The reason why I think this is semi-important and not just cosmetic is
that (for some reason that is not entirely clear to me) clients
connecting to localhost end up appearing to the server as ::1 on a lot
of machines I use which are not otherwise keen on IPv6, and it is a
common mistake to forget to keep the pg_hba.conf entries for 127.0.0.1
and ::1 in sync.




Re: host name support in pg_hba.conf

From
KaiGai Kohei
Date:
(2010/10/06 4:41), Peter Eisentraut wrote:
> On tis, 2010-10-05 at 12:35 +0900, KaiGai Kohei wrote:
>> It seems to me this patch has been left for a long time, although it tries
>> to provide a useful functionality.
>>
>> In the previous discussion, several issues were pointed out.
>>
>> * Case handling when an IP-address has multiple hostnames.
>>
>> Steve Atkins noticed getaddrinfo() does not have capability to handle
>> this case, because it returns just one hostname from the candidates.
>> So, he suggested to use gethostbyaddr() instead, because it can return
>> all the candidates of hostnames.
>>
>> I also think gethostbyaddr() is more prefereble here rather than
>> getaddrinfo() with the primary hostname as a fallback.
> 
> I have several concerns about this approach.
> 
> The first is conceptual: Assume that an IP address resolves as host
> names "foo" and "bar", perhaps even by accident.  You have log_hostnames
> on, which would be typical if you have an all-names pg_hba.conf.
> log_hostnames only logs the first host name, say, "foo".  And then
> assume that pg_hba.conf is set up to only allow "bar" in.  There you
> have a debugging and auditing nightmare, because what pg_hba.conf says
> and what you are logging doesn't match.  To address this you would have
> to change the log_hostnames facility to log *all* host names everywhere
> the host name is mentioned, which could make this whole thing quite
> silly.
> 
> Secondly, long-standing and presumably reputable implementations of a
> similar feature, namely Apache's mod_authz_host and tcp-wrappers use
> getnameinfo() in preference of gethostbyaddr(), and don't support
> multiple host names per IP address.  In fact, reading through that
> source code indicates that gethostbyaddr() has  all kinds of bugs and
> issues, including apparently lack of IPv6 support (on some platforms?).
> 
> Thirdly, gethostbyname() and gethostbyaddr() are deprecated by POSIX in
> favor of getaddrinfo() and getnameinfo(), so we shouldn't build new
> features that depend on them.
> 
OK, it seems to me fair enough. I consented with your explanation.

I'll check the patch for more details, please wait for a few days.

>> * "localhost" in the default pg_hba.conf
>>
>> Robert Haas disagreed to switch "localhost" in the default pg_hba.conf
>> into numeric expression, because /etc/host should determine where the
>> localhost shall be mapped.
>>
>> I agree with the opinion from Robert. We should not have any assumptions
>> of /ets/hosts settings.
> 
> Note that we already default listen_addresses to 'localhost', so it
> would only make sense to have pg_hba.conf behave the same by default.
> To pick up on your argument, we effectively *do* make assumptions
> about /etc/hosts now, and this change would remove them.
> 
Sorry, I misread something.
I read the previous discussions again, then I know I misread the reason
why Robert disagreed with this replacement. He said we should not assume
resolve of localhost being enough fast because of local /etc/hosts, not
saying we should not assume localhost is "127.0.0.1" or "::1". Right?

Well, in my personal opinion, we should not assume the way to resolve
localhost, but we can expect more than 99.9% of hosts resolve localhost
using local /etc/hosts. Even if here is a curious setting, it will pay
a bit more cost on connection. It is a responsibility of DBA.

I agree with replacement "127.0.0.1" and "::1" by "localhost".
It enables to eliminate an assumption that localhost have either of
their addresses.

Thanks,
-- 
KaiGai Kohei <kaigai@ak.jp.nec.com>


Re: host name support in pg_hba.conf

From
Tom Lane
Date:
KaiGai Kohei <kaigai@ak.jp.nec.com> writes:
>> Note that we already default listen_addresses to 'localhost', so it
>> would only make sense to have pg_hba.conf behave the same by default.
>> To pick up on your argument, we effectively *do* make assumptions
>> about /etc/hosts now, and this change would remove them.
>> 
> Sorry, I misread something.
> I read the previous discussions again, then I know I misread the reason
> why Robert disagreed with this replacement. He said we should not assume
> resolve of localhost being enough fast because of local /etc/hosts, not
> saying we should not assume localhost is "127.0.0.1" or "::1". Right?

> Well, in my personal opinion, we should not assume the way to resolve
> localhost, but we can expect more than 99.9% of hosts resolve localhost
> using local /etc/hosts. Even if here is a curious setting, it will pay
> a bit more cost on connection. It is a responsibility of DBA.

> I agree with replacement "127.0.0.1" and "::1" by "localhost".
> It enables to eliminate an assumption that localhost have either of
> their addresses.

This argument is completely unfounded in reality.  Please go read the
relevant RFCs.  127.0.0.1 is standardized as the IPv4 loopback address
(see RFC3330), and ::1 is standardized as the IPv6 loopback address (see
RFC1884 section 2.4.3).  So far as I can find, there is *no* standard
mandating that localhost means the loopback address.  RFC1537 suggests
that DNS domains "should" resolve localhost.anything as 127.0.0.1; but
that is a lot weaker than the other specifications, and there's nothing
whatever to promise that it will work in a DNS-less environment.

In fact, we have seen cases where it didn't work even with publicly
available DNS servers, eg
http://archives.postgresql.org/pgsql-admin/2010-05/msg00073.php

That example leads me to think that using localhost in the default
pg_hba.conf file would actually be a security hazard: you would be
placing it in the hands of your DNS provider as to which addresses
Postgres will believe are "local" connections.  That's an OK decision
for individual admins to make, but it's not a good idea for us to ship
it as a universal default.  (Note that relying on the DNS provider to
interpret listen_addresses is not nearly as dangerous, since in any case
the kernel isn't going to let us bind() to nonlocal addresses.)

On top of that, there's no way for your DNS server to know whether your
local kernel speaks IPv6 or not, so you might not get a resolution of
the name that includes the appropriate loopback addresses.  This may or
may not have anything to do with the reports we occasionally get of
people having to change listen_addresses to "*" to get things to work.

Between these issues and the speed concern, I don't think that we should
change this.
        regards, tom lane


Re: host name support in pg_hba.conf

From
Stephen Frost
Date:
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Between these issues and the speed concern, I don't think that we should
> change this.

+1.
Stephen

Re: host name support in pg_hba.conf

From
Joachim Wieland
Date:
On Tue, Oct 5, 2010 at 3:41 PM, Peter Eisentraut <peter_e@gmx.net> wrote:
The reason why I think this is semi-important and not just cosmetic is
that (for some reason that is not entirely clear to me) clients
connecting to localhost end up appearing to the server as ::1 on a lot
of machines I use which are not otherwise keen on IPv6, and it is a
common mistake to forget to keep the pg_hba.conf entries for 127.0.0.1
and ::1 in sync.

This is exactly what I am seeing here. However contrary to your case the patch makes it even worse on my side. With the patch compiled in and a pg_hba.conf entry of "localhost", I cannot connect anymore to "-h localhost", I get "no pg_hba.conf entry for host ::1".

This is mostly standard Ubuntu setup.


Joachim

Re: host name support in pg_hba.conf

From
Peter Eisentraut
Date:
On tis, 2010-10-05 at 22:17 -0400, Tom Lane wrote:
> So far as I can find, there is *no* standard
> mandating that localhost means the loopback address.

Should we then change pgstat.c to use IP addresses instead of hardcoding
"localhost"?



Re: host name support in pg_hba.conf

From
Andrew Dunstan
Date:

On 10/06/2010 04:05 AM, Peter Eisentraut wrote:
> On tis, 2010-10-05 at 22:17 -0400, Tom Lane wrote:
>> So far as I can find, there is *no* standard
>> mandating that localhost means the loopback address.
> Should we then change pgstat.c to use IP addresses instead of hardcoding
> "localhost"?
>

I understood Tom to be saying we should not rely on "localhost" for 
authentication, not that we shouldn't use it at all.

cheers

andrew


Re: host name support in pg_hba.conf

From
Tom Lane
Date:
Peter Eisentraut <peter_e@gmx.net> writes:
> On tis, 2010-10-05 at 22:17 -0400, Tom Lane wrote:
>> So far as I can find, there is *no* standard
>> mandating that localhost means the loopback address.

> Should we then change pgstat.c to use IP addresses instead of hardcoding
> "localhost"?

Hm, perhaps so.
        regards, tom lane


Re: host name support in pg_hba.conf

From
Tom Lane
Date:
Andrew Dunstan <andrew@dunslane.net> writes:
> On 10/06/2010 04:05 AM, Peter Eisentraut wrote:
>> On tis, 2010-10-05 at 22:17 -0400, Tom Lane wrote:
>>> So far as I can find, there is *no* standard
>>> mandating that localhost means the loopback address.

>> Should we then change pgstat.c to use IP addresses instead of hardcoding
>> "localhost"?

> I understood Tom to be saying we should not rely on "localhost" for 
> authentication, not that we shouldn't use it at all.

I think it's all right to use it as the default value for
listen_addresses, because (1) it's an understandable default,
and (2) users can change the setting if it doesn't work.

However, the usage in pgstat.c is hard-wired, meaning that if you
have a configuration where "localhost" doesn't resolve correctly
for whatever reason, there's no simple recourse to get the stats
collector working.  So ISTM there is an argument for changing that.
        regards, tom lane


Re: host name support in pg_hba.conf

From
Magnus Hagander
Date:
On Wed, Oct 6, 2010 at 15:16, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>> On 10/06/2010 04:05 AM, Peter Eisentraut wrote:
>>> On tis, 2010-10-05 at 22:17 -0400, Tom Lane wrote:
>>>> So far as I can find, there is *no* standard
>>>> mandating that localhost means the loopback address.
>
>>> Should we then change pgstat.c to use IP addresses instead of hardcoding
>>> "localhost"?
>
>> I understood Tom to be saying we should not rely on "localhost" for
>> authentication, not that we shouldn't use it at all.
>
> I think it's all right to use it as the default value for
> listen_addresses, because (1) it's an understandable default,
> and (2) users can change the setting if it doesn't work.
>
> However, the usage in pgstat.c is hard-wired, meaning that if you
> have a configuration where "localhost" doesn't resolve correctly
> for whatever reason, there's no simple recourse to get the stats
> collector working.  So ISTM there is an argument for changing that.

Well, hardcoding it will break the (unusual) case when localhost isn't
127.0.0.1 / ::1. (You'd obviously have to have it try both ipv4 and
ipv6).

It's not common, but i've certainly come across a number of virtual
machines where localhost resolves (through /etc/hosts) to the machines
"real" IP rather than 127.0.01, because 127.0.0.1 simply doesn't
exist.

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: host name support in pg_hba.conf

From
Tom Lane
Date:
Magnus Hagander <magnus@hagander.net> writes:
> On Wed, Oct 6, 2010 at 15:16, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> However, the usage in pgstat.c is hard-wired, meaning that if you
>> have a configuration where "localhost" doesn't resolve correctly
>> for whatever reason, there's no simple recourse to get the stats
>> collector working. So ISTM there is an argument for changing that.

> Well, hardcoding it will break the (unusual) case when localhost isn't
> 127.0.0.1 / ::1. (You'd obviously have to have it try both ipv4 and
> ipv6).

You didn't read what I wrote before.  Those numeric addresses define the
loopback address, *not* "localhost".  When localhost fails to resolve
as those address(es), it's localhost that is wrong.  We have actually
seen this in the field with bogus DNS providers.

> It's not common, but i've certainly come across a number of virtual
> machines where localhost resolves (through /etc/hosts) to the machines
> "real" IP rather than 127.0.01, because 127.0.0.1 simply doesn't
> exist.

That appears to me to be a broken (non RFC compliant) VM setup.
However, maybe what this is telling us is we need to expose the setting?
Or perhaps better, try 127.0.0.1, ::1, localhost, in that order.
        regards, tom lane


Re: host name support in pg_hba.conf

From
Magnus Hagander
Date:
On Wed, Oct 6, 2010 at 15:34, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Magnus Hagander <magnus@hagander.net> writes:
>> On Wed, Oct 6, 2010 at 15:16, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> However, the usage in pgstat.c is hard-wired, meaning that if you
>>> have a configuration where "localhost" doesn't resolve correctly
>>> for whatever reason, there's no simple recourse to get the stats
>>> collector working. So ISTM there is an argument for changing that.
>
>> Well, hardcoding it will break the (unusual) case when localhost isn't
>> 127.0.0.1 / ::1. (You'd obviously have to have it try both ipv4 and
>> ipv6).
>
> You didn't read what I wrote before.  Those numeric addresses define the
> loopback address, *not* "localhost".  When localhost fails to resolve
> as those address(es), it's localhost that is wrong.  We have actually
> seen this in the field with bogus DNS providers.
>
>> It's not common, but i've certainly come across a number of virtual
>> machines where localhost resolves (through /etc/hosts) to the machines
>> "real" IP rather than 127.0.01, because 127.0.0.1 simply doesn't
>> exist.
>
> That appears to me to be a broken (non RFC compliant) VM setup.

Can't argue with that. But it exists.


> However, maybe what this is telling us is we need to expose the setting?
> Or perhaps better, try 127.0.0.1, ::1, localhost, in that order.

That was kind of my point, that yes, we probably need to do one of
those at least. Today it is "kind of exposed", because you can edit
/etc/hosts - you don't need to rely on DNS for it. I just don't want
to lose that ability.

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: host name support in pg_hba.conf

From
Stephen Frost
Date:
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> That appears to me to be a broken (non RFC compliant) VM setup.
> However, maybe what this is telling us is we need to expose the setting?
> Or perhaps better, try 127.0.0.1, ::1, localhost, in that order.

Yeah, I'd be happier if we exposed it, to be honest.  Either that, or
figure out a way to get rid of it entirely by using a different method,
but that's a much bigger issue.
Thanks,
    Stephen

Re: host name support in pg_hba.conf

From
Andrew Dunstan
Date:

On 10/06/2010 09:49 AM, Stephen Frost wrote:
> * Tom Lane (tgl@sss.pgh.pa.us) wrote:
>> That appears to me to be a broken (non RFC compliant) VM setup.
>> However, maybe what this is telling us is we need to expose the setting?
>> Or perhaps better, try 127.0.0.1, ::1, localhost, in that order.
> Yeah, I'd be happier if we exposed it, to be honest.  Either that, or
> figure out a way to get rid of it entirely by using a different method,
> but that's a much bigger issue.

Please don't expose it. It will a source of yet more confusion. People 
already get confused by the difference between listening addresses and 
pg_hba.conf addresses. It's one of the most frequent points of confusion 
seen on IRC.  Adding another address to configure will just compound the 
confusion badly. I much prefer Tom's last suggestion.

cheers

andrew


Re: host name support in pg_hba.conf

From
Greg Stark
Date:
On Wed, Oct 6, 2010 at 6:21 AM, Magnus Hagander <magnus@hagander.net> wrote:
> It's not common, but i've certainly come across a number of virtual
> machines where localhost resolves (through /etc/hosts) to the machines
> "real" IP rather than 127.0.01, because 127.0.0.1 simply doesn't
> exist.

It's perfectly fine for localhost to resolve to the machine's external
ip address. It would be weird for it to resolve to some other host's
ip address like the vm's host machine. But having 127.0.0.1 not exist
would be positively broken. All kinds of things wouldn't work. Are you
sure about that part?


-- 
greg


Re: host name support in pg_hba.conf

From
KaiGai Kohei
Date:
(2010/10/06 10:21), KaiGai Kohei wrote:
> I'll check the patch for more details, please wait for a few days.

I could find out some matters in this patch, independent from the
discussion of localhost. (About pg_hba.conf.sample, I'm sorry for
the missuggestion. Please fix up it according to Tom's comment.)

* The logic is still unclear for me.

The check_hostname() immediately returns with false, if the resolved
remote hostname is NOT matched with the hostname described in pg_hba.conf.

| +static bool
| +check_hostname(hbaPort *port, const char *hostname)
| +{
| +   struct addrinfo *gai_result, *gai;
| +   int         ret;
| +   bool        found;
| +
| +   /* Lookup remote host name if not already done */
| +   if (!port->remote_hostname)
| +   {
| +       char        remote_hostname[NI_MAXHOST];
| +
| +       if (pg_getnameinfo_all(&port->raddr.addr, port->raddr.salen,
| +                              remote_hostname, sizeof(remote_hostname),
| +                              NULL, 0,
| +                              0))
| +           return false;
| +
| +       port->remote_hostname = strdup(remote_hostname);
| +   }
| +
| +   if (strcmp(port->remote_hostname, hostname) != 0)
| +       return false;
| +
| +   /* Lookup IP from host name and check against original IP */

However, it seems to me you expected an opposite behavior.

If the resolved hostname is matched with the hostname described
in pg_hba.conf, we can consider this HbaLine to be a suitable
configuration without any fallbacks. Right?
It so, it should be as follows:
   if (strcmp(port->remote_hostname, hostname) == 0)       return true;

In addition, we should go the rest of fallback code on mismatch
cases only, don't we?

* Why getnameinfo() in the fallback loop?

At the fallback code when the hostname was matched (I believe this code
is intended to handle the case when hostname was NOT matched.) calls
getnameinfo() for each candidate of remote addresses.
But its result is not referenced by anybody. Is it really necessary?

| +   found = false;
| +   for (gai = gai_result; gai; gai = gai->ai_next)
| +   {
| +       char        hostinfo[NI_MAXHOST];
| +
| +       getnameinfo(gai->ai_addr, gai->ai_addrlen,
| +                   hostinfo, sizeof(hostinfo),
| +                   NULL, 0,
| +                   NI_NUMERICHOST);
| +
| +       if (gai->ai_addr->sa_family == port->raddr.addr.ss_family)
| +       {
| +           if (gai->ai_addr->sa_family == AF_INET)
| +           {
| +               if (ipv4eq((struct sockaddr_in *) gai->ai_addr,
| +                          (struct sockaddr_in *) &port->raddr.addr))
| +               {
| +                   found = true;
| +                   break;
| +               }
| +           }
| +           else if (gai->ai_addr->sa_family == AF_INET6)
| +           {
| +               if (ipv6eq((struct sockaddr_in6 *) gai->ai_addr,
| +                          (struct sockaddr_in6 *) &port->raddr.addr))
| +               {
| +                   found = true;
| +                   break;
| +               }
| +           }
| +       }
| +   }
| +
| +   if (gai_result)
| +       freeaddrinfo(gai_result);
| +
| +   return found;
| +}

* Slash ('/') after the hostname

At the parse_hba_line(), the parsed token which contains either
hostname or cidr address is sliced into two parts on the first '/'
character, if exist.
Then, even if cidr_slash is not NULL, it shall be ignored when
top-half of the token is hostname, not numeric address.

|         else
|         {
|             /* IP and netmask are specified */
|             parsedline->ip_cmp_method = ipCmpMask;
|
|             /* need a modifiable copy of token */
|             token = pstrdup(token);
|
|             /* Check if it has a CIDR suffix and if so isolate it */
|             cidr_slash = strchr(token, '/');
|             if (cidr_slash)
|                 *cidr_slash = '\0';                 :
|             ret = pg_getaddrinfo_all(token, NULL, &hints, &gai_result);
| -           if (ret || !gai_result)
| +           if (ret == 0 && gai_result)
| +               memcpy(&parsedline->addr, gai_result->ai_addr,
| +                      gai_result->ai_addrlen);
| +           else if (ret == EAI_NONAME)
| +               parsedline->hostname = token;
| +           else
|             {

It allows the following configuration works without any errors.
(In fact, it works for me.)
 # IPv4/6 local connections: host  all  all  kaigai.myhome.cx/today_is_sunny  trust

It seems to me, we should raise an error, if both of cidr_slash and
parsedline->hostname are not NULL.

Thanks,
-- 
KaiGai Kohei <kaigai@ak.jp.nec.com>


Re: host name support in pg_hba.conf

From
Peter Eisentraut
Date:
On tis, 2010-10-05 at 22:28 -0400, Joachim Wieland wrote:
> This is exactly what I am seeing here. However contrary to your case
> the
> patch makes it even worse on my side. With the patch compiled in and a
> pg_hba.conf entry of "localhost", I cannot connect anymore to "-h
> localhost", I get "no pg_hba.conf entry for host ::1".
> 
> This is mostly standard Ubuntu setup.

Could you post your /etc/hosts file?



Re: host name support in pg_hba.conf

From
Peter Eisentraut
Date:
On tor, 2010-10-07 at 12:45 +0900, KaiGai Kohei wrote:
> * The logic is still unclear for me.
> 
> The check_hostname() immediately returns with false, if the resolved
> remote hostname is NOT matched with the hostname described in pg_hba.conf.

> If the resolved hostname is matched with the hostname described
> in pg_hba.conf, we can consider this HbaLine to be a suitable
> configuration without any fallbacks. Right?
> It so, it should be as follows:
> 
>     if (strcmp(port->remote_hostname, hostname) == 0)
>         return true;
> 
> In addition, we should go the rest of fallback code on mismatch
> cases only, don't we?

The code below that is not a fallback, it is the second part of the
double DNS lookup that has been extensively discussed throughout this
thread.  The logic is

get hostname from client's IP address
strcmp hostname to pg_hba.conf
get IP address from hostname
if that IP address == client's IP address; then pg_hba.conf entry OK

> * Why getnameinfo() in the fallback loop?

I checked through my git history; this was actually a leftover from some
debugging code.  I'll remove it.

> * Slash ('/') after the hostname
> 
> At the parse_hba_line(), the parsed token which contains either
> hostname or cidr address is sliced into two parts on the first '/'
> character, if exist.
> Then, even if cidr_slash is not NULL, it shall be ignored when
> top-half of the token is hostname, not numeric address.

OK, I'll fix that.




Re: host name support in pg_hba.conf

From
KaiGai Kohei
Date:
(2010/10/12 3:34), Peter Eisentraut wrote:
> On tor, 2010-10-07 at 12:45 +0900, KaiGai Kohei wrote:
>> * The logic is still unclear for me.
>>
>> The check_hostname() immediately returns with false, if the resolved
>> remote hostname is NOT matched with the hostname described in pg_hba.conf.
> 
>> If the resolved hostname is matched with the hostname described
>> in pg_hba.conf, we can consider this HbaLine to be a suitable
>> configuration without any fallbacks. Right?
>> It so, it should be as follows:
>>
>>      if (strcmp(port->remote_hostname, hostname) == 0)
>>          return true;
>>
>> In addition, we should go the rest of fallback code on mismatch
>> cases only, don't we?
> 
> The code below that is not a fallback, it is the second part of the
> double DNS lookup that has been extensively discussed throughout this
> thread.  The logic is
> 
> get hostname from client's IP address
> strcmp hostname to pg_hba.conf
> get IP address from hostname
> if that IP address == client's IP address; then pg_hba.conf entry OK
> 
Sorry, I missed what you introduced at the head of this thread.
When an entry passes the checks, this routine always checks both of
directions for the supplied entries.

BTW, I have one other question.
Is it really necessary to check reverse dns entries?
If check_hostname() compares remote address of the client and all
the candidates retrieved using getaddrinfo() for the hostname in
pg_hba.conf line (not port->remote_hostname), it seems to me we don't
need to set up reverse dns entries.
A typical ISP often doesn't assign reverse dns entries for cheap class
rental server being shared by multiple users, for instance. It seem to
me this idea allows to apply this new feature more widely.
# Let's try nslookup to www.kaigai.gr.jp, and its retrieved address. :-)

Of course, it is just my idea. If dislike it, please ignore it.

>> * Why getnameinfo() in the fallback loop?
> 
> I checked through my git history; this was actually a leftover from some
> debugging code.  I'll remove it.
> 
>> * Slash ('/') after the hostname
>>
>> At the parse_hba_line(), the parsed token which contains either
>> hostname or cidr address is sliced into two parts on the first '/'
>> character, if exist.
>> Then, even if cidr_slash is not NULL, it shall be ignored when
>> top-half of the token is hostname, not numeric address.
> 
> OK, I'll fix that.
> 
> 
> 


-- 
KaiGai Kohei <kaigai@ak.jp.nec.com>


Re: host name support in pg_hba.conf

From
Peter Eisentraut
Date:
On mån, 2010-10-11 at 21:34 +0300, Peter Eisentraut wrote:
> > * Why getnameinfo() in the fallback loop?
>
> I checked through my git history; this was actually a leftover from
> some
> debugging code.  I'll remove it.
>
> > * Slash ('/') after the hostname
> >
> > At the parse_hba_line(), the parsed token which contains either
> > hostname or cidr address is sliced into two parts on the first '/'
> > character, if exist.
> > Then, even if cidr_slash is not NULL, it shall be ignored when
> > top-half of the token is hostname, not numeric address.
>
> OK, I'll fix that.

Hopefully final patch, which addresses the above issues, adds some
documentation enhancements, and the possibility to quote host names (in
case someone wants to have a host named "samehost").

Attachment

Re: host name support in pg_hba.conf

From
Peter Eisentraut
Date:
On tis, 2010-10-12 at 10:11 +0900, KaiGai Kohei wrote:
> Is it really necessary to check reverse dns entries?

This has been extensively discussed in this thread.



Re: host name support in pg_hba.conf

From
Tom Lane
Date:
Peter Eisentraut <peter_e@gmx.net> writes:
> Hopefully final patch, which addresses the above issues, adds some
> documentation enhancements, and the possibility to quote host names (in
> case someone wants to have a host named "samehost").

A few minor gripes:

> +       If a host name is specified (anything that is not an IP address
> +       or a special key word is processed as a potential host name),
> +       that name is compared with the result of a reverse name
> +       resolution of the client's IP address (e.g., reverse DNS
> +       lookup, if DNS is used).  If there is a match, then a forward
> +       name resolution (e.g., forward DNS lookup) is performed on the
> +       host name to check if it resolves to an IP address that is
> +       equal to the client's IP address.  If both directions match,
> +       then the entry is considered to match.

I think the reason you're getting repeated questions about why the
reverse DNS lookup is needed is that the documentation fails to explain
that.  It'd be helpful if this part of the docs pointed out why the
apparently-extra lookup is necessary.  If I understand correctly, the
point is that we do the reverse lookup only once per connection (when
first finding a name-based pg_hba.conf entry) and that saves us having
to do forward lookup on all the name-based entries that don't match.

Which means BTW that the technique loses if the first name-based entry
matches the connection, and only wins when the match comes at the third
or later name-based entry.  Are we really sure this is a good tradeoff?
Are we sure there shouldn't be a way to turn it off?  I'm a bit
concerned about the fact that the argument seems to be "better
performance" versus "makes it completely useless for some use cases",
as here:
http://archives.postgresql.org/pgsql-hackers/2010-08/msg00726.php

Another smaller point is that it might be helpful if the wording didn't
make it sound like we expect the forward DNS lookup to produce just
one IP address.  Perhaps "If there is a match, then a forward name
resolution (e.g., forward DNS lookup) is performed on the host name to
check if any of the addresses it resolves to are equal to the client's
IP address."

In check_hostname():

> +        port->remote_hostname = strdup(remote_hostname);

Seems like this had better be pstrdup(), or at least have an error
check.  Dumping core at the next line is not a real acceptable
substitute for an "out of memory" error.

> +    /* Lookup IP from host name and check against original IP */
> +    ret = getaddrinfo(port->remote_hostname, NULL, NULL, &gai_result);
> +    if (ret != 0)
> +        ereport(ERROR,
> +                (errcode(ERRCODE_CONFIG_FILE_ERROR),
> +                 errmsg("getaddrinfo failed on \"%s\": %s",
> +                        port->remote_hostname, gai_strerror(ret))));

Is ERRCODE_CONFIG_FILE_ERROR really the most appropriate code here?
        regards, tom lane


Re: host name support in pg_hba.conf

From
Tom Lane
Date:
Peter Eisentraut <peter_e@gmx.net> writes:
>> Hopefully final patch, which addresses the above issues, adds some
>> documentation enhancements, and the possibility to quote host names (in
>> case someone wants to have a host named "samehost").

Oh, I had an idea for a small improvement to this.  It doesn't seem
unlikely that pg_hba.conf could contain multiple entries with the same
host name (but, presumably, different user and/or database names).  As
this is coded, you'll do a forward DNS lookup for each one until finding
the complete match.  You could easily prevent that by adding an
additional cache field to struct Port, along the lines of+1 = remote_hostname is known to resolve to client's IP
address-1= remote_hostname is known NOT to resolve to client's IP address0 = we have not done the forward DNS lookup
yet.
With this additional field we could guarantee to do not more than two
DNS lookups per connection attempt.

It also seems worth taking a second look at the order of tests in
check_hba().  I suspect that on average check_db() and check_role()
will now be much cheaper than the client IP test; should they be
done first?  Of course, if you assume that "all" is the typical
entry in those columns, this doesn't win.
        regards, tom lane


Re: host name support in pg_hba.conf

From
Peter Eisentraut
Date:
On tis, 2010-10-12 at 17:03 -0400, Tom Lane wrote:
> Oh, I had an idea for a small improvement to this.  It doesn't seem
> unlikely that pg_hba.conf could contain multiple entries with the same
> host name (but, presumably, different user and/or database names).  As
> this is coded, you'll do a forward DNS lookup for each one until
> finding
> the complete match.  You could easily prevent that by adding an
> additional cache field to struct Port, along the lines of
>         +1 = remote_hostname is known to resolve to client's IP
> address
>         -1 = remote_hostname is known NOT to resolve to client's IP
> address
>         0 = we have not done the forward DNS lookup yet.
> With this additional field we could guarantee to do not more than two
> DNS lookups per connection attempt.

That's a very good idea.  I will revise my patch.



Re: host name support in pg_hba.conf

From
Peter Eisentraut
Date:
I have addressed these issues in my final patch.  One other change I did
was change the host name comparison to case insensitive (Apache and TCP
Wrappers do the same).

I'm planning to work on a wildcard mechanism soon.  After that the issue
of which way the host resolution should work will have an additional
argument, so I'm not going to address that any further right now.


On tis, 2010-10-12 at 16:33 -0400, Tom Lane wrote:
> Peter Eisentraut <peter_e@gmx.net> writes:
> > Hopefully final patch, which addresses the above issues, adds some
> > documentation enhancements, and the possibility to quote host names (in
> > case someone wants to have a host named "samehost").
> 
> A few minor gripes:
> 
> > +       If a host name is specified (anything that is not an IP address
> > +       or a special key word is processed as a potential host name),
> > +       that name is compared with the result of a reverse name
> > +       resolution of the client's IP address (e.g., reverse DNS
> > +       lookup, if DNS is used).  If there is a match, then a forward
> > +       name resolution (e.g., forward DNS lookup) is performed on the
> > +       host name to check if it resolves to an IP address that is
> > +       equal to the client's IP address.  If both directions match,
> > +       then the entry is considered to match.
> 
> I think the reason you're getting repeated questions about why the
> reverse DNS lookup is needed is that the documentation fails to explain
> that.  It'd be helpful if this part of the docs pointed out why the
> apparently-extra lookup is necessary.  If I understand correctly, the
> point is that we do the reverse lookup only once per connection (when
> first finding a name-based pg_hba.conf entry) and that saves us having
> to do forward lookup on all the name-based entries that don't match.
> 
> Which means BTW that the technique loses if the first name-based entry
> matches the connection, and only wins when the match comes at the third
> or later name-based entry.  Are we really sure this is a good tradeoff?
> Are we sure there shouldn't be a way to turn it off?  I'm a bit
> concerned about the fact that the argument seems to be "better
> performance" versus "makes it completely useless for some use cases",
> as here:
> http://archives.postgresql.org/pgsql-hackers/2010-08/msg00726.php
> 
> Another smaller point is that it might be helpful if the wording didn't
> make it sound like we expect the forward DNS lookup to produce just
> one IP address.  Perhaps "If there is a match, then a forward name
> resolution (e.g., forward DNS lookup) is performed on the host name to
> check if any of the addresses it resolves to are equal to the client's
> IP address."
> 
> In check_hostname():
> 
> > +        port->remote_hostname = strdup(remote_hostname);
> 
> Seems like this had better be pstrdup(), or at least have an error
> check.  Dumping core at the next line is not a real acceptable
> substitute for an "out of memory" error.
> 
> > +    /* Lookup IP from host name and check against original IP */
> > +    ret = getaddrinfo(port->remote_hostname, NULL, NULL, &gai_result);
> > +    if (ret != 0)
> > +        ereport(ERROR,
> > +                (errcode(ERRCODE_CONFIG_FILE_ERROR),
> > +                 errmsg("getaddrinfo failed on \"%s\": %s",
> > +                        port->remote_hostname, gai_strerror(ret))));
> 
> Is ERRCODE_CONFIG_FILE_ERROR really the most appropriate code here?
> 
>             regards, tom lane
>