Thread: GSSAPI doesn't play nice with non-canonical host names

GSSAPI doesn't play nice with non-canonical host names

From
Tom Lane
Date:
Whilst trying to reproduce bug #3902 I noticed that the code doesn't
work with an abbreviated host name:

$ psql -l -h rh2.sss.pgh.pa.us
       List of databases
       ... everything's fine ...

$ psql -l -h rh2              
psql: GSSAPI continuation error: Unspecified GSS failure.  Minor code may provide more information
GSSAPI continuation error: Unknown code krb5 7
$

Considering that my DNS system knows perfectly well what to resolve the
short name to, it seems reasonable that GSSAPI should be able to deal
with it.  Looking into the KDC log shows that psql is trying everything
but the right thing for the hostname part of the server principal:

Jan 27 18:41:19 rh2.sss.pgh.pa.us krb5kdc[3993](info): TGS_REQ (7 etypes {18 17 16 23 1 3 2}) 192.168.0.8:
UNKNOWN_SERVER: authtime 1201474768,  tgl@SSS.PGH.PA.US for postgres/localhost.localdomain@SSS.PGH.PA.US, Server not found in Kerberos database
Jan 27 18:41:19 rh2.sss.pgh.pa.us krb5kdc[3993](info): TGS_REQ (7 etypes {18 17 16 23 1 3 2}) 192.168.0.8:
UNKNOWN_SERVER: authtime 1201474768,  tgl@SSS.PGH.PA.US for postgres/localhost.localdomain@SSS.PGH.PA.US, Server not found in Kerberos database
Jan 27 18:41:19 rh2.sss.pgh.pa.us krb5kdc[3993](info): TGS_REQ (7 etypes {18 17 16 23 1 3 2}) 192.168.0.8:
UNKNOWN_SERVER: authtime 1201474768,  tgl@SSS.PGH.PA.US for krbtgt/LOCALDOMAIN@SSS.PGH.PA.US, Server not found in Kerberos database
Jan 27 18:41:19 rh2.sss.pgh.pa.us krb5kdc[3993](info): TGS_REQ (7 etypes {18 17 16 23 1 3 2}) 192.168.0.8:
UNKNOWN_SERVER: authtime 1201474768,  tgl@SSS.PGH.PA.US for krbtgt/LOCALDOMAIN@SSS.PGH.PA.US, Server not found in Kerberos database
Jan 27 18:41:19 rh2.sss.pgh.pa.us krb5kdc[3993](info): TGS_REQ (7 etypes {18 17 16 23 1 3 2}) 192.168.0.8:
UNKNOWN_SERVER: authtime 1201474768,  tgl@SSS.PGH.PA.US for krbtgt/US@SSS.PGH.PA.US, Server not found in Kerberos database
Jan 27 18:41:19 rh2.sss.pgh.pa.us krb5kdc[3993](info): TGS_REQ (7 etypes {18 17 16 23 1 3 2}) 192.168.0.8:
UNKNOWN_SERVER: authtime 1201474768,  tgl@SSS.PGH.PA.US for krbtgt/US@SSS.PGH.PA.US, Server not found in Kerberos database
Jan 27 18:41:19 rh2.sss.pgh.pa.us krb5kdc[3993](info): TGS_REQ (7 etypes {18 17 16 23 1 3 2}) 192.168.0.8:
UNKNOWN_SERVER: authtime 1201474768,  tgl@SSS.PGH.PA.US for krbtgt/PA.US@SSS.PGH.PA.US, Server not found in Kerberos database
Jan 27 18:41:19 rh2.sss.pgh.pa.us krb5kdc[3993](info): TGS_REQ (7 etypes {18 17 16 23 1 3 2}) 192.168.0.8:
UNKNOWN_SERVER: authtime 1201474768,  tgl@SSS.PGH.PA.US for krbtgt/PA.US@SSS.PGH.PA.US, Server not found in Kerberos database
Jan 27 18:41:19 rh2.sss.pgh.pa.us krb5kdc[3993](info): TGS_REQ (7 etypes {18 17 16 23 1 3 2}) 192.168.0.8:
UNKNOWN_SERVER: authtime 1201474768,  tgl@SSS.PGH.PA.US for krbtgt/PGH.PA.US@SSS.PGH.PA.US, Server not found in Kerberos database
Jan 27 18:41:19 rh2.sss.pgh.pa.us krb5kdc[3993](info): TGS_REQ (7 etypes {18 17 16 23 1 3 2}) 192.168.0.8:
UNKNOWN_SERVER: authtime 1201474768,  tgl@SSS.PGH.PA.US for krbtgt/PGH.PA.US@SSS.PGH.PA.US, Server not found in Kerberos database

This could be a configuration error on my part (I've never set up a
Kerberos server before today) but what it looks like to me is that
something in the GSSAPI library is assuming it's being handed a fully
qualified domain name.  Perhaps pg_GSS_startup() shouldn't be using
just conn->pghost, but the fully resolved server domain name?
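
To make that suggestion concrete, here is a minimal Python sketch of
canonicalizing the host before building the service name.  It is not
libpq code: the helpers canonical_host() and service_name(), the
injectable resolver argument, and the "service@host" form are all
assumptions made for illustration, and any real change would be in C
inside pg_GSS_startup().

```python
import socket


def canonical_host(host, resolver=socket.getaddrinfo):
    """Return the canonical name the resolver reports for host.

    Falls back to the name as given when no canonical name comes back.
    The resolver argument exists only so the lookup can be faked.
    """
    infos = resolver(host, None, 0, socket.SOCK_STREAM, 0,
                     socket.AI_CANONNAME)
    # With AI_CANONNAME, the canonical name is carried by the first result.
    canonname = infos[0][3] if infos else ""
    return canonname or host


def service_name(service, host, resolver=socket.getaddrinfo):
    """Build a host-based service name of the assumed form service@host."""
    return "%s@%s" % (service, canonical_host(host, resolver))


def fake_resolver(host, port, family, socktype, proto, flags):
    # Hypothetical resolver answer mimicking the machine in this report.
    return [(socket.AF_INET, socket.SOCK_STREAM, 6,
             "rh2.sss.pgh.pa.us", ("192.168.0.8", 0))]


print(service_name("postgres", "rh2", fake_resolver))
```

Whether handing GSSAPI a pre-resolved name is a good idea is a separate
question; the sketch only shows what "use the fully resolved name"
would amount to.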

This is on Fedora 8, krb5-libs-1.6.2-9.fc8.x86_64, in case it matters.

While I'm complaining: that's got to be one of the least useful error
messages I've ever seen, and it's for a case that's surely going to be
fairly common in practice.  Can't we persuade GSSAPI to produce
something more user-friendly?  At least convert "7" to "Server not
found in Kerberos database"?
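
As an illustration of the kind of conversion meant here, a minimal
Python sketch that post-processes the message text.  The function name
friendlier() and the idea of rewriting the string after the fact are
assumptions for illustration only, not anything libpq or the GSSAPI
library does; the two message texts are the standard Kerberos KDC error
codes 6 and 7.

```python
import re

# Small subset of the Kerberos KDC error codes (RFC 4120).
KDC_ERRORS = {
    6: "Client not found in Kerberos database",
    7: "Server not found in Kerberos database",
}

_UNKNOWN_CODE = re.compile(r"Unknown code krb5 (\d+)")


def friendlier(message):
    """Replace 'Unknown code krb5 N' with the text for N, when known."""
    def replace(match):
        text = KDC_ERRORS.get(int(match.group(1)))
        # Leave the original wording alone for codes not in the table.
        return text if text is not None else match.group(0)
    return _UNKNOWN_CODE.sub(replace, message)


print(friendlier("GSSAPI continuation error: Unknown code krb5 7"))
```

Codes missing from the table pass through unchanged, so the sketch
never makes a message less informative than it already was.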
        regards, tom lane


Re: GSSAPI doesn't play nice with non-canonical host names

From
Stephen Frost
Date:
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Whilst trying to reproduce bug #3902 I noticed that the code doesn't
> work with an abbreviated host name:
>
> $ psql -l -h rh2.sss.pgh.pa.us
>        List of databases
>        ... everything's fine ...
>
> $ psql -l -h rh2
> psql: GSSAPI continuation error: Unspecified GSS failure.  Minor code may provide more information
> GSSAPI continuation error: Unknown code krb5 7
> $

Testing w/ 8.3RC2, everything seems to be working fine here:

sfrost@snowman:/home/sfrost> psql -d postgres -h snowman
Welcome to psql 8.3RC2, the PostgreSQL interactive terminal.
...

sfrost@snowman:/home/sfrost> klist
Ticket cache: FILE:/tmp/krb5cc_1000
Default principal: sfrost@SNOWMAN.NET

Valid starting     Expires            Service principal
01/27/08 21:14:55  01/28/08 07:14:55  krbtgt/SNOWMAN.NET@SNOWMAN.NET
        renew until 01/28/08 21:14:53
01/27/08 21:14:59  01/28/08 07:14:55  postgres/snowman.snowman.net@SNOWMAN.NET
        renew until 01/28/08 21:14:53

It also picks up on the correct keytab entry on the first shot,
according to my KDC logs (which is what I would certainly expect):

Jan 27 21:14:53 snowman krb5kdc[5008]: AS_REQ (7 etypes {18 17 16 23 1 3 2}) 10.10.1.2: NEEDED_PREAUTH:
sfrost@SNOWMAN.NET for krbtgt/SNOWMAN.NET@SNOWMAN.NET, Additional pre-authentication required
Jan 27 21:14:55 snowman krb5kdc[5008]: AS_REQ (7 etypes {18 17 16 23 1 3 2}) 10.10.1.2: ISSUE: authtime 1201486495,
etypes {rep=16 tkt=16 ses=16}, sfrost@SNOWMAN.NET for krbtgt/SNOWMAN.NET@SNOWMAN.NET
Jan 27 21:14:59 snowman krb5kdc[5008]: TGS_REQ (7 etypes {18 17 16 23 1 3 2}) 10.10.1.2: ISSUE: authtime 1201486495,
etypes {rep=16 tkt=16 ses=16}, sfrost@SNOWMAN.NET for postgres/snowman.snowman.net@SNOWMAN.NET

The first two lines are the krb5 preauth, followed by getting a
ticket-granting-ticket (both a result of my doing a 'kinit'), and then a
few seconds later the postgres/snowman.snowman.net@SNOWMAN.NET ticket
which was requested by psql when I attempted to connect.

> Considering that my DNS system knows perfectly well what to resolve the
> short name to, it seems reasonable that GSSAPI should be able to deal
> with it.  Looking into the KDC log shows that psql is trying everything
> but the right thing for the hostname part of the server principal:

GSSAPI definitely should be handling it correctly, but the way it works
normally is perhaps a bit counter-intuitive: it resolves the reverse DNS
of the IP address to which it's connecting and uses that:

sfrost@snowman:/home/sfrost> host snowman.snowman.net
snowman.snowman.net has address 10.10.1.2
sfrost@snowman:/home/sfrost> host 10.10.1.2
2.1.10.10.in-addr.arpa domain name pointer snowman.snowman.net.

I'm guessing from the KDC logs that your 'rh2' is actually an alias in
/etc/hosts for 127.0.0.1, and localhost.localdomain, which is why it's
trying that first.  Putting in the FQDN is actually a work-around for
hosts which don't have a proper reverse DNS (we use it quite a bit when
we have remote servers that we use kerberos with but we don't control
the reverse DNS of), so I'm not too surprised that works.
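
Roughly, the lookup sequence described above can be sketched in Python
as follows.  This is only an approximation of the behaviour, not the
Kerberos library's actual code: principal_host() and server_principal()
are hypothetical names, the resolvers are injectable so the example
runs without DNS, and the fallback to the name as given when the
reverse lookup fails is an assumption.

```python
import socket


def principal_host(host, forward=socket.gethostbyname,
                   reverse=socket.gethostbyaddr):
    """Forward-resolve host, then reverse-resolve the resulting address."""
    address = forward(host)
    try:
        name, _aliases, _addresses = reverse(address)
    except OSError:
        # No usable reverse DNS: assume the name is used as given.
        return host
    return name


def server_principal(service, host, realm, **resolvers):
    """Build service/hostname@REALM from the reverse-resolved host name."""
    return "%s/%s@%s" % (service, principal_host(host, **resolvers), realm)


# Hypothetical lookup tables: 'rh2' as an /etc/hosts alias for loopback,
# 'snowman' with matching forward and reverse DNS.
FORWARD = {"rh2": "127.0.0.1", "snowman": "10.10.1.2"}
REVERSE = {"127.0.0.1": "localhost.localdomain",
           "10.10.1.2": "snowman.snowman.net"}


def fake_forward(host):
    return FORWARD[host]


def fake_reverse(address):
    return (REVERSE[address], [], [address])


print(server_principal("postgres", "rh2", "SSS.PGH.PA.US",
                       forward=fake_forward, reverse=fake_reverse))
print(server_principal("postgres", "snowman", "SNOWMAN.NET",
                       forward=fake_forward, reverse=fake_reverse))
```

With the loopback alias in place, the first call yields the
postgres/localhost.localdomain principal seen in the KDC log, while the
second yields the principal from my klist output.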

As for why it doesn't eventually take whatever you gave it and tack it
on to the front of the local domain, I'm not sure.  There could be some
room for improvement here, certainly, but I'm pretty sure this is the
same behaviour the Kerberos auth mechanism would give you and isn't
surprising to people who use GSSAPI/Kerberos.

> This could be a configuration error on my part (I've never set up a
> Kerberos server before today) but what it looks like to me is that
> something in the GSSAPI library is assuming it's being handed a fully
> qualified domain name.  Perhaps pg_GSS_startup() shouldn't be using
> just conn->pghost, but the fully resolved server domain name?

I'm not sure if that would actually help or not and I wouldn't want to
confuse things for the GSSAPI library by passing it an FQDN when the
client didn't actually send one and perhaps bypass the reverse-DNS check
(which could break things: there are a lot of cases where a given host
can have multiple hostnames while all of its IP addresses do resolve to
the same 'main' hostname which is what is used for Kerberos auth).

> While I'm complaining: that's got to be one of the least useful error
> messages I've ever seen, and it's for a case that's surely going to be
> fairly common in practice.  Can't we persuade GSSAPI to produce
> something more user-friendly?  At least convert "7" to "Server not
> found in Kerberos database"?

I agree, and have found it to be very frustrating while working w/
Kerberos in general.  I *think* there's a library which can convert
those error-codes (libcomm-err?), but I've not really looked into it
yet.
        Thanks,

                Stephen

Re: GSSAPI doesn't play nice with non-canonical host names

From
Tom Lane
Date:
Stephen Frost <sfrost@snowman.net> writes:
> * Tom Lane (tgl@sss.pgh.pa.us) wrote:
>> Whilst trying to reproduce bug #3902 I noticed that the code doesn't
>> work with an abbreviated host name:

> Testing w/ 8.3RC2, everything seems to be working fine here:

Okay, that probably means there's something wacko about my Kerberos
setup.  It's quite likely got something to do with the fact that I
set up the KDC on the same machine where I'm doing the PG testing,
which is surely a case that would never be sane in practice.

[ thinks for a bit... ]  In this context there's some ambiguity as to
whether 'rh2' should resolve as 127.0.0.1 or the machine's real IP
address, and no doubt something is making the wrong choice someplace.
That's probably how the localdomain lookups got into it.
        regards, tom lane


Re: GSSAPI doesn't play nice with non-canonical host names

From
Magnus Hagander
Date:
On Sun, Jan 27, 2008 at 09:51:48PM -0500, Tom Lane wrote:
> Stephen Frost <sfrost@snowman.net> writes:
> > * Tom Lane (tgl@sss.pgh.pa.us) wrote:
> >> Whilst trying to reproduce bug #3902 I noticed that the code doesn't
> >> work with an abbreviated host name:
> 
> > Testing w/ 8.3RC2, everything seems to be working fine here:
> 
> Okay, that probably means there's something wacko about my Kerberos
> setup.  It's quite likely got something to do with the fact that I
> set up the KDC on the same machine where I'm doing the PG testing,
> which is surely a case that would never be sane in practice.
> 
> [ thinks for a bit... ]  In this context there's some ambiguity as to
> whether 'rh2' should resolve as 127.0.0.1 or the machine's real IP
> address, and no doubt something is making the wrong choice someplace.
> That's probably how the localdomain lookups got into it.

Sounds likely. FWIW, DNS issues are by far the most common problem with
Kerberos installations - at least they are on Windows.

//Magnus


Re: GSSAPI doesn't play nice with non-canonical host names

From
Magnus Hagander
Date:
On Sun, Jan 27, 2008 at 09:32:54PM -0500, Stephen Frost wrote:
> 
> > While I'm complaining: that's got to be one of the least useful error
> > messages I've ever seen, and it's for a case that's surely going to be
> > fairly common in practice.  Can't we persuade GSSAPI to produce
> > something more user-friendly?  At least convert "7" to "Server not
> > found in Kerberos database"?
> 
> I agree, and have found it to be very frustrating while working w/
> Kerberos in general.  I *think* there's a library which can convert
> those error-codes (libcomm-err?), but I've not really looked into it
> yet.

AFAIK, that one is for Kerberos only. For GSSAPI, we already use the
gss_display_status function to get the error messages. I think the problem
here is in the Kerberos library?

//Magnus


Re: GSSAPI doesn't play nice with non-canonical host names

From
Tom Lane
Date:
Magnus Hagander <magnus@hagander.net> writes:
>>> While I'm complaining: that's got to be one of the least useful error
>>> messages I've ever seen, and it's for a case that's surely going to be
>>> fairly common in practice.

> AFAIK, that one is for Kerberos only. For GSSAPI, we already use the
> gss_display_status function to get the error messages. I think the problem
> here is in the Kerberos library?

Yeah, I had verified by tracing through it that the text was just what
gss_display_status gave us.  It could be that it's just plain broken,
but I was sort of hoping that we were using it incorrectly or that
there's some magic flag to set to get better messages out of it.
        regards, tom lane


Re: GSSAPI doesn't play nice with non-canonical host names

From
Tom Lane
Date:
Magnus Hagander <magnus@hagander.net> writes:
> On Sun, Jan 27, 2008 at 09:32:54PM -0500, Stephen Frost wrote:
>>> While I'm complaining: that's got to be one of the least useful error
>>> messages I've ever seen, and it's for a case that's surely going to be
>>> fairly common in practice.  Can't we persuade GSSAPI to produce
>>> something more user-friendly?  At least convert "7" to "Server not
>>> found in Kerberos database"?
>> 
>> I agree, and have found it to be very frustrating while working w/
>> Kerberos in general.  I *think* there's a library which can convert
>> those error-codes (libcomm-err?), but I've not really looked into it
>> yet.

> AFAIK, that one is for Kerberos only. For GSSAPI, we already use the
> gss_display_status function to get the error messages. I think the problem
> here is in the Kerberos library?

Yeah, I found it:
https://bugzilla.redhat.com/show_bug.cgi?id=430983

The best fix is not entirely clear, but in any case it's not our bug.
        regards, tom lane