Thread: GSSAPI doesn't play nice with non-canonical host names
Whilst trying to reproduce bug #3902 I noticed that the code doesn't
work with an abbreviated host name:

$ psql -l -h rh2.sss.pgh.pa.us
List of databases
... everything's fine ...

$ psql -l -h rh2
psql: GSSAPI continuation error: Unspecified GSS failure. Minor code may provide more information
GSSAPI continuation error: Unknown code krb5 7
$

Considering that my DNS system knows perfectly well what to resolve the
short name to, it seems reasonable that GSSAPI should be able to deal
with it. Looking into the KDC log shows that psql is trying everything
but the right thing for the hostname part of the server principal:

Jan 27 18:41:19 rh2.sss.pgh.pa.us krb5kdc[3993](info): TGS_REQ (7 etypes {18 17 16 23 1 3 2}) 192.168.0.8: UNKNOWN_SERVER: authtime 1201474768, tgl@SSS.PGH.PA.US for postgres/localhost.localdomain@SSS.PGH.PA.US, Server not found in Kerberos database
Jan 27 18:41:19 rh2.sss.pgh.pa.us krb5kdc[3993](info): TGS_REQ (7 etypes {18 17 16 23 1 3 2}) 192.168.0.8: UNKNOWN_SERVER: authtime 1201474768, tgl@SSS.PGH.PA.US for postgres/localhost.localdomain@SSS.PGH.PA.US, Server not found in Kerberos database
Jan 27 18:41:19 rh2.sss.pgh.pa.us krb5kdc[3993](info): TGS_REQ (7 etypes {18 17 16 23 1 3 2}) 192.168.0.8: UNKNOWN_SERVER: authtime 1201474768, tgl@SSS.PGH.PA.US for krbtgt/LOCALDOMAIN@SSS.PGH.PA.US, Server not found in Kerberos database
Jan 27 18:41:19 rh2.sss.pgh.pa.us krb5kdc[3993](info): TGS_REQ (7 etypes {18 17 16 23 1 3 2}) 192.168.0.8: UNKNOWN_SERVER: authtime 1201474768, tgl@SSS.PGH.PA.US for krbtgt/LOCALDOMAIN@SSS.PGH.PA.US, Server not found in Kerberos database
Jan 27 18:41:19 rh2.sss.pgh.pa.us krb5kdc[3993](info): TGS_REQ (7 etypes {18 17 16 23 1 3 2}) 192.168.0.8: UNKNOWN_SERVER: authtime 1201474768, tgl@SSS.PGH.PA.US for krbtgt/US@SSS.PGH.PA.US, Server not found in Kerberos database
Jan 27 18:41:19 rh2.sss.pgh.pa.us krb5kdc[3993](info): TGS_REQ (7 etypes {18 17 16 23 1 3 2}) 192.168.0.8: UNKNOWN_SERVER: authtime 1201474768, tgl@SSS.PGH.PA.US for krbtgt/US@SSS.PGH.PA.US, Server not found in Kerberos database
Jan 27 18:41:19 rh2.sss.pgh.pa.us krb5kdc[3993](info): TGS_REQ (7 etypes {18 17 16 23 1 3 2}) 192.168.0.8: UNKNOWN_SERVER: authtime 1201474768, tgl@SSS.PGH.PA.US for krbtgt/PA.US@SSS.PGH.PA.US, Server not found in Kerberos database
Jan 27 18:41:19 rh2.sss.pgh.pa.us krb5kdc[3993](info): TGS_REQ (7 etypes {18 17 16 23 1 3 2}) 192.168.0.8: UNKNOWN_SERVER: authtime 1201474768, tgl@SSS.PGH.PA.US for krbtgt/PA.US@SSS.PGH.PA.US, Server not found in Kerberos database
Jan 27 18:41:19 rh2.sss.pgh.pa.us krb5kdc[3993](info): TGS_REQ (7 etypes {18 17 16 23 1 3 2}) 192.168.0.8: UNKNOWN_SERVER: authtime 1201474768, tgl@SSS.PGH.PA.US for krbtgt/PGH.PA.US@SSS.PGH.PA.US, Server not found in Kerberos database
Jan 27 18:41:19 rh2.sss.pgh.pa.us krb5kdc[3993](info): TGS_REQ (7 etypes {18 17 16 23 1 3 2}) 192.168.0.8: UNKNOWN_SERVER: authtime 1201474768, tgl@SSS.PGH.PA.US for krbtgt/PGH.PA.US@SSS.PGH.PA.US, Server not found in Kerberos database

This could be a configuration error on my part (I've never set up a
Kerberos server before today) but what it looks like to me is that
something in the GSSAPI library is assuming it's being handed a fully
qualified domain name. Perhaps pg_GSS_startup() shouldn't be using
just conn->pghost, but the fully resolved server domain name?

This is on Fedora 8, krb5-libs-1.6.2-9.fc8.x86_64, in case it matters.

While I'm complaining: that's got to be one of the least useful error
messages I've ever seen, and it's for a case that's surely going to be
fairly common in practice. Can't we persuade GSSAPI to produce
something more user-friendly? At least convert "7" to "Server not
found in Kerberos database"?

regards, tom lane
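
The conversion asked for above ("7" to "Server not found in Kerberos
database") is at bottom a table lookup. Below is a minimal Python sketch
of that idea. It assumes the "krb5 7" in the message is the KDC error
code 7 from RFC 4120, which agrees with the text the KDC itself logged.
The names KDC_ERRORS and describe_krb5_code are hypothetical, and only
three codes are listed; a real client would more likely ask the Kerberos
library for the text than carry its own table.

```python
# Hypothetical lookup table holding a few KDC error codes and their
# standard descriptions as given in RFC 4120. Deliberately incomplete.
KDC_ERRORS = {
    6: ("KDC_ERR_C_PRINCIPAL_UNKNOWN", "Client not found in Kerberos database"),
    7: ("KDC_ERR_S_PRINCIPAL_UNKNOWN", "Server not found in Kerberos database"),
    25: ("KDC_ERR_PREAUTH_REQUIRED", "Additional pre-authentication required"),
}


def describe_krb5_code(code: int) -> str:
    """Return readable text for a KDC error code, or the raw fallback."""
    entry = KDC_ERRORS.get(code)
    if entry is None:
        # Same shape as the unhelpful message quoted in the report.
        return f"Unknown code krb5 {code}"
    name, text = entry
    return f"{text} ({name})"


if __name__ == "__main__":
    print(describe_krb5_code(7))
```

Codes missing from the table fall back to the original "Unknown code
krb5 N" wording, so the lookup can only add information, never hide it.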
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Whilst trying to reproduce bug #3902 I noticed that the code doesn't
> work with an abbreviated host name:
>
> $ psql -l -h rh2.sss.pgh.pa.us
> List of databases
> ... everything's fine ...
>
> $ psql -l -h rh2
> psql: GSSAPI continuation error: Unspecified GSS failure. Minor code may provide more information
> GSSAPI continuation error: Unknown code krb5 7
> $

Testing w/ 8.3RC2, everything seems to be working fine here:

sfrost@snowman:/home/sfrost> psql -d postgres -h snowman
Welcome to psql 8.3RC2, the PostgreSQL interactive terminal.
...

sfrost@snowman:/home/sfrost> klist
Ticket cache: FILE:/tmp/krb5cc_1000
Default principal: sfrost@SNOWMAN.NET

Valid starting     Expires            Service principal
01/27/08 21:14:55  01/28/08 07:14:55  krbtgt/SNOWMAN.NET@SNOWMAN.NET
        renew until 01/28/08 21:14:53
01/27/08 21:14:59  01/28/08 07:14:55  postgres/snowman.snowman.net@SNOWMAN.NET
        renew until 01/28/08 21:14:53

It also picks up on the correct keytab entry on the first shot,
according to my KDC logs (which is what I would certainly expect):

Jan 27 21:14:53 snowman krb5kdc[5008]: AS_REQ (7 etypes {18 17 16 23 1 3 2}) 10.10.1.2: NEEDED_PREAUTH: sfrost@SNOWMAN.NET for krbtgt/SNOWMAN.NET@SNOWMAN.NET, Additional pre-authentication required
Jan 27 21:14:55 snowman krb5kdc[5008]: AS_REQ (7 etypes {18 17 16 23 1 3 2}) 10.10.1.2: ISSUE: authtime 1201486495, etypes {rep=16 tkt=16 ses=16}, sfrost@SNOWMAN.NET for krbtgt/SNOWMAN.NET@SNOWMAN.NET
Jan 27 21:14:59 snowman krb5kdc[5008]: TGS_REQ (7 etypes {18 17 16 23 1 3 2}) 10.10.1.2: ISSUE: authtime 1201486495, etypes {rep=16 tkt=16 ses=16}, sfrost@SNOWMAN.NET for postgres/snowman.snowman.net@SNOWMAN.NET

The first two lines are the krb5 preauth, followed by getting a
ticket-granting-ticket (both a result of my doing a 'kinit'), and then
a few seconds later the postgres/snowman.snowman.net@SNOWMAN.NET ticket
which was requested by psql when I attempted to connect.

> Considering that my DNS system knows perfectly well what to resolve the
> short name to, it seems reasonable that GSSAPI should be able to deal
> with it. Looking into the KDC log shows that psql is trying everything
> but the right thing for the hostname part of the server principal:

GSSAPI definitely should be handling it correctly, but the way it works
normally is perhaps a bit counter-intuitive: it resolves the reverse DNS
of the IP address to which it's connecting and uses that:

sfrost@snowman:/home/sfrost> host snowman.snowman.net
snowman.snowman.net has address 10.10.1.2
sfrost@snowman:/home/sfrost> host 10.10.1.2
2.1.10.10.in-addr.arpa domain name pointer snowman.snowman.net.

I'm guessing from the KDC logs that your 'rh2' is actually an alias in
/etc/hosts for 127.0.0.1, and localhost.localdomain, which is why it's
trying that first. Putting in the FQDN is actually a work-around for
hosts which don't have a proper reverse DNS (we use it quite a bit when
we have remote servers that we use kerberos with but we don't control
the reverse DNS of), so I'm not too surprised that works. As for why it
doesn't eventually take whatever you gave it and tack it on to the
front of the local domain, I'm not sure. There could be some room for
improvement here, certainly, but I'm pretty sure this is the same
behaviour the Kerberos auth mechanism would give you and isn't
surprising to people who use GSSAPI/Kerberos.

> This could be a configuration error on my part (I've never set up a
> Kerberos server before today) but what it looks like to me is that
> something in the GSSAPI library is assuming it's being handed a fully
> qualified domain name. Perhaps pg_GSS_startup() shouldn't be using
> just conn->pghost, but the fully resolved server domain name?

I'm not sure if that would actually help or not, and I wouldn't want to
confuse things for the GSSAPI library by passing it an FQDN when the
client didn't actually send one and perhaps bypass the reverse-DNS check
(which could break things: there are a lot of cases where a given host
can have multiple hostnames while all of its IP addresses do resolve to
the same 'main' hostname, which is what is used for Kerberos auth).

> While I'm complaining: that's got to be one of the least useful error
> messages I've ever seen, and it's for a case that's surely going to be
> fairly common in practice. Can't we persuade GSSAPI to produce
> something more user-friendly? At least convert "7" to "Server not
> found in Kerberos database"?

I agree, and have found it to be very frustrating while working w/
Kerberos in general. I *think* there's a library which can convert
those error-codes (libcomm-err?), but I've not really looked into it
yet.

Thanks,

Stephen
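
The reverse-DNS behaviour described in this message can be sketched in a
few lines of Python. This is only an illustration of the idea (resolve
the name forward, resolve the address back, use the result as the host
part of the service principal), not the Kerberos library's actual code.
The function names canonical_host and service_principal are
hypothetical, and the stub resolvers in the demo stand in for real DNS
so the two cases from the thread can be shown without a network; the
127.0.0.1 mapping for 'rh2' is the guess made above, not a confirmed
fact.

```python
import socket


def canonical_host(name, forward=socket.gethostbyname,
                   reverse=socket.gethostbyaddr):
    """Forward-resolve `name`, then reverse-resolve the address.

    Falls back to the name as given when no reverse entry exists.
    """
    addr = forward(name)
    try:
        # gethostbyaddr returns (hostname, aliaslist, ipaddrlist).
        canonical = reverse(addr)[0]
    except OSError:  # socket.herror and socket.gaierror are subclasses
        return name
    return canonical.rstrip(".")


def service_principal(service, name, realm, **resolvers):
    """Build service/host@REALM using the canonicalized host name."""
    return f"{service}/{canonical_host(name, **resolvers)}@{realm}"


if __name__ == "__main__":
    # Stub resolvers (assumptions) modelling the two hosts in the thread.
    addrs = {"snowman": "10.10.1.2", "rh2": "127.0.0.1"}
    names = {"10.10.1.2": "snowman.snowman.net",
             "127.0.0.1": "localhost.localdomain"}

    def fwd(host):
        return addrs[host]

    def rev(addr):
        return (names[addr], [], [addr])

    print(service_principal("postgres", "snowman", "SNOWMAN.NET",
                            forward=fwd, reverse=rev))
    print(service_principal("postgres", "rh2", "SSS.PGH.PA.US",
                            forward=fwd, reverse=rev))
```

With these stubs the second call yields a principal for
localhost.localdomain, the same host part that shows up in the failing
TGS_REQ lines of the original report.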
Stephen Frost <sfrost@snowman.net> writes:
> * Tom Lane (tgl@sss.pgh.pa.us) wrote:
>> Whilst trying to reproduce bug #3902 I noticed that the code doesn't
>> work with an abbreviated host name:

> Testing w/ 8.3RC2, everything seems to be working fine here:

Okay, that probably means there's something wacko about my Kerberos
setup. It's quite likely got something to do with the fact that I
set up the KDC on the same machine where I'm doing the PG testing,
which is surely a case that would never be sane in practice.

[ thinks for a bit... ] In this context there's some ambiguity as to
whether 'rh2' should resolve as 127.0.0.1 or the machine's real IP
address, and no doubt something is making the wrong choice someplace.
That's probably how the localdomain lookups got into it.

regards, tom lane
On Sun, Jan 27, 2008 at 09:51:48PM -0500, Tom Lane wrote:
> Stephen Frost <sfrost@snowman.net> writes:
> > * Tom Lane (tgl@sss.pgh.pa.us) wrote:
> >> Whilst trying to reproduce bug #3902 I noticed that the code doesn't
> >> work with an abbreviated host name:
>
> > Testing w/ 8.3RC2, everything seems to be working fine here:
>
> Okay, that probably means there's something wacko about my Kerberos
> setup. It's quite likely got something to do with the fact that I
> set up the KDC on the same machine where I'm doing the PG testing,
> which is surely a case that would never be sane in practice.
>
> [ thinks for a bit... ] In this context there's some ambiguity as to
> whether 'rh2' should resolve as 127.0.0.1 or the machine's real IP
> address, and no doubt something is making the wrong choice someplace.
> That's probably how the localdomain lookups got into it.

Sounds likely. FWIW, DNS issues are by far the most common problem with
Kerberos installations - at least they are on Windows.

//Magnus
On Sun, Jan 27, 2008 at 09:32:54PM -0500, Stephen Frost wrote:
> > While I'm complaining: that's got to be one of the least useful error
> > messages I've ever seen, and it's for a case that's surely going to be
> > fairly common in practice. Can't we persuade GSSAPI to produce
> > something more user-friendly? At least convert "7" to "Server not
> > found in Kerberos database"?
>
> I agree, and have found it to be very frustrating while working w/
> Kerberos in general. I *think* there's a library which can convert
> those error-codes (libcomm-err?), but I've not really looked into it
> yet.

AFAIK, that one is for Kerberos only. For GSSAPI, we already use the
gss_display_status function to get the error messages. I think the problem
here is in the Kerberos library?

//Magnus
Magnus Hagander <magnus@hagander.net> writes:
>>> While I'm complaining: that's got to be one of the least useful error
>>> messages I've ever seen, and it's for a case that's surely going to be
>>> fairly common in practice.

> AFAIK, that one is for Kerberos only. For GSSAPI, we already use the
> gss_display_status function to get the error messages. I think the problem
> here is in the Kerberos library?

Yeah, I had verified by tracing through it that the text was just what
gss_display_status gave us. It could be that it's just plain broken,
but I was sort of hoping that we were using it incorrectly or that
there's some magic flag to set to get better messages out of it.

regards, tom lane
Magnus Hagander <magnus@hagander.net> writes:
> On Sun, Jan 27, 2008 at 09:32:54PM -0500, Stephen Frost wrote:
>>> While I'm complaining: that's got to be one of the least useful error
>>> messages I've ever seen, and it's for a case that's surely going to be
>>> fairly common in practice. Can't we persuade GSSAPI to produce
>>> something more user-friendly? At least convert "7" to "Server not
>>> found in Kerberos database"?
>>
>> I agree, and have found it to be very frustrating while working w/
>> Kerberos in general. I *think* there's a library which can convert
>> those error-codes (libcomm-err?), but I've not really looked into it
>> yet.

> AFAIK, that one is for Kerberos only. For GSSAPI, we already use the
> gss_display_status function to get the error messages. I think the problem
> here is in the Kerberos library?

Yeah, I found it:
https://bugzilla.redhat.com/show_bug.cgi?id=430983
The best fix is not entirely clear, but in any case it's not our bug.

regards, tom lane