Re: Ident authentication fails due to bind error on server (8.4.8) - Mailing list pgsql-bugs

From Tom Lane
Subject Re: Ident authentication fails due to bind error on server (8.4.8)
Msg-id 17691.1308325281@sss.pgh.pa.us
In response to Ident authentication fails due to bind error on server (8.4.8)  ("Marinos Yannikos" <mjy@geizhals.at>)
Responses Re: Ident authentication fails due to bind error on server (8.4.8)
List pgsql-bugs
"Marinos Yannikos" <mjy@geizhals.at> writes:
> I'm not sure that this is not a configuration or networking issue (so
> apologies if it is), but we seem to be getting rare (a few times/day)
> failures with ident authentication because several clients attempt to do
> it simultaneously over a high-latency connection (capitalized = edited
> IPs/username etc.):

> [DB CLIENTADDR(51985) 3173 2011-06-17 10:49:56 CEST] LOG:  could not bind
> to local address "SERVERADDR": Address already in use
> [DB CLIENTADDR(51985) 3173 2011-06-17 10:49:56 CEST] FATAL:  Ident
> authentication failed for user "USER"

Hm.  What platform is this on?

> Is this a possible race condition in src/backend/libpq/auth.c ?

I don't think it's a race condition per se.  The code ought to be
setting up the address argument for bind() with sin_port = 0 so that
an unused port number gets assigned.  That seems to be what happens on
a couple of machines that I tried here, but I notice that the Linux
manpage for getaddrinfo says

    service sets the port in each returned address structure.  If
    this argument is a service name (see services(5)), it is
    translated to the corresponding port number.  This argument can
    also be specified as a decimal number, which is simply converted
    to binary.  If service is NULL, then the port number of the
    returned socket addresses will be left uninitialized.

In principle this wording would allow getaddrinfo to return the same
nonzero port number in multiple backends, which would lead to the
reported failure if they were doing ident verification at the same time.
I'm thinking maybe we should explicitly pass "0" rather than NULL to
getaddrinfo here.  On the other hand, it seems to work reliably as-is
on my Linux machine, so this is just speculation at this point.
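
A minimal sketch of that idea, not the actual auth.c code: the helper
name bind_ephemeral and the use of AF_INET with AI_NUMERICHOST are
assumptions made for illustration only.  It passes the string "0" as
the service so that sin_port is defined to be zero, binds, and then
asks the kernel which port it actually assigned.

```c
#define _POSIX_C_SOURCE 200112L

#include <assert.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>
#include <netinet/in.h>

/*
 * Hypothetical helper, for illustration only (not taken from auth.c).
 *
 * Resolve a numeric IPv4 address with service "0" instead of NULL, bind a
 * TCP socket to the result, and return the port the kernel assigned (host
 * byte order).  Returns -1 on any failure.
 */
static int
bind_ephemeral(const char *local_host)
{
    struct addrinfo hints;
    struct addrinfo *res = NULL;
    struct sockaddr_storage bound;
    socklen_t   len = sizeof(bound);
    int         fd;
    int         port = -1;

    assert(local_host != NULL);

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_INET;          /* assumption: IPv4 only */
    hints.ai_socktype = SOCK_STREAM;
    hints.ai_flags = AI_NUMERICHOST;    /* no DNS lookup */

    /* "0" rather than NULL: the returned address carries port 0 */
    if (getaddrinfo(local_host, "0", &hints, &res) != 0)
        return -1;

    fd = socket(res->ai_family, res->ai_socktype, res->ai_protocol);
    if (fd >= 0)
    {
        /* With port 0, bind() lets the kernel pick an unused port */
        if (bind(fd, res->ai_addr, res->ai_addrlen) == 0 &&
            getsockname(fd, (struct sockaddr *) &bound, &len) == 0)
            port = ntohs(((struct sockaddr_in *) &bound)->sin_port);
        close(fd);
    }

    freeaddrinfo(res);
    return port;
}
```

Calling bind_ephemeral("127.0.0.1") should return some nonzero port
chosen by the kernel, and a string that is not a numeric address should
make it return -1.  Whether NULL actually misbehaves on the reporter's
platform is still unconfirmed; this only shows the explicit-"0" variant.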

(BTW, is it really sane to be using ident auth over a "high latency
connection"?  That would certainly suggest to me that you could be
getting connections from untrustworthy machines ...)

            regards, tom lane
