Thread: WIP: remove use of flat auth file for client authentication

WIP: remove use of flat auth file for client authentication

From
Tom Lane
Date:
Attached is a patch that removes the use of the flat auth file during
client authentication, instead using regular access to the pg_auth
catalogs.  As previously discussed, this implies pushing the
authentication work down to InitPostgres.  I didn't yet do anything
about the idea of falling back to connecting to "postgres" when the
specified target DB doesn't exist, but other than that small change
I think it's about ready to go.

The main thing that turned out to be unexpectedly, um, "interesting" was
signal handling.  What we have been doing during client authentication
is to point SIGTERM, SIGALRM, etc to the "authdie" handler, which just
does "proc_exit(1)" directly.  This will *not* do once we've entered a
transaction --- in general you can't expect to just interrupt the
backend in a random place and start doing something different.  I'm not
really sure it was safe even during authentication, but for sure it's
not safe during database access.  The solution that's adopted here is to
use the regular backend SIGINT, SIGTERM, SIGQUIT, and SIGALRM signal
handling, and to turn on ImmediateInterruptOK while we are doing client
interactions during authentication, but not while we are doing database
accesses during authentication.  It's a bit ugly because the signal
handlers have to be taught to do something different when
ClientAuthInProgress, but I don't see a better way.

Another interesting point is that for this to work, those signal
interrupts have to actually be enabled (doh) ... and up to now we have
been running InitPostgres with most signals disabled.  I suspect that
this means some things are actively broken during InitPostgres's initial
transaction --- for example, if it happens to try to take a lock that
completes a deadlock cycle, the deadlock won't be detected because the
lock timeout SIGALRM interrupt will never occur.  Another example is
that SI inval messaging isn't working during InitPostgres either.
The patch addresses this by moving up PostgresMain's
PG_SETMASK(&UnBlockSig); call to before InitPostgres.  We might need to
back-patch that bit, though I'm hesitant to fool with such a thing in
back branches.

Thoughts, objections, better ideas?

            regards, tom lane


Attachment

Re: WIP: remove use of flat auth file for client authentication

From
Greg Stark
Date:
On Sat, Aug 29, 2009 at 6:00 AM, Tom Lane<tgl@sss.pgh.pa.us> wrote:
> Attached is a patch that removes the use of the flat auth file during
> client authentication, instead using regular access to the pg_auth
> catalogs.  As previously discussed, this implies pushing the
> authentication work down to InitPostgres.  I didn't yet do anything
> about the idea of falling back to connecting to "postgres" when the
> specified target DB doesn't exist, but other than that small change
> I think it's about ready to go.

Falling back to connecting to "postgres" seems unnecessarily complex to me.

> Another interesting point is that for this to work, those signal
> interrupts have to actually be enabled (doh) ... and up to now we have
> been running InitPostgres with most signals disabled.  I suspect that
> this means some things are actively broken during InitPostgres's initial
> transaction --- for example, if it happens to try to take a lock that
> completes a deadlock cycle, the deadlock won't be detected because the
> lock timeout SIGALRM interrupt will never occur.  Another example is
> that SI inval messaging isn't working during InitPostgres either.
> The patch addresses this by moving up PostgresMain's
> PG_SETMASK(&UnBlockSig); call to before InitPostgres.  We might need to
> back-patch that bit, though I'm hesitant to fool with such a thing in
> back branches.

The deadlock can only fail to be detected by someone else if the whole
initpostgres thing takes longer than deadlock_timout I think. So it
doesn't seem very likely. Not sure how likely problems due to missed
SI messages are.

>
> Thoughts, objections, better ideas?
>
>                        regards, tom lane
>
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>
>



--
greg
http://mit.edu/~gsstark/resume.pdf


Re: WIP: remove use of flat auth file for client authentication

From
Simon Riggs
Date:
On Sat, 2009-08-29 at 01:00 -0400, Tom Lane wrote:

> Attached is a patch that removes the use of the flat auth file during
> client authentication, instead using regular access to the pg_auth
> catalogs.  As previously discussed, this implies pushing the
> authentication work down to InitPostgres.  I didn't yet do anything
> about the idea of falling back to connecting to "postgres" when the
> specified target DB doesn't exist, but other than that small change
> I think it's about ready to go.

I get the feeling that part of the inspiration for this is that Hot
Standby must maintain this file. If not, I'm curious as to the reasons
for doing this. No objections however, just curiosity.

Specifically, should I remove the parts of the HS patch that refresh
those files?

> I suspect that
> this means some things are actively broken during InitPostgres's
> initial
> transaction --- for example, if it happens to try to take a lock that
> completes a deadlock cycle, the deadlock won't be detected because the
> lock timeout SIGALRM interrupt will never occur. Another example is
> that SI inval messaging isn't working during InitPostgres either.

Are we doing anything in the initial transaction that *could* deadlock,
or cause an SI inval message?

-- Simon Riggs           www.2ndQuadrant.com



Re: WIP: remove use of flat auth file for client authentication

From
Stephen Frost
Date:
* Simon Riggs (simon@2ndQuadrant.com) wrote:
> I get the feeling that part of the inspiration for this is that Hot
> Standby must maintain this file. If not, I'm curious as to the reasons
> for doing this. No objections however, just curiosity.

The impetus for these changes was the performance complaint that adding
new users is extremely expensive once the files get to be large.
Solving that kind of a problem typically involves moving away from flat
files and to a database.  Thankfully, we happen to have a database
already built-in. ;)

> Specifically, should I remove the parts of the HS patch that refresh
> those files?

I'd probably wait till it's committed and then rip it out as part of
re-baseing against CVS.  That's just my 2c on it and I've no clue how
invasive the changes to HS are to do that.  If it's easy to separate the
changes and remove/add them as needed, then perhaps it doesn't matter if
you do that now or wait.
Thanks,
    Stephen

Re: WIP: remove use of flat auth file for client authentication

From
Tom Lane
Date:
Greg Stark <gsstark@mit.edu> writes:
> On Sat, Aug 29, 2009 at 6:00 AM, Tom Lane<tgl@sss.pgh.pa.us> wrote:
>> ... I didn't yet do anything
>> about the idea of falling back to connecting to "postgres" when the
>> specified target DB doesn't exist, but other than that small change
>> I think it's about ready to go.

> Falling back to connecting to "postgres" seems unnecessarily complex to me.

If that's the consensus I don't have a problem with it ... it's
certainly something we could add later if anyone complains.

>> Another interesting point is that for this to work, those signal
>> interrupts have to actually be enabled (doh) ... and up to now we have
>> been running InitPostgres with most signals disabled. �I suspect that
>> this means some things are actively broken during InitPostgres's initial
>> transaction --- for example, if it happens to try to take a lock that
>> completes a deadlock cycle, the deadlock won't be detected because the
>> lock timeout SIGALRM interrupt will never occur. �Another example is
>> that SI inval messaging isn't working during InitPostgres either.
>> The patch addresses this by moving up PostgresMain's
>> PG_SETMASK(&UnBlockSig); call to before InitPostgres. �We might need to
>> back-patch that bit, though I'm hesitant to fool with such a thing in
>> back branches.

> The deadlock can only fail to be detected by someone else if the whole
> initpostgres thing takes longer than deadlock_timout I think. So it
> doesn't seem very likely. Not sure how likely problems due to missed
> SI messages are.

The problem I was worried about was where InitPostgres tries to take
the last lock in a deadlock cycle, and all the other participants have
already run their deadlock checks and are just waiting.  The
InitPostgres transaction needs to run the checker to get out of this
state, and it won't because it's got SIGALRM blocked.  I agree that this
is not very probable because InitPostgres takes only pretty weak locks,
but I suspect that a failure scenario could be demonstrated.  We have
seen occasional reports of apparent undetected-deadlock situations,
but of course I've got no proof that this is the cause.  On the whole,
I don't want to risk tinkering with it in the back branches, but I'm
happy that it will be fixed going forward.
        regards, tom lane


Re: WIP: remove use of flat auth file for client authentication

From
Tom Lane
Date:
Simon Riggs <simon@2ndQuadrant.com> writes:
> Specifically, should I remove the parts of the HS patch that refresh
> those files?

Yes.  This was the last part that I was afraid might have insurmountable
problems.  There are some bits yet to do but they're in the nature of
crank-turning, I believe.  I think it's a good bet the flat files will
be gone before the next commitfest starts.

BTW, one of the to-do bits is to find another way to initialize the
XID wrap limit at database start.  Right now that's set up as a
side-effect of scanning pg_database to create the flat DB file.
The idea I had about it was to store the current wrap limit in
checkpoint WAL records and use the most recent checkpoint's value
as the initial limit.  Can you tell me what you're doing about the
wrap limit in HS and whether that solution would be an issue for you?

>> I suspect that
>> this means some things are actively broken during InitPostgres's
>> initial
>> transaction --- for example, if it happens to try to take a lock that
>> completes a deadlock cycle, the deadlock won't be detected because the
>> lock timeout SIGALRM interrupt will never occur. Another example is
>> that SI inval messaging isn't working during InitPostgres either.

> Are we doing anything in the initial transaction that *could* deadlock,

Sure --- we have to take locks to examine system catalogs.  Pretty weak
locks mind you, but there's certainly a potential there if someone else
is doing something like a VACUUM FULL on the catalogs.

> or cause an SI inval message?

I don't think InitPostgres could *cause* a SI inval, I was more worried
about whether it would fail to respond to them.  The worst case result
is probably a failure of the InitPostgres transaction to notice someone
else's relfilenode reassignment on a system catalog, which would end
up with a "failed to open file" kind of error.  Again, the probabilities
aren't very high, but they appear nonzero.
        regards, tom lane


Re: WIP: remove use of flat auth file for client authentication

From
Simon Riggs
Date:
On Sat, 2009-08-29 at 09:00 -0400, Stephen Frost wrote:
> * Simon Riggs (simon@2ndQuadrant.com) wrote:
> > I get the feeling that part of the inspiration for this is that Hot
> > Standby must maintain this file. If not, I'm curious as to the reasons
> > for doing this. No objections however, just curiosity.
> 
> The impetus for these changes was the performance complaint that adding
> new users is extremely expensive once the files get to be large.
> Solving that kind of a problem typically involves moving away from flat
> files and to a database.  Thankfully, we happen to have a database
> already built-in. ;)

LOL. Quite lucky that. :)

-- Simon Riggs           www.2ndQuadrant.com



Re: WIP: remove use of flat auth file for client authentication

From
Simon Riggs
Date:
On Sat, 2009-08-29 at 11:19 -0400, Tom Lane wrote:
> Simon Riggs <simon@2ndQuadrant.com> writes:
> > Specifically, should I remove the parts of the HS patch that refresh
> > those files?
> 
> Yes.  This was the last part that I was afraid might have insurmountable
> problems.  There are some bits yet to do but they're in the nature of
> crank-turning, I believe.  I think it's a good bet the flat files will
> be gone before the next commitfest starts.

OK, its gone. Shame really, took a while to make that work. :)

> BTW, one of the to-do bits is to find another way to initialize the
> XID wrap limit at database start.  Right now that's set up as a
> side-effect of scanning pg_database to create the flat DB file.
> The idea I had about it was to store the current wrap limit in
> checkpoint WAL records and use the most recent checkpoint's value
> as the initial limit.  Can you tell me what you're doing about the
> wrap limit in HS and whether that solution would be an issue for you?

...quakes in boots...

We're weren't doing anything at all. I guess we will be now; thanks for
alerting me.

In general we don't look at the actual xid values themselves, so going
past the xid max value itself shouldn't present an issue (that I can
see, but I will think longer on this).

A problem could occur if a standby query/snapshot lived long enough to
see the same xid in different epochs. Realistically that's a long time
and a query would be unlikely to survive that long without having first
constrained the flow of cleaning records by holding a snapshot open on
the primary. However, it could present a problem to someone and agree we
must have a clear remedial action. 

The standard remedial action is to kill queries that cause conflicts, so
we would need to kill queries based upon a wrap test. We could do that
as each xid arrives, but it would be best to do that via something rare
like checkpoint records, so that works for me.

-- Simon Riggs           www.2ndQuadrant.com