Greg Stark <gsstark@mit.edu> writes:
> On Sat, Aug 29, 2009 at 6:00 AM, Tom Lane<tgl@sss.pgh.pa.us> wrote:
>> ... I didn't yet do anything
>> about the idea of falling back to connecting to "postgres" when the
>> specified target DB doesn't exist, but other than that small change
>> I think it's about ready to go.
> Falling back to connecting to "postgres" seems unnecessarily complex to me.
If that's the consensus I don't have a problem with it ... it's
certainly something we could add later if anyone complains.
>> Another interesting point is that for this to work, those signal
>> interrupts have to actually be enabled (doh) ... and up to now we have
>> been running InitPostgres with most signals disabled. I suspect that
>> this means some things are actively broken during InitPostgres's initial
>> transaction --- for example, if it happens to try to take a lock that
>> completes a deadlock cycle, the deadlock won't be detected because the
>> lock timeout SIGALRM interrupt will never occur. Another example is
>> that SI inval messaging isn't working during InitPostgres either.
>> The patch addresses this by moving up PostgresMain's
>> PG_SETMASK(&UnBlockSig); call to before InitPostgres. We might need to
>> back-patch that bit, though I'm hesitant to fool with such a thing in
>> back branches.
> The deadlock can only fail to be detected by someone else if the whole
> InitPostgres thing takes longer than deadlock_timeout, I think. So it
> doesn't seem very likely. Not sure how likely problems due to missed
> SI messages are.
The problem I was worried about was where InitPostgres tries to take
the last lock in a deadlock cycle, and all the other participants have
already run their deadlock checks and are just waiting. The
InitPostgres transaction needs to run the checker to get out of this
state, and it won't because it's got SIGALRM blocked. I agree that this
is not very probable because InitPostgres takes only pretty weak locks,
but I suspect that a failure scenario could be demonstrated. We have
seen occasional reports of apparent undetected-deadlock situations,
but of course I've got no proof that this is the cause. On the whole,
I don't want to risk tinkering with it in the back branches, but I'm
happy that it will be fixed going forward.
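
For illustration only, the effect of a blocked SIGALRM can be sketched
outside the backend. This is a minimal Python sketch for POSIX systems,
not PostgreSQL code: the function name run_demo, the 50 ms timeout, and
the handler standing in for the deadlock checker are all assumptions
made for the example. It shows the timer expiring while the signal is
blocked (so the "checker" never runs), and the handler running only once
the signal is unblocked, which is roughly what moving the
PG_SETMASK(&UnBlockSig) call ahead of InitPostgres is meant to achieve.

```python
import signal
import time


def run_demo(timeout=0.05):
    """Return (ran_while_blocked, was_pending, ran_after_unblock).

    Hypothetical stand-in: the SIGALRM handler plays the role of the
    deadlock checker that the lock-timeout interrupt would trigger.
    Must be called from the main thread on a POSIX system.
    """
    fired = []

    def on_alarm(signum, frame):
        # Stand-in for "run the deadlock check".
        fired.append(signum)

    old_handler = signal.signal(signal.SIGALRM, on_alarm)
    # Block SIGALRM, mimicking a code path that runs with the signal masked.
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGALRM})
    try:
        signal.setitimer(signal.ITIMER_REAL, timeout)
        time.sleep(timeout * 4)  # the timer expires during this sleep

        ran_while_blocked = bool(fired)
        was_pending = signal.SIGALRM in signal.sigpending()

        # Unblocking delivers the pending signal, so the handler now runs.
        signal.pthread_sigmask(signal.SIG_UNBLOCK, {signal.SIGALRM})
        ran_after_unblock = bool(fired)
    finally:
        # Restore the timer, signal mask, and handler to their prior state.
        signal.setitimer(signal.ITIMER_REAL, 0)
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)
        signal.signal(
            signal.SIGALRM,
            old_handler if old_handler is not None else signal.SIG_DFL,
        )
    return ran_while_blocked, was_pending, ran_after_unblock


if __name__ == "__main__":
    print(run_demo())
```

In the sketch the process simply sleeps while the signal is blocked; in
the scenario described above the backend would instead be waiting on a
lock indefinitely, so the pending interrupt would never get a chance to
be delivered at all.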
regards, tom lane