Re: Interruptible sleeps (was Re: CommitFest 2009-07: Yay, Kevin! Thanks, reviewers!) - Mailing list pgsql-hackers

On 03/09/10 19:38, Robert Haas wrote:
> On Fri, Sep 3, 2010 at 12:10 PM, Tom Lane<tgl@sss.pgh.pa.us>  wrote:
>> Robert Haas<robertmhaas@gmail.com>  writes:
>>> On Fri, Sep 3, 2010 at 10:07 AM, Tom Lane<tgl@sss.pgh.pa.us>  wrote:
>>>> [ shrug... ]  I stated before that the Hot Standby patch is doing
>>>> utterly unsafe things in signal handlers.  Simon rejected that.
>>>> I am waiting for irrefutable evidence to emerge from the field
>>>> (and am very confident that it will be forthcoming...) before
>>>> I argue with him further.  Meanwhile, I'm not going to accept anything
>>>> unsafe in a core facility like this patch is going to be.
>>
>>> Oh.  I thought you had ignored his objections and fixed it.  Why are
>>> we releasing 9.0 with this problem again?  Surely this is nuts.
>>
>> My original review of hot standby found about half a dozen things
>> I thought were broken:
>> http://archives.postgresql.org/pgsql-hackers/2010-05/msg00178.php
>> After a *very* long-drawn-out fight I fixed one of them
>> (max_standby_delay), largely still over Simon's objections.  I don't
>> have the energy to repeat that another half-dozen times, so I'm going
>> to wait for the suspected problems to be proven by field experience.
>
> Bummer.  Allow me to cast a vote for doing something about the fact
> that handle_standby_sig_alarm() thinks it can safely acquire an LWLock
> in a signal handler.  I think we should be making our decisions on
> what to change in the code based on what is technically sound, rather
> than based on how much the author complains about changing it.  Of
> course there may be cases where there is a legitimate difference of
> opinion concerning the best way forward, but I don't believe this is
> one of them.

Hmm, just to make the risk more concrete, here's one scenario that could 
happen:

1. Startup process tries to acquire cleanup lock on a page. It's pinned, 
so it has to wait, and calls ResolveRecoveryConflictWithBufferPin().
2. ResolveRecoveryConflictWithBufferPin enables the standby SIGALRM 
handler by calling enable_standby_sig_alarm(), and calls 
ProcWaitForSignal().

3. ProcWaitForSignal() calls semop() (assuming sysv semaphores here) to 
wait for the process semaphore

4. Max standby delay is reached and SIGALRM fired. CheckStandbyTimeout() 
is called in signal handler. CheckStandbyTimeout() calls 
SendRecoveryConflictWithBufferPin(), which calls CancelDBBackends()

5. CancelDBBackends() tries to acquire ProcArrayLock in exclusive mode. 
It's being held by another process, so we have to sleep

6. To sleep, LWLockAcquire calls PGSemaphoreLock, which calls semop() to 
wait on on the process semaphore.

So we now have the same process nested twice inside a semop() call. 
Looking at the Linux signal (7) man page, it is undefined what happens 
if semop() is re-entered in a signal handler while another semop() call 
is happening in main line of execution. Assuming it somehow works, which 
semop() call is going to return when the semaphore is incremented?

Maybe that's ok, if I'm reading the deadlock checker code correctly, it 
also calls semop() to increment the another process' semaphore, and the 
deadlock checker can be invoked from a signal handler while in semop() 
to wait on our process' semaphore. BTW, sem_post(), which is the Posix 
function for incrementing a semaphore, is listed as a safe function to 
call in a signal handler. But it's certainly fishy.

A safer approach would be to just PGSemaphoreUnlock() in the signal 
handler, and do all the other processing outside it. You'd still call 
semop() within semop(), but at least it would be closer to the semop() 
within semop() we already do in the deadlock checker. And there would be 
less randomness from timing and lock contention involved, making it 
easier to test the behavior on various platforms.

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com


pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: returning multiple result sets from a stored procedure
Next
From: Robert Haas
Date:
Subject: Re: Cost estimates for parameterized paths