On Wed, 2010-01-20 at 06:14 +0100, Andres Freund wrote:
> >
> > Full resolution patch attached for Startup process waits on buffer pins.
> >
> > Startup process sets SIGALRM when waiting on a buffer pin. If woken by
> > alarm we send SIGUSR1 to all backends requesting that they check to see
> > if they are blocking Startup process. If so, they throw ERROR/FATAL as
> > for other conflict resolutions. Deadlock stop gap removed.
> > max_standby_delay = -1 option removed to prevent deadlock.
> Wouldn't it be more foolproof to also loop around sending the FATAL? Not that
> it's likely but...
More foolproof and much less accurate. The Startup process doesn't know
who is holding the buffer pin that blocks it, so it could not target a
FATAL.
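
To illustrate why the check has to happen in the backends, here is a minimal
sketch in plain C. It is not the patch code: the Backend struct and the
function names are hypothetical, and the loop merely stands in for the SIGUSR1
broadcast. The point is that the sender only publishes which buffer it is
waiting on; each backend compares that against its own pins.

```c
#include <stdbool.h>
#include <stddef.h>

#define NO_BUFFER (-1)

/* Hypothetical per-backend state; not PostgreSQL's actual PGPROC layout. */
typedef struct Backend
{
    int  pinned_buf;       /* buffer id this backend has pinned, or NO_BUFFER */
    bool conflict_raised;  /* set once the backend decides it must cancel */
} Backend;

/*
 * Run by each backend when it receives the broadcast. Only the backend
 * itself knows which buffer it has pinned, so it does the comparison.
 */
static bool
backend_check_conflict(Backend *b, int startup_wait_buf)
{
    bool blocking = (startup_wait_buf != NO_BUFFER &&
                     b->pinned_buf == startup_wait_buf);

    if (blocking)
        b->conflict_raised = true;
    return blocking;
}

/*
 * Stand-in for signalling every backend: the sender does not pick a target,
 * it asks all of them. Returns how many backends found themselves blocking.
 */
static size_t
broadcast_conflict_check(Backend *backends, size_t n, int startup_wait_buf)
{
    size_t blocking = 0;

    for (size_t i = 0; i < n; i++)
        if (backend_check_conflict(&backends[i], startup_wait_buf))
            blocking++;
    return blocking;
}
```

With three backends pinning buffers 7, none, and 42, a broadcast for buffer 42
flags only the third one; the other two carry on untouched.
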
> From HoldingBufferPinThatDelaysRecovery you're calling
> GetStartupBufferPinWaitBufId - that sounds a bit dangerous because that one is
> acquiring a spinlock which can also get taken at other places. It's not the
> most likely scenario, but it would certainly be annoying to debug.
Spinlock. It isn't held for long in any situation. What problem do you
foresee?
> Is there any supported platform with sizeof(sig_atomic_t) < 4 - I would doubt
> so? If not the locking in GetStartupBufferPinWaitBufId and
> SetStartupBufferPinWaitBufId shouldn't be needed?
I prefer spinlocking.
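
For reference, the pattern under discussion looks roughly like the sketch
below. It is a standalone approximation, not the patch itself: a C11
atomic_flag stands in for the spinlock, and the struct and function names are
hypothetical. The lock is held only for a single load or store, which is why
the hold time stays short.

```c
#include <stdatomic.h>

/* Hypothetical shared state; the real patch keeps this in shared memory. */
typedef struct StartupWaitState
{
    atomic_flag lock;         /* stand-in for a spinlock */
    int         wait_buf_id;  /* buffer the Startup process waits on, -1 if none */
} StartupWaitState;

static StartupWaitState wait_state = { ATOMIC_FLAG_INIT, -1 };

/* Busy-wait until the flag is ours. */
static void
spin_acquire(atomic_flag *l)
{
    while (atomic_flag_test_and_set_explicit(l, memory_order_acquire))
        ;  /* spin */
}

static void
spin_release(atomic_flag *l)
{
    atomic_flag_clear_explicit(l, memory_order_release);
}

/* Publish the buffer id being waited on; the lock covers one store. */
static void
set_wait_buf_id(int bufid)
{
    spin_acquire(&wait_state.lock);
    wait_state.wait_buf_id = bufid;
    spin_release(&wait_state.lock);
}

/* Read the buffer id being waited on; the lock covers one load. */
static int
get_wait_buf_id(void)
{
    int bufid;

    spin_acquire(&wait_state.lock);
    bufid = wait_state.wait_buf_id;
    spin_release(&wait_state.lock);
    return bufid;
}
```

Dropping the lock would rely on the platform reading and writing the value in
one piece; keeping it makes that guarantee explicit at the cost of a very
short critical section.
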
> Same issue (and more likely to trigger) exists with
> CheckStandbyTimeout->SendRecoveryConflictWithBufferPin->CancelDBBackends
I don't see an issue.
-- Simon Riggs www.2ndQuadrant.com