On 2015-02-17 13:14:00 -0500, Tom Lane wrote:
> Hm, good point. On the other hand, should we worry about the possibility
> of a lost signal? Moving the flag-clearing would guard against that,
> which the current code does not. But we've not seen field reports of such
> issues AFAIR, so this might not be an important consideration.
I think if there were lost signals there'd be much bigger problems given
the same (or in master) similar mechanics are used for a lot of other
things including heavyweight and lightweight locks wait queues.
> > ...
> > /*
> > * Make sure waiter flag is reset - it might not be if
> > * ProcWaitForSignal() returned for another reason than UnpinBuffer()
> > * signalling us.
> > */
> > LockBufHdr(bufHdr);
> > buf->flags &= ~BM_PIN_COUNT_WAITER;
> > Assert(bufHdr->wait_backend_pid == MyProcPid);
> > UnlockBufHdr(bufHdr);
>
> > PinCountWaitBuf = NULL;
> > /* Loop back and try again */
> > }
>
> > to the bottom of the loop would suffice.
>
> No, I disagree. If we maintain the rule that the signaler clears
> BM_PIN_COUNT_WAITER, then once that happens there is nothing to stop a
> third party from trying to LockBufferForCleanup on the same buffer (except
> for table-level locking conventions, which IMO this mechanism shouldn't be
> dependent on). So this coding would potentially clear the
> BM_PIN_COUNT_WAITER flag belonging to that third party, and then fail the
> Assert --- but only in debug builds, not in production, where it would
> just silently lock up the third-party waiter. So I think having a test to
> verify that it's still "our" BM_PIN_COUNT_WAITER flag is essential.
Pushed with a test guarding against that. I still think it might be
slightly better to error out if somebody else waits, but I guess it's
unlikely that we'd mistakenly add code doing that.
Greetings,
Andres Freund
-- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services