Re: "multiple backends attempting to wait for pincount 1" - Mailing list pgsql-hackers
From | Andres Freund |
---|---|
Subject | Re: "multiple backends attempting to wait for pincount 1" |
Date | |
Msg-id | 20150214175603.GB15326@awork2.anarazel.de Whole thread Raw |
In response to | Re: "multiple backends attempting to wait for pincount 1" (Kevin Grittner <kgrittn@ymail.com>) |
Responses |
Re: "multiple backends attempting to wait for pincount 1"
|
List | pgsql-hackers |
On 2015-02-14 17:25:00 +0000, Kevin Grittner wrote: > Andres Freund <andres@2ndquadrant.com> wrote: > > Imagine what happens in LockBufferForCleanup() when > > ProcWaitForSignal() returns spuriously - something it's > > documented to possibly do (and which got more likely with the new > > patches). In the normal case UnpinBuffer() will have unset > > BM_PIN_COUNT_WAITER - but in a spurious return it'll still be set > > and LockBufferForCleanup() will see it still set. > > That analysis makes sense to me. > > > I think we should simply move the > > buf->flags &= ~BM_PIN_COUNT_WAITER (Inside LockBuffer) > > I think you meant inside UnpinBuffer? No, LockBufferHdr. What I meant was that the pincount can only be manipulated while the buffer header spinlock is held. > > to LockBufferForCleanup, besides the PinCountWaitBuf = NULL. > > Afaics, that should do the trick. > > I tried that on the master branch (33e879c) (attached) and it > passes `make check-world` with no problems. I'm reviewing the > places that BM_PIN_COUNT_WAITER appears, to see if I can spot any > flaw in this. Does anyone else see a problem with it? Even though > it appears to be a long-standing bug, there don't appear to have > been any field reports, so it doesn't seem like something to > back-patch. I was wondering about that as well. But I don't think I agree. The most likely scenario for this to fail is in full table vacuums that have to freeze rows - those are primarily triggered by autovacuum. I don't think it's likely that such a error message would be discovered in the logs unless it happens very regularly. > -- > Kevin Grittner > EDB: http://www.enterprisedb.com > The Enterprise PostgreSQL Company > diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c > index e1e6240..40b2194 100644 > --- a/src/backend/storage/buffer/bufmgr.c > +++ b/src/backend/storage/buffer/bufmgr.c > @@ -1548,7 +1548,6 @@ UnpinBuffer(volatile BufferDesc *buf, bool fixOwner) > /* we just released the last pin other than the waiter's */ > int wait_backend_pid = buf->wait_backend_pid; > > - buf->flags &= ~BM_PIN_COUNT_WAITER; > UnlockBufHdr(buf); > ProcSendSignal(wait_backend_pid); > } > @@ -3273,6 +3272,7 @@ LockBufferForCleanup(Buffer buffer) > else > ProcWaitForSignal(); > > + bufHdr->flags &= ~BM_PIN_COUNT_WAITER; > PinCountWaitBuf = NULL; > /* Loop back and try again */ > } You can't manipulate flags without holding the spinlock. Otherwise you (or the other writer) can easily cancel the other sides effects. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
pgsql-hackers by date: