Re: "multiple backends attempting to wait for pincount 1" - Mailing list pgsql-hackers

From Kevin Grittner
Subject Re: "multiple backends attempting to wait for pincount 1"
Date
Msg-id 563622451.3169954.1423934700541.JavaMail.yahoo@mail.yahoo.com
In response to Re: "multiple backends attempting to wait for pincount 1"  (Andres Freund <andres@2ndquadrant.com>)
Responses Re: "multiple backends attempting to wait for pincount 1"
List pgsql-hackers

Andres Freund <andres@2ndquadrant.com> wrote:

> I don't think it's actually 675333 at fault here. I think it's a
> long standing bug in LockBufferForCleanup() that can just much
> easier be hit with the new interrupt code.

The patches I'll be posting soon make it even easier to hit, which
is why I was trying to sort this out when Tom noticed the buildfarm
issues.

> Imagine what happens in LockBufferForCleanup() when
> ProcWaitForSignal() returns spuriously - something it's
> documented to possibly do (and which got more likely with the new
> patches). In the normal case UnpinBuffer() will have unset
> BM_PIN_COUNT_WAITER - but in a spurious return it'll still be set
> and LockBufferForCleanup() will see it still set.

That analysis makes sense to me.

> I think we should simply move the
>   buf->flags &= ~BM_PIN_COUNT_WAITER (Inside LockBuffer)

I think you meant inside UnpinBuffer?

> to LockBufferForCleanup, besides the PinCountWaitBuf = NULL.
> Afaics, that should do the trick.

I tried that on the master branch (33e879c; patch attached) and it
passes `make check-world` with no problems.  I'm reviewing the
places that BM_PIN_COUNT_WAITER appears, to see if I can spot any
flaw in this.  Does anyone else see a problem with it?  Even though
it appears to be a long-standing bug, there don't appear to have
been any field reports, so it doesn't seem like something to
back-patch.
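
To make the failure mode concrete, here is a minimal sketch in C of the
wait loop as described above.  It is a toy model under stated
assumptions, not the PostgreSQL source: ToyBuffer, toy_unpin() and
toy_lock_for_cleanup() are hypothetical names, the flag value is
arbitrary, and a "wait" is simulated by either doing nothing (a
spurious return from ProcWaitForSignal()) or releasing the other pin.
The clear_in_waiter switch models one reading of the proposal: the
waiter clears BM_PIN_COUNT_WAITER itself after waking, instead of
relying on the unpinning backend to do so.

```c
#include <stdbool.h>

/* Arbitrary stand-in for the real flag bit (assumption). */
#define BM_PIN_COUNT_WAITER (1u << 0)

/* Toy stand-in for a buffer header; not the real BufferDesc. */
typedef struct ToyBuffer
{
    unsigned    flags;
    int         refcount;   /* number of pins, including the waiter's own */
} ToyBuffer;

/*
 * Models another backend dropping its pin.  When clear_here is true the
 * unpinning side clears the waiter flag once only the waiter's pin is
 * left, which is the pre-patch behavior described in the thread.
 */
void
toy_unpin(ToyBuffer *buf, bool clear_here)
{
    buf->refcount--;
    if (clear_here &&
        (buf->flags & BM_PIN_COUNT_WAITER) &&
        buf->refcount == 1)
        buf->flags &= ~BM_PIN_COUNT_WAITER;
}

/*
 * Models the cleanup-lock wait loop.  Returns true when the caller ends
 * up holding the only pin, false where the real code would report
 * "multiple backends attempting to wait for pincount 1".
 *
 * spurious_wakeups: how many waits return without any pin being released.
 * clear_in_waiter:  true models the proposed fix, false the old behavior.
 */
bool
toy_lock_for_cleanup(ToyBuffer *buf, int spurious_wakeups,
                     bool clear_in_waiter)
{
    for (;;)
    {
        if (buf->refcount == 1)
            return true;        /* only our own pin remains */

        if (buf->flags & BM_PIN_COUNT_WAITER)
            return false;       /* flag left over from our own earlier wait */

        buf->flags |= BM_PIN_COUNT_WAITER;

        /* Simulated wait: either nothing happened, or the other pin went away. */
        if (spurious_wakeups > 0)
            spurious_wakeups--;
        else
            toy_unpin(buf, !clear_in_waiter);

        /* Proposed fix: the waiter always resets its own flag after waking. */
        if (clear_in_waiter)
            buf->flags &= ~BM_PIN_COUNT_WAITER;
    }
}
```

Starting from a buffer with two pins and no flags set, a call with one
spurious wakeup and clear_in_waiter = false returns false, because the
second pass through the loop finds the flag that the first pass set.
The same call with clear_in_waiter = true returns true, as does the
old behavior when no spurious wakeup occurs.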

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
