Re: ENOSPC FailedAssertion("!(RefCountErrors == 0)" - Mailing list pgsql-hackers

From	Tom Lane
Subject	Re: ENOSPC FailedAssertion("!(RefCountErrors == 0)"
Date	July 16, 2018 20:39:26
Msg-id	20436.1531751966@sss.pgh.pa.us
In response to	Re: ENOSPC FailedAssertion("!(RefCountErrors == 0)" (Andres Freund <andres@anarazel.de>)
Responses	Re: ENOSPC FailedAssertion("!(RefCountErrors == 0)"
List	pgsql-hackers

Andres Freund <andres@anarazel.de> writes:
> On 2018-07-15 18:48:43 -0400, Tom Lane wrote:
>> So basically, WAL replay hits an error while holding a buffer pin, and
>> nothing is done to release the buffer pin, but AtProcExit_Buffers thinks
>> something should have been done.

> I think there's a few other cases where we hit this. I've seen something
> similar from inside checkpointer / BufferSync(). I'd be surprised if
> bgwriter couldn't be triggered into the same.

Hm, yeah, on reflection it's pretty obvious that those are hazard cases.

> I'm pretty sure that we do *not* force a panic on all nonzero-exit-code
> cases for other subprocesses.

That's my recollection as well -- mostly, we just start a new one.

So I said I didn't want to do extra work on this, but I am looking into
fixing it by having these aux process types run a ResourceOwner that can
be told to clean up any open buffer pins at exit.  We could be sure the
coverage is complete by dint of removing the special-case code in
resowner.c that allows buffer pins to be taken with no active resowner.
Then CheckForBufferLeaks can be left as-is, i.e., something we do only
in assert builds.
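
To make the idea concrete, here is a minimal toy sketch in C, not the
actual resowner.c or bufmgr.c code. Every name in it (ToyOwner,
current_owner, toy_pin_buffer, and so on) is hypothetical and exists only
for illustration. It assumes a fixed-size pin list and models two points
from the proposal: a pin may only be taken while an owner is active, and
the owner can be told to drop whatever pins are still open at exit.

```c
#include <assert.h>
#include <stddef.h>

#define NBUFFERS 16   /* toy buffer pool size (assumption) */
#define MAX_PINS 8    /* toy per-owner pin limit (assumption) */

/* Hypothetical stand-in for a resource owner: remembers pinned buffers. */
typedef struct ToyOwner
{
    int pinned[MAX_PINS];
    int npinned;
} ToyOwner;

ToyOwner *current_owner = NULL;  /* owner that new pins get charged to */
int pin_counts[NBUFFERS];        /* per-buffer pin counts, zero-initialized */

/* Take a pin. There is no "no active owner" special case: one must exist. */
void toy_pin_buffer(int buf)
{
    assert(current_owner != NULL);
    assert(buf >= 0 && buf < NBUFFERS);
    assert(current_owner->npinned < MAX_PINS);
    current_owner->pinned[current_owner->npinned++] = buf;
    pin_counts[buf]++;
}

/* Drop one pin that the current owner holds on buf. */
void toy_unpin_buffer(int buf)
{
    ToyOwner *o = current_owner;

    assert(o != NULL);
    for (int i = o->npinned - 1; i >= 0; i--)
    {
        if (o->pinned[i] == buf)
        {
            /* Overwrite the released slot with the last entry. */
            o->pinned[i] = o->pinned[--o->npinned];
            pin_counts[buf]--;
            return;
        }
    }
    assert(!"buffer is not pinned by the current owner");
}

/* Exit-time cleanup: release every pin the owner still remembers. */
void toy_release_all(ToyOwner *o)
{
    while (o->npinned > 0)
        pin_counts[o->pinned[--o->npinned]]--;
}

/* Leak check: total pins still outstanding across all buffers. */
int toy_leaked_pins(void)
{
    int total = 0;

    for (int i = 0; i < NBUFFERS; i++)
        total += pin_counts[i];
    return total;
}
```

In this toy model, an aux process would set current_owner once at startup
and call toy_release_all() from its exit path, so that an error raised
while a pin is held no longer trips the leak check. Because
toy_pin_buffer() refuses to run without an owner, no pin can escape the
owner's bookkeeping, which is the sense in which coverage is complete.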

            regards, tom lane
