Re: PrivateRefCount patch has got issues - Mailing list pgsql-hackers

From Tom Lane
Subject Re: PrivateRefCount patch has got issues
Date
Msg-id 1831.1419186116@sss.pgh.pa.us
Whole thread Raw
In response to Re: PrivateRefCount patch has got issues  (Andres Freund <andres@2ndquadrant.com>)
Responses Re: PrivateRefCount patch has got issues  (Andres Freund <andres@2ndquadrant.com>)
List pgsql-hackers
Andres Freund <andres@2ndquadrant.com> writes:
> On 2014-12-16 18:25:13 -0500, Tom Lane wrote:
>> I just happened to look into bufmgr.c for the first time in awhile, and
>> noticed the privaterefcount-is-no-longer-a-simple-array stuff.  It doesn't
>> look too well thought out to me.  In particular, PinBuffer_Locked calls
>> GetPrivateRefCountEntry while holding a buffer-header spinlock.  That
>> seems completely unacceptable.

> Argh, yes. That certainly isn't ok.

> The easiest way to fix that seems to be to declare that PinBuffer_Locked
> can only be used when we're guaranteed to not have pinned the
> buffer. That happens to be true for all the existing users. In fact all
> of them even seem to require the refcount to be zero across all
> backends.  That prerequisite then allows to increase the buffer header
> refcount before releasing the spinlock *and* before increasing the
> private refcount.

Hm, if you do it like that, what happens if we get a palloc failure while
trying to record the private refcount?  I think you must not bump the pin
count in shared memory unless you're certain you can record the fact that
you've done so.

The idea I'd been wondering about hinged on the same observation that we
know the buffer is not pinned (by our process) already, but the mechanics
would be closer to what we do in resource managers: reserve space first,
do the thing that needs to be remembered, bump the count using the
reserved space.  Given the way you've set this up, the idea boils down to
having a precheck call that forces there to be an empty slot in the local
fastpath array (by pushing something out to the hash table if necessary)
before we start trying to pin the buffer.  Then it's guaranteed that the
"record" step will succeed.  You could possibly even arrange it so that
it's known which array entry needs to be used and then the "record" part
is just a couple of inline instructions, so that it'd be OK to do that
while still holding the spinlock.  Otherwise it would still be a good idea
to do the "record" after releasing the spinlock IMO; but either way this
avoids the issue of possible state inconsistency due to a midflight
palloc failure.
        regards, tom lane



pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: PrivateRefCount patch has got issues
Next
From: Alvaro Herrera
Date:
Subject: Re: Add min and max execute statement time in pg_stat_statement