Re: Buffer locking is special (hints, checksums, AIO writes) - Mailing list pgsql-hackers

From Matthias van de Meent
Subject Re: Buffer locking is special (hints, checksums, AIO writes)
Msg-id CAEze2WgGe8vjj3jiWqUugWuwLJ9cLryaGrnASjm-yJ=tEALX2A@mail.gmail.com
In response to Re: Buffer locking is special (hints, checksums, AIO writes)  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
On Tue, 23 Sept 2025 at 00:14, Andres Freund <andres@anarazel.de> wrote:
> On 2025-09-15 19:05:37 -0400, Andres Freund wrote:
> > Here are the first few cleaned up patches implementing the above steps, as
> > well as some cleanups.  I included a commit from another thread, as it
> > conflicts with these changes, and we really should apply it - and it's
> > arguably required to make the changes viable, as it removes one more use of
> > PinBuffer_Locked().
> >
> > Another change included is to not return the buffer with the spinlock held
> > from StrategyGetBuffer(), and instead pin the buffer in freelist.c. The reason
> > for that is to reduce the most common PinBuffer_Locked() call. By definition
> > PinBuffer_Locked() will become a bit slower due to 0003. But even without 0003,
> > 0002 is faster than master. And the previous approach also just seems
> > pretty unclean.   I don't love that it requires the new TrackNewBufferPin(),
> > but I don't really have a better idea.
> >
> > I invite particular attention to the commit message for 0003 as well as the
> > comment changes in buf_internals.h within.
>
> Robert looked at the patches while we were chatting, and I addressed his
> feedback in this new version.

I like these changes, and have some minor comments:

0001 ensures that ReadRecentBuffer increments the usage counter, which
someone who uses an access strategy may want to prevent. I know this
isn't exactly new behaviour, but it's something I noticed anyway. Apart
from that observation, LGTM.

0002 has a FIXME in a comment in GetVictimBuffer. Assuming it's about
the comment itself needing updates, how about:

+     * Ensure, before we pin a victim buffer, that there's a free refcount
+     * entry, and a resource owner slot for the pin.

Again, LGTM.

0003's UnlockBufHdrExt:
This is implemented with CAS, even when we only want to change bits we
know the state of (or could know, if we spent the effort).
Given its inline nature, wouldn't it be better to use atomic_sub
instructions? Or is this to handle cases where the bits we want to
(un)set might be (un)set by a concurrent process?
If the latter, could we specialize this to do a single atomic_sub
whenever we want to change state bits that we know can be only changed
whilst holding the spinlock?
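
To make the two options concrete, here is a minimal sketch using C11
<stdatomic.h> rather than PostgreSQL's pg_atomic_* API. All names and bit
positions (demo_unlock_cas, demo_unlock_sub, DEMO_LOCKED, DEMO_DIRTY) are
hypothetical and are not taken from the patch. It assumes the low bits of
the state word hold a refcount that other backends may change concurrently,
and that the flag bits are only changed while the header lock bit is held.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical flag layout; NOT PostgreSQL's actual bit assignments. */
#define DEMO_LOCKED (UINT32_C(1) << 22)
#define DEMO_DIRTY  (UINT32_C(1) << 23)

/*
 * CAS variant: correct whatever the current value of the bits being
 * changed, at the cost of a retry loop when the word changes underneath
 * us (for example, a concurrent refcount update). Returns the old state.
 */
static inline uint32_t
demo_unlock_cas(_Atomic uint32_t *state, uint32_t set_bits,
                uint32_t unset_bits)
{
    uint32_t old_state = atomic_load(state);
    uint32_t new_state;

    do
    {
        new_state = (old_state | set_bits) & ~(unset_bits | DEMO_LOCKED);
    } while (!atomic_compare_exchange_weak(state, &old_state, new_state));

    return old_state;
}

/*
 * Subtraction variant: a single atomic read-modify-write, no loop.
 * Only valid when every bit in clear_bits (and the lock bit) is known
 * to be set right now and cannot be cleared by anyone else; then the
 * subtraction never borrows and all other bits are left untouched.
 * Returns the old state.
 */
static inline uint32_t
demo_unlock_sub(_Atomic uint32_t *state, uint32_t clear_bits)
{
    return atomic_fetch_sub(state, clear_bits | DEMO_LOCKED);
}
```

In this sketch the subtraction variant stays correct under concurrent
refcount additions because those touch disjoint bits and addition commutes.
It cannot set bits, and it silently corrupts the word if a bit it subtracts
was not actually set, which is the trade-off the question above is about.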

0004: LGTM

0005: LGTM

0006: LGTM

Kind regards,

Matthias van de Meent


