Re: Buffer locking is special (hints, checksums, AIO writes) - Mailing list pgsql-hackers

From Matthias van de Meent
Subject Re: Buffer locking is special (hints, checksums, AIO writes)
Date
Msg-id CAEze2WjeK9CY003S4dmCugv_H4tz9AaXgnqW+wTc=BaPDg+2xg@mail.gmail.com
Whole thread Raw
In response to Re: Buffer locking is special (hints, checksums, AIO writes)  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
Hi,

On Tue, 7 Oct 2025 at 00:55, Andres Freund <andres@anarazel.de> wrote:
> On 2025-10-04 09:05:45 +0200, Matthias van de Meent wrote:
> > 0001 ensures that ReadRecentBuffer increments the usage counter, which
> > someone who uses an access strategy may want to prevent. I know this
> > isn't exactly new behaviour, but something I noticed anyway. Apart
> > from that observation, LGTM
>
> Are you proposing to change behaviour?

Eventually, yes, but not necessarily now or with this patchset.

> Right now ReadRecentBuffer doesn't even
> accept a strategy, so I don't really see this as something that needs to be
> tackled at this point.
>
> I'm not sure I see any real use cases for ReadRecentBuffer() that would
> benefit from a strategy, but I very well might just not be thinking wide
> enough.

I think it's rather strange that there is no Extended variant of
ReadRecentBuffer, like how there is a ReadBufferExtended for
ReadBuffer. Yes, ReadRecentBuffer has more arguments to fill and so
has a smaller difference versus ReadBufferExtended, but it's no
complete replacement for ReadBuffer[Ext] when you're aware of a recent
buffer of the page.

I'm not saying it will definitely happen, but I could see that e.g.
amcheck might want to keep track of buffer IDs of recent heap pages it
accessed to verify index's results, without holding a pin on all the
pages; and instead using ReadRecentBuffer[Extended] with a
BufferAccessStrategy to allow re-acquiring the buffer pin without
blowing out shared buffers or making parts of the pool take forever to
evict again.

> > 0003's UnlockBufHdrExt:
> > This is implemented with CAS, even when we only want to change bits we
> > know the state of (or could know, if we spent the effort).
> > Given its inline nature, wouldn't it be better to use atomic_sub
> > instructions? Or is this to handle cases where the bits we want to
> > (un)set might be (un)set by a concurrent process?
>
> Yes, it's to handle concurrent changes to the buffer state.
>
> > If the latter, could we specialize this to do a single atomic_sub
> > whenever we want to change state bits that we know can be only changed
> > whilst holding the spinlock?
>
> We probably could optimize some cases as an atomic-sub, some others as an
> atomic-and and others again as an atomic-or. The latter to however are
> implemented as a CAS on x86 anyway...
>
> After 0004 I don't think any of the paths using this are actually particularly
> hot, so I'm somewhat doubtful it's worth to try to optimize this too much. If
> there are hot paths, we really should try to work towards not even needing the
> buffer header spinlock, that has a bigger impact that improving the code for
> unlocking the buffer header...

Fair enough; I guess we'll see if further optimization would have much
impact once this all has been committed.

Kind regards,

Matthias van de Meent
Databricks



pgsql-hackers by date:

Previous
From: Nathan Bossart
Date:
Subject: Re: Add mode column to pg_stat_progress_vacuum
Next
From: Tom Lane
Date:
Subject: Re: Optimize LISTEN/NOTIFY