Re: SLRUs in the main buffer pool, redux - Mailing list pgsql-hackers

From: Robert Haas
Subject: Re: SLRUs in the main buffer pool, redux
Msg-id: CA+TgmoYhJ1hFZkySu-amSuJ6E4c03OVicNFH3fLKPqaTpx7huw@mail.gmail.com
In response to: SLRUs in the main buffer pool, redux (Thomas Munro <thomas.munro@gmail.com>)
List: pgsql-hackers
On Thu, Jan 13, 2022 at 9:00 AM Thomas Munro <thomas.munro@gmail.com> wrote:
> I was re-reviewing the proposed batch of GUCs for controlling the SLRU
> cache sizes[1], and I couldn't resist sketching out $SUBJECT as an
> obvious alternative.  This patch is highly experimental and full of
> unresolved bits and pieces (see below for some), but it passes basic
> tests and is enough to start trying the idea out and figuring out
> where the real problems lie.  The hypothesis here is that CLOG,
> multixact, etc data should compete for space with relation data in one
> unified buffer pool so you don't have to tune them, and they can
> benefit from the better common implementation (mapping, locking,
> replacement, bgwriter, checksums, etc and eventually new things like
> AIO, TDE, ...).

I endorse this hypothesis. The performance cliff when the XID range
you're regularly querying exceeds what the hardcoded number of CLOG
buffers can cover is quite steep, and yet we can't just keep pushing
that constant up, because the SLRU buffer lookup is a linear search,
and linear search does not scale well to infinitely large
arrays. [citation needed]
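
For anyone who hasn't stared at slru.c lately, the lookup that makes
cranking that constant up painful is shaped roughly like this - a
sketch from memory, with the struct trimmed down and the names
approximate, not the actual code:

    #include <stdbool.h>

    /* minimal stand-in for the relevant bits of SlruSharedData */
    typedef struct
    {
        int     num_slots;      /* the hardcoded buffer count */
        bool   *in_use;         /* simplification of page_status[] */
        int    *page_number;    /* which page each slot currently holds */
    } SlruSketch;

    /*
     * Every access scans the whole slot array, so each lookup gets
     * proportionally slower as num_slots grows.
     */
    static int
    slru_lookup_page(SlruSketch *shared, int pageno)
    {
        for (int slotno = 0; slotno < shared->num_slots; slotno++)
        {
            if (shared->in_use[slotno] &&
                shared->page_number[slotno] == pageno)
                return slotno;          /* hit */
        }
        return -1;                      /* miss: caller must evict and read */
    }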

> [ long list of dumpster-fire level problems with the patch ]

Honestly, none of this sounds that bad. I mean, it sounds bad in the
sense that you're going to have to fix all of this somehow and I'm
going to unhelpfully give you no advice whatsoever about how to do
that, but my guess is that a moderate amount of persistence will be
sufficient for you to get the job done. None of it sounds hopeless.

Before fixing all of that, one thing you might want to consider is
whether it, uh, works. And by "work" I don't mean "get the right
answer" even though I agree with my esteemed fellow hacker that this
is an important thing to do.[1] What I mean is that it would be good
to see some evidence that the number of buffers that end up being used
to cache any particular SLRU is somewhat sensible, and varies by
workload. For example, consider a pgbench workload. As you increase
the scale factor, the age of the oldest XIDs that you regularly
encounter will also increase, because on the average, the row you're
now updating will not have been updated for a larger number of
transactions. So it would be interesting to know whether all of the
CLOG buffers that are regularly being accessed do in fact remain in
cache - and maybe even whether buffers that stop being regularly
accessed get evicted in the face of cache pressure.
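
To put rough numbers on how big that hot CLOG set might be - strictly
back-of-the-envelope, not measured from the patch, assuming one XID
per pgbench transaction, a uniformly random account pick, the usual
32,768 xacts per CLOG page, and ignoring the extent to which hint
bits spare us CLOG lookups in practice:

    #include <stdio.h>

    int
    main(void)
    {
        /* BLCKSZ * CLOG_XACTS_PER_BYTE = 8192 * 4 */
        const long long xacts_per_clog_page = 32768;

        for (long long scale = 100; scale <= 100000; scale *= 10)
        {
            /*
             * pgbench creates 100,000 account rows per scale unit, and the
             * row you're updating was last touched ~naccounts transactions
             * ago on average, so that's roughly the span of XIDs whose
             * CLOG pages stay warm.
             */
            long long   naccounts = 100000LL * scale;
            long long   warm_pages = naccounts / xacts_per_clog_page + 1;

            printf("scale %6lld: ~%lld warm CLOG pages\n", scale, warm_pages);
        }
        return 0;
    }

Even at scale 100 that's already a few hundred pages - more than the
current cap of 128 CLOG buffers, if I'm remembering clog.c correctly -
and it only grows from there, so if the unified pool really does keep
the hot set resident, a test like this ought to show it.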

Also, because - at least in a pgbench test like the one postulated
above, and probably in general - the frequency of access to older CLOG
buffers decays exponentially, evicting the newest or even the second
or third newest CLOG buffer is really bad. The existing clog code
therefore contains a hard-coded guard that absolutely prevents the
newest buffer from being evicted, which is a band-aid, but an
effective one. Even with that band-aid, evicting any of the most
recent few can produce a system-wide stall, where every backend ends
up waiting for the evicted buffer to be retrieved. It would be
interesting to know whether the same problem can be recreated with
your patch, because the buffer eviction algorithm for shared buffers
is only a little bit less dumb than the one for SLRUs, and can pretty
commonly devolve into little more than evict-at-random.
Evict-at-random is very bad here, because evicting a hot CLOG page is
probably even worse than evicting, say, a btree root page.
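
For reference, the band-aid is shaped more or less like this -
paraphrased from memory rather than quoted, with a cut-down struct
standing in for the real SlruSharedData:

    /* minimal stand-in for the relevant bits of SlruSharedData */
    typedef struct
    {
        int     num_slots;
        int     latest_page_number;     /* most recently created page */
        int    *page_number;            /* page held by each slot */
        int    *ticks_since_use;        /* stand-in for the LRU counters */
    } SlruVictimSketch;

    static int
    select_victim_slot(SlruVictimSketch *shared)
    {
        int     best = -1;

        for (int slotno = 0; slotno < shared->num_slots; slotno++)
        {
            /* the band-aid: the newest page is never an eviction candidate */
            if (shared->page_number[slotno] == shared->latest_page_number)
                continue;

            /* otherwise take the least recently used of the rest */
            if (best < 0 ||
                shared->ticks_since_use[slotno] > shared->ticks_since_use[best])
                best = slotno;
        }
        return best;
    }

Shared buffers has no equivalent of that "continue", so the question
is whether the clock sweep's usage counts give the newest CLOG page
the same kind of protection under pressure, or whether something like
it has to be reinvented for the unified pool.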

Another interesting test might be one that puts pressure on some other
SLRU, like pg_multixact or pg_subtrans. In general SLRU pages that are
actually being used are hot enough that we should keep them in cache
almost no matter what else is competing for cache space ... but the
number of such pages can be different from one SLRU to another, and
can change over time.
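
If it's useful, something as dumb as the following is enough to make
pg_subtrans work for a living. It's a toy, not from the patch: it
assumes a pgbench-initialized database at scale 1 or more and plain
libpq, and the loop counts are arbitrary. Once a transaction
overflows the per-backend subxid cache (64 entries, IIRC), every
visibility check on its rows from other sessions has to go through
pg_subtrans.

    #include <stdio.h>
    #include <libpq-fe.h>

    int
    main(void)
    {
        PGconn     *conn = PQconnectdb("");  /* uses the usual PG* env vars */

        if (PQstatus(conn) != CONNECTION_OK)
        {
            fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
            return 1;
        }

        for (int xact = 0; xact < 500; xact++)
        {
            char    sql[128];

            PQclear(PQexec(conn, "BEGIN"));
            for (int sp = 0; sp < 200; sp++)
            {
                /* each savepoint that writes gets its own subxid */
                PQclear(PQexec(conn, "SAVEPOINT s"));

                /* aid stays within the 100,000 rows of a scale-1 database */
                snprintf(sql, sizeof(sql),
                         "UPDATE pgbench_accounts SET abalance = abalance + 1 "
                         "WHERE aid = %d", xact * 200 + sp + 1);
                PQclear(PQexec(conn, sql));
            }
            PQclear(PQexec(conn, "COMMIT"));
        }

        PQfinish(conn);
        return 0;
    }

Run a read-only pgbench against the same tables at the same time so
that other backends actually have to do the subtrans lookups; a
two-session SELECT ... FOR SHARE arrangement on the same rows would do
the analogous thing for pg_multixact.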

-- 
Robert Haas
EDB: http://www.enterprisedb.com

[1] http://postgr.es/m/3151122.1642086632@sss.pgh.pa.us


