Re: SLRU optimization - configurable buffer pool and partitioning the SLRU lock - Mailing list pgsql-hackers

From Alvaro Herrera
Subject Re: SLRU optimization - configurable buffer pool and partitioning the SLRU lock
Date
Msg-id 202401261738.zf2evzaborgt@alvherre.pgsql
Whole thread Raw
In response to Re: SLRU optimization - configurable buffer pool and partitioning the SLRU lock  (Dilip Kumar <dilipbalaut@gmail.com>)
Responses Re: SLRU optimization - configurable buffer pool and partitioning the SLRU lock
Re: SLRU optimization - configurable buffer pool and partitioning the SLRU lock
List pgsql-hackers
I've continued to review this and decided that I don't like the mess
this patch proposes in order to support pg_commit_ts's deletion of all
files.  (Yes, I know that I was the one that proposed this idea. It's
still a bad one).  I'd like to change that code by removing the limit
that we can only have 128 bank locks in a SLRU.  To recap, the reason we
did this is that commit_ts sometimes wants to delete all files while
running (DeactivateCommitTs), and for this it needs to acquire all bank
locks.  Since going above the MAX_SIMUL_LWLOCKS limit is disallowed, we
added the SLRU limit making multiple banks share lwlocks.

I propose two alternative solutions:

1. The easiest is to have DeactivateCommitTs continue to hold
CommitTsLock until the end, including while SlruScanDirectory does its
thing.  This sounds terrible, but considering that this code only runs
when the module is being deactivated, I don't think it's really all that
bad in practice.  I mean, if you deactivate the commit_ts module and
then try to read commit timestamp data, you deserve to wait for a few
seconds just as a punishment for your stupidity.  AFAICT the cases where
anything is done in the module mostly check without locking that
commitTsActive is set, so we're not slowing down any critical
operations.  Also, if you don't like to be delayed for a couple of
seconds, just don't deactivate the module.

2. If we want some refinement, the other idea is to change
SlruScanDirCbDeleteAll (the callback that SlruScanDirectory uses in this
case) so that it acquires the bank lock appropriate for all the slots
used by the file that's going to be deleted.  This is OK because in the
default compilation each file only has 32 segments, so that requires
only 32 lwlocks held at once while the file is being deleted.  I think
we don't need to bother with this, but it's an option if we see the
above as unworkable for whatever reason.


The only other user of SlruScanDirCbDeleteAll is async.c (the LISTEN/
NOTIFY code), and what that does is delete all the files at server
start, where nothing is running concurrently anyway.  So whatever we do
for commit_ts, it won't affect async.c.


So, if we do away with the SLRU_MAX_BANKLOCKS idea, then we're assured
one LWLock per bank (instead of sharing some lwlocks to multiple banks),
and that makes the code simpler to reason about.

-- 
Álvaro Herrera               48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"In fact, the basic problem with Perl 5's subroutines is that they're not
crufty enough, so the cruft leaks out into user-defined code instead, by
the Conservation of Cruft Principle."  (Larry Wall, Apocalypse 6)



pgsql-hackers by date:

Previous
From: Alvaro Herrera
Date:
Subject: Re: [EXTERNAL] Re: Add non-blocking version of PQcancel
Next
From: Nathan Bossart
Date:
Subject: Re: cleanup patches for incremental backup