Thread: pgsql: Prevent concurrent SimpleLruTruncate() for any given SLRU.

From: Noah Misch
Date:
Prevent concurrent SimpleLruTruncate() for any given SLRU.

The SimpleLruTruncate() header comment states the new coding rule.  To
achieve this, add locktype "frozenid" and two LWLocks.  This closes a
rare opportunity for data loss, which manifested as "apparent
wraparound" or "could not access status of transaction" errors.  Data
loss is more likely in pg_multixact, due to released branches' thin
margin between multiStopLimit and multiWrapLimit.  If a user's physical
replication primary logged ":  apparent wraparound" messages, the user
should rebuild standbys of that primary regardless of symptoms.  At less
risk is a cluster having emitted "not accepting commands" errors or
"must be vacuumed" warnings at some point.  One can test a cluster for
this data loss by running VACUUM FREEZE in every database.  Back-patch
to 9.5 (all supported versions).

Discussion: https://postgr.es/m/20190218073103.GA1434723@rfd.leadboat.com

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/566372b3d6435639e4cc4476d79b8505a0297c87

Modified Files
--------------
doc/src/sgml/catalogs.sgml               |  4 +++-
doc/src/sgml/monitoring.sgml             | 16 ++++++++++++++
src/backend/access/transam/slru.c        |  8 +++++++
src/backend/access/transam/subtrans.c    |  4 ++--
src/backend/commands/async.c             | 37 +++++++++++++++++++++++---------
src/backend/commands/vacuum.c            | 13 +++++++++++
src/backend/storage/lmgr/lmgr.c          | 20 +++++++++++++++++
src/backend/storage/lmgr/lwlocknames.txt |  3 +++
src/backend/utils/adt/lockfuncs.c        | 12 +++++++++++
src/include/storage/lmgr.h               |  3 +++
src/include/storage/lock.h               | 10 +++++++++
11 files changed, 117 insertions(+), 13 deletions(-)


Re: pgsql: Prevent concurrent SimpleLruTruncate() for any given SLRU.

From: Tom Lane
Date:
Noah Misch <noah@leadboat.com> writes:
> Prevent concurrent SimpleLruTruncate() for any given SLRU.

I find it fairly scary that you've changed enum LockTagType in the
back branches.  Are we quite certain that no extensions have compiled-in
values of that enum?

Safer from an ABI standpoint would be to add the new value at the end,
at least in the back branches.

            regards, tom lane



Re: pgsql: Prevent concurrent SimpleLruTruncate() for any given SLRU.

From: Noah Misch
Date:
On Sat, Aug 15, 2020 at 04:39:05PM -0400, Tom Lane wrote:
> Noah Misch <noah@leadboat.com> writes:
> > Prevent concurrent SimpleLruTruncate() for any given SLRU.
> 
> I find it fairly scary that you've changed enum LockTagType in the
> back branches.  Are we quite certain that no extensions have compiled-in
> values of that enum?
> 
> Safer from an ABI standpoint would be to add the new value at the end,
> at least in the back branches.

Yeah, that was negligent.  Several PGXN modules do refer to locktags.  I'll
move it.