pgsql: Tighten ComputeXidHorizons' handling of walsenders. - Mailing list pgsql-committers

From Tom Lane
Subject pgsql: Tighten ComputeXidHorizons' handling of walsenders.
Date
Msg-id E1nfTpV-000gfp-Gd@gemulon.postgresql.org
Whole thread Raw
List pgsql-committers
Tighten ComputeXidHorizons' handling of walsenders.

ComputeXidHorizons (nee GetOldestXmin) thought that it could identify
walsenders by checking for proc->databaseId == 0.  Perhaps that was
safe when the code was written, but it's been wrong at least since
autovacuum was invented.  Background processes that aren't connected
to any particular database, such as the autovacuum launcher and
logical replication launcher, look like that too.

This imprecision is harmful because when such a process advertises an
xmin, the result is to hold back dead-tuple cleanup in all databases,
though it'd be sufficient to hold it back in shared catalogs (which
are the only relations such a process can access).  Aside from being
generally inefficient, this has recently been seen to cause regression
test failures in the buildfarm, as a consequence of the logical
replication launcher's startup transaction preventing VACUUM from
marking pages of a user table as all-visible.

We only want that global hold-back effect for the case where a
walsender is advertising a hot standby feedback xmin.  Therefore,
invent a new PGPROC flag that says that a process' xmin should be
considered globally, and check that instead of using the incorrect
databaseId == 0 test.  Currently only a walsender sets that flag,
and only if it is not connected to any particular database.  (This is
for bug-compatibility with the undocumented behavior of the existing
code, namely that feedback sent by a client who has connected to a
particular database would not be applied globally.  I'm not sure this
is a great definition; however, such a client is capable of issuing
plain SQL commands, and I don't think we want xmins advertised for
such commands to be applied globally.  Perhaps this could do with
refinement later.)

While at it, I rewrote the comment in ComputeXidHorizons, and
re-ordered the commented-upon if-tests, to make them match up
for intelligibility's sake.

This is arguably a back-patchable bug fix, but given the lack of
complaints I think it prudent to let it age awhile in HEAD first.

Discussion: https://postgr.es/m/1346227.1649887693@sss.pgh.pa.us

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/6fea65508a1aa6a1caa5f3e7b4d27bcccb0740d8

Modified Files
--------------
src/backend/replication/walsender.c | 15 +++++++++++++
src/backend/storage/ipc/procarray.c | 45 +++++++++++++++++++++++--------------
src/include/storage/proc.h          |  3 +++
3 files changed, 46 insertions(+), 17 deletions(-)


pgsql-committers by date:

Previous
From: Peter Geoghegan
Date:
Subject: pgsql: VACUUM VERBOSE: Show dead items for an empty table.
Next
From: Tom Lane
Date:
Subject: pgsql: psql: fix \l display for pre-v15 databases.