pgsql: Fix race conditions with drop of reused pgstats entries - Mailing list pgsql-committers

From Michael Paquier
Subject pgsql: Fix race conditions with drop of reused pgstats entries
Date
Msg-id E1tBm95-001s0P-D1@gemulon.postgresql.org
Whole thread Raw
List pgsql-committers
Fix race conditions with drop of reused pgstats entries

This fixes a set of race conditions with cumulative statistics where a
shared stats entry could be dropped while it should still be valid in
the event when it is reused: an entry may refer to a different object
but requires the same hash key.  This can happen with various stats
kinds, like:
- Replication slots that compute internally an index number, for
different slot names.
- Stats kinds that use an OID in the object key, where a wraparound
causes the same key to be used if an OID is used for the same object.
- As of PostgreSQL 18, custom pgstats kinds could also be an issue,
depending on their implementation.

This issue is fixed by introducing a counter called "generation" in the
shared entries via PgStatShared_HashEntry, initialized at 0 when an
entry is created and incremented when the same entry is reused, to avoid
concurrent issues on drop because of other backends still holding a
reference to it.  This "generation" is copied to the local copy that a
backend holds when looking at an object, then cross-checked with the
shared entry to make sure that the entry is not dropped even if its
"refcount" justifies that if it has been reused.

This problem could show up when a backend shuts down and needs to
discard any entries it still holds, causing statistics to be removed
when they should not, or even an assertion failure.  Another report
involved a failure in a standby after an OID wraparound, where the
startup process would FATAL on a "can only drop stats once", stopping
recovery abruptly.  The buildfarm has been sporadically complaining
about the problem, as well, but the window is hard to reach with the
in-core tests.

Note that the issue can be reproduced easily by adding a sleep before
dshash_find() in pgstat_release_entry_ref() to enlarge the problematic
window while repeating test_decoding's isolation test oldest_xmin a
couple of times, for example, as pointed out by Alexander Lakhin.

Reported-by: Alexander Lakhin, Peter Smith
Author: Kyotaro Horiguchi, Michael Paquier
Reviewed-by: Bertrand Drouvot
Discussion: https://postgr.es/m/CAA4eK1KxuMVyAryz_Vk5yq3ejgKYcL6F45Hj9ZnMNBS-g+PuZg@mail.gmail.com
Discussion: https://postgr.es/m/17947-b9554521ad963c9c@postgresql.org
Backpatch-through: 15

Branch
------
REL_15_STABLE

Details
-------
https://git.postgresql.org/pg/commitdiff/154c5b42a3d80424f7b7beef33a69600245c147d

Modified Files
--------------
src/backend/utils/activity/pgstat_shmem.c | 40 +++++++++++++++++++++++++++----
src/include/utils/pgstat_internal.h       | 19 +++++++++++++++
2 files changed, 54 insertions(+), 5 deletions(-)


pgsql-committers by date:

Previous
From: Heikki Linnakangas
Date:
Subject: pgsql: Kill dead-end children when there's nothing else left
Next
From: Peter Eisentraut
Date:
Subject: pgsql: Add an assertion in get_object_address()