Thread: pgsql: Improve runtime and output of tests for replication slots checkp

From: Alexander Korotkov
Date:
Improve runtime and output of tests for replication slots checkpointing.

The TAP tests that verify logical and physical replication slot behavior
during checkpoints (046_checkpoint_logical_slot.pl and
047_checkpoint_physical_slot.pl) inserted two batches of 2 million rows each,
generating approximately 520 MB of WAL.  On slow machines, or when compiled
with '-DRELCACHE_FORCE_RELEASE -DCATCACHE_FORCE_RELEASE', this caused the
tests to run for 8-9 minutes and occasionally time out, as seen on the
buildfarm animal prion.

This commit modifies these tests to use the $node->advance_wal() function.
Since the tests do not use the generated data, this function is a good
alternative and cuts the total wall-clock run time.
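
For a sense of scale, the WAL volume quoted above can be converted into a
number of WAL segments. This is only a rough sketch: it assumes the default
wal_segment_size of 16 MB, and the helper name segments_for() is hypothetical.
It does not say what argument the tests actually pass to advance_wal().

```python
import math

# Assumption: default wal_segment_size of 16 MB (set at initdb time).
WAL_SEGMENT_MB = 16


def segments_for(wal_mb: float, segment_mb: int = WAL_SEGMENT_MB) -> int:
    """Return the number of whole WAL segments needed to hold wal_mb megabytes."""
    return math.ceil(wal_mb / segment_mb)


if __name__ == "__main__":
    # Roughly 520 MB of WAL was generated by the row inserts.
    print(segments_for(520))
```

Under that assumption, about 520 MB of WAL spans 33 segments, which is the
order of magnitude a segment-switching helper has to cover without inserting
any rows.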

While here, remove superfluous '\n' characters from several note() calls;
these appeared literally in the buildfarm logs and looked odd.  Also, remove
the unnecessary 'shared_preload_libraries' GUC setting from the config and add
a check that the 'injection_points' extension is available.

Reported-by: Alexander Lakhin <exclusion@gmail.com>
Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Author: Alexander Korotkov <aekorotkov@gmail.com>
Author: Vitaly Davydov <v.davydov@postgrespro.ru>
Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Discussion: https://postgr.es/m/fbc5d94e-6fbd-4a64-85d4-c9e284a58eb2%40gmail.com
Backpatch-through: 17

Branch
------
REL_17_STABLE

Details
-------
https://git.postgresql.org/pg/commitdiff/5ed50f9386f04d9b8a88af8902b06cb747db7b54

Modified Files
--------------
src/test/recovery/t/046_checkpoint_logical_slot.pl | 34 +++++++++++-----------
.../recovery/t/047_checkpoint_physical_slot.pl     | 26 +++++++++--------
2 files changed, 31 insertions(+), 29 deletions(-)



On Thu, Jun 19, 2025 at 7:31 PM Alexander Korotkov <akorotkov@postgresql.org> wrote:
> Improve runtime and output of tests for replication slots checkpointing.
>
> The TAP tests that verify logical and physical replication slot behavior
> during checkpoints (046_checkpoint_logical_slot.pl and
> 047_checkpoint_physical_slot.pl) inserted two batches of 2 million rows each,
> generating approximately 520 MB of WAL.  On slow machines, or when compiled
> with '-DRELCACHE_FORCE_RELEASE -DCATCACHE_FORCE_RELEASE', this caused the
> tests to run for 8-9 minutes and occasionally time out, as seen on the
> buildfarm animal prion.

Quite a few buildfarm animals have started failing since this commit (for example [1]). I haven't looked into why, but I suspect something is wrong.
stderr:
#   Failed test 'Logical slot still valid'
#   at /home/bf/bf-build/flaviventris/HEAD/pgsql/src/test/recovery/t/046_checkpoint_logical_slot.pl line 134.
#          got: 'death by signal at /home/bf/bf-build/flaviventris/HEAD/pgsql/src/test/perl/PostgreSQL/Test/Cluster.pm line 181.
# '
#     expected: ''

I see this failing on my fork's CI, so it seems like it could have been caught earlier?

- Melanie