A failure in t/038_save_logical_slots_shutdown.pl - Mailing list pgsql-hackers
From | Bharath Rupireddy |
---|---|
Subject | A failure in t/038_save_logical_slots_shutdown.pl |
Date | |
Msg-id | CALj2ACVLzH5CN-h9=S26mdRHPuJ9yDLUw70yh4JOiPw03WL0CQ@mail.gmail.com Whole thread Raw |
Responses |
Re: A failure in t/038_save_logical_slots_shutdown.pl
Re: A failure in t/038_save_logical_slots_shutdown.pl |
List | pgsql-hackers |
Hi, I've been observing a failure in t/038_save_logical_slots_shutdown.pl of late on my developer system: t/038_save_logical_slots_shutdown.pl .. 1/? # Failed test 'Check that the slot's confirmed_flush LSN is the same as the latest_checkpoint location' # at t/038_save_logical_slots_shutdown.pl line 35. # Looks like you failed 1 test of 2. t/038_save_logical_slots_shutdown.pl .. Dubious, test returned 1 (wstat 256, 0x100) Failed 1/2 subtests I did a quick analysis of the failure and commit https://github.com/postgres/postgres/commit/e0b2eed047df9045664da6f724cb42c10f8b12f0 that introduced this test. I think the issue is that the slot's confirmed_flush LSN (0/1508000) and shutdown checkpoint LSN (0/1508018) are not the same: tmp_check/log/038_save_logical_slots_shutdown_pub.log: 2024-01-10 07:55:44.539 UTC [57621] sub LOG: starting logical decoding for slot "sub" 2024-01-10 07:55:44.539 UTC [57621] sub DETAIL: Streaming transactions committing after 0/1508000, reading WAL from 0/1507FC8. 2024-01-10 07:55:44.539 UTC [57621] sub STATEMENT: START_REPLICATION SLOT "sub" LOGICAL 0/0 (proto_version '4', origin 'any', publication_names '"pub"') ubuntu:~/postgres$ pg17/bin/pg_controldata -D src/test/recovery/tmp_check/t_038_save_logical_slots_shutdown_pub_data/pgdata/ Database cluster state: in production pg_control last modified: Wed Jan 10 07:55:44 2024 Latest checkpoint location: 0/1508018 Latest checkpoint's REDO location: 0/1508018 But the tests added by t/038_save_logical_slots_shutdown.pl expects both LSNs to be same: sub compare_confirmed_flush { # Is it same as the value read from log? ok( $latest_checkpoint eq $confirmed_flush_from_log, "Check that the slot's confirmed_flush LSN is the same as the latest_checkpoint location" ); I suspect that it's quite not right to expect the slot's confirmed_flush and latest checkpoint location to be same in the test. This is because the shutdown checkpoint gets an LSN that's greater than the slot's confirmed_flush LSN - see the shutdown checkpoint record getting inserted into WAL after the slot is marked dirty in CheckPointReplicationSlots(). With this analysis in mind, I think the tests need to do something like the following: diff --git a/src/test/recovery/t/038_save_logical_slots_shutdown.pl b/src/test/recovery/t/038_save_logical_slots_shut down.pl index 5a4f5dc1d4..d49e6014fc 100644 --- a/src/test/recovery/t/038_save_logical_slots_shutdown.pl +++ b/src/test/recovery/t/038_save_logical_slots_shutdown.pl @@ -32,7 +32,7 @@ sub compare_confirmed_flush unless defined($latest_checkpoint); # Is it same as the value read from log? - ok( $latest_checkpoint eq $confirmed_flush_from_log, + ok( $latest_checkpoint ge $confirmed_flush_from_log, "Check that the slot's confirmed_flush LSN is the same as the latest_checkpoint location" ); Thoughts? -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
pgsql-hackers by date: