Re: pgsql: Add TAP test for archive_cleanup_command and recovery_end_comman - Mailing list pgsql-hackers

From Thomas Munro
Subject Re: pgsql: Add TAP test for archive_cleanup_command and recovery_end_comman
Date
Msg-id CA+hUKGL=BN-Fk8JbbWGtL2HK1tJxQRg+4Czs76v+bSN7Si0A5w@mail.gmail.com
Whole thread Raw
In response to Re: pgsql: Add TAP test for archive_cleanup_command and recovery_end_comman  (Michael Paquier <michael@paquier.xyz>)
Responses Re: pgsql: Add TAP test for archive_cleanup_command and recovery_end_comman  (Michael Paquier <michael@paquier.xyz>)
List pgsql-hackers
On Tue, Apr 12, 2022 at 3:49 PM Michael Paquier <michael@paquier.xyz> wrote:
> All that stuff leads me to the attached.  Thoughts?

Under valgrind I got "Undefined subroutine &main::usleep called at
t/002_archiving.pl line 103" so I added "use Time::HiRes qw(usleep);",
and now I get past the first 4 tests with your patch, but then
promotion times out, not sure why:

+++ tap check in src/test/recovery +++
t/002_archiving.pl ..
ok 1 - check content from archives
ok 2 - archive_cleanup_command executed on checkpoint
ok 3 - recovery_end_command not executed yet
# found 00000002.history after 14 attempts
ok 4 - recovery_end_command executed after promotion
Bailout called.  Further testing stopped:  command "pg_ctl -D
/home/tmunro/projects/postgresql/src/test/recovery/tmp_check/t_002_archiving_standby2_data/pgdata
-l /home/tmunro/projects/postgresql/src/test/recovery/tmp_check/log/002_archiving_standby2.log
promote" exited with value 1

Since it's quite painful to run TAP tests under valgrind, I found a
place to stick a plain old sleep to repro these problems:

--- a/src/test/perl/PostgreSQL/Test/Cluster.pm
+++ b/src/test/perl/PostgreSQL/Test/Cluster.pm
@@ -1035,7 +1035,7 @@ sub enable_restoring
        my $copy_command =
          $PostgreSQL::Test::Utils::windows_os
          ? qq{copy "$path\\\\%f" "%p"}
-         : qq{cp "$path/%f" "%p"};
+         : qq{sleep 1 && cp "$path/%f" "%p"};

Soon I'll push the fix to the slowness that xlogprefetcher.c
accidentally introduced to continuous archive recovery, ie the problem
of calling a failing restore_command repeatedly as we approach the end
of a WAL segment instead of just once every 5 seconds after we run out
of data, and after that you'll probably need to revert that fix
locally to repro this.



pgsql-hackers by date:

Previous
From: Jesper Pedersen
Date:
Subject: Re: GSoC: pgmoneta: Write-Ahead Log (WAL) infrastructure (2022)
Next
From: Andres Freund
Date:
Subject: Re: Crash in new pgstats code