A failure in 031_recovery_conflict.pl on Debian/s390x - Mailing list pgsql-hackers
From | Christoph Berg |
---|---|
Subject | A failure in 031_recovery_conflict.pl on Debian/s390x |
Date | |
Msg-id | ZMzycdG5IWCM73ll@msg.df7cb.de Whole thread Raw |
In response to | Re: A failure in 031_recovery_conflict.pl on Cirrus CI Windows Server (Noah Misch <noah@leadboat.com>) |
Responses |
Re: A failure in 031_recovery_conflict.pl on Debian/s390x
|
List | pgsql-hackers |
Re: Noah Misch > On Tue, Jul 25, 2023 at 01:56:41PM +0530, Bharath Rupireddy wrote: > > I've observed the following failure once in one of my Cirrus CI runs > > on Windows Server on HEAD: > > > > timed out waiting for match: (?^:User was holding shared buffer pin > > for too long) at > > C:/cirrus/src/test/recovery/t/031_recovery_conflict.pl line 318. > > # Postmaster PID for node "primary" is 696 > > > > https://api.cirrus-ci.com/v1/artifact/task/5272062399348736/testrun/build/testrun/recovery/031_recovery_conflict/log/regress_log_031_recovery_conflict > > > > https://github.com/BRupireddy/postgres/runs/15296698158 > > Known: > https://postgr.es/m/flat/20220409045515.35ypjzddp25v72ou%40alap3.anarazel.de > https://postgr.es/m/flat/CA+hUKGK3PGKwcKqzoosamn36YW-fsuTdOPPF1i_rtEO=nEYKSg@mail.gmail.com > > A good next step would be patch "Fix recovery conflict SIGUSR1 handling" from > https://postgr.es/m/CA+hUKG+Hi5P1j_8cVeqLLwNSVyJh4RntF04fYWkeNXfTrH2MYA@mail.gmail.com > (near the end of that second thread). I am also seeing problems with t/031_recovery_conflict.pl, on the newly added s390x architecture on apt.postgresql.org. The failure is flaky, but has been popping up in build logs often enough. I managed to reproduce it on the shell by running the test in a loop a few times. The failure looks like this: echo "# +++ tap check in src/test/recovery +++" && rm -rf '/home/myon/postgresql/pg/postgresql/build/src/test/recovery'/tmp_check&& /bin/mkdir -p '/home/myon/postgresql/pg/postgresql/build/src/test/recovery'/tmp_check&& cd /home/myon/postgresql/pg/postgresql/build/../src/test/recovery&& TESTLOGDIR='/home/myon/postgresql/pg/postgresql/build/src/test/recovery/tmp_check/log' TESTDATADIR='/home/myon/postgresql/pg/postgresql/build/src/test/recovery/tmp_check' PATH="/home/myon/postgresql/pg/postgresql/build/tmp_install/usr/lib/postgresql/17/bin:/home/myon/postgresql/pg/postgresql/build/src/test/recovery:$PATH" LD_LIBRARY_PATH="/home/myon/postgresql/pg/postgresql/build/tmp_install/usr/lib/s390x-linux-gnu" PGPORT='65432' top_builddir='/home/myon/postgresql/pg/postgresql/build/src/test/recovery/../../..' PG_REGRESS='/home/myon/postgresql/pg/postgresql/build/src/test/recovery/../../../src/test/regress/pg_regress' /usr/bin/prove-I /home/myon/postgresql/pg/postgresql/build/../src/test/perl/ -I /home/myon/postgresql/pg/postgresql/build/../src/test/recovery--verbose t/031_recovery_conflict.pl # +++ tap check in src/test/recovery +++ t/031_recovery_conflict.pl .. # issuing query via background psql: # BEGIN; # DECLARE test_recovery_conflict_cursor CURSOR FOR SELECT b FROM test_recovery_conflict_table1; # FETCH FORWARD FROM test_recovery_conflict_cursor; ok 1 - buffer pin conflict: cursor with conflicting pin established ok 2 - buffer pin conflict: logfile contains terminated connection due to recovery conflict ok 3 - buffer pin conflict: stats show conflict on standby # issuing query via background psql: # BEGIN; # DECLARE test_recovery_conflict_cursor CURSOR FOR SELECT b FROM test_recovery_conflict_table1; # FETCH FORWARD FROM test_recovery_conflict_cursor; # ok 4 - snapshot conflict: cursor with conflicting snapshot established ok 5 - snapshot conflict: logfile contains terminated connection due to recovery conflict ok 6 - snapshot conflict: stats show conflict on standby # issuing query via background psql: # BEGIN; # LOCK TABLE test_recovery_conflict_table1 IN ACCESS SHARE MODE; # SELECT 1; # ok 7 - lock conflict: conflicting lock acquired # Tests were run but no plan was declared and done_testing() was not seen. # Looks like your test exited with 255 just after 7. Dubious, test returned 255 (wstat 65280, 0xff00) All 7 subtests passed Test Summary Report ------------------- t/031_recovery_conflict.pl (Wstat: 65280 Tests: 7 Failed: 0) Non-zero exit status: 255 Parse errors: No plan found in TAP output Files=1, Tests=7, 186 wallclock secs ( 0.01 usr 0.00 sys + 0.74 cusr 0.71 csys = 1.46 CPU) Result: FAIL make: *** [Makefile:23: check] Error 1 I.e. the test file just exits after test 7, without running the rest. Is it dying because it's running into a deadlock instead of some other expected error message? timed out waiting for match: (?^:User was holding a relation lock for too long) at t/031_recovery_conflict.pl line 318. 2023-08-04 12:20:38.061 UTC [1920073] 031_recovery_conflict.pl FATAL: terminating connection due to conflict with recovery Christoph
Attachment
pgsql-hackers by date: