Re: xid_wraparound tests intermittent failure. - Mailing list pgsql-hackers

From Tom Lane
Subject Re: xid_wraparound tests intermittent failure.
Msg-id 732722.1721597808@sss.pgh.pa.us
In response to Re: xid_wraparound tests intermittent failure.  (Andrew Dunstan <andrew@dunslane.net>)
Responses Re: xid_wraparound tests intermittent failure.
List pgsql-hackers
Andrew Dunstan <andrew@dunslane.net> writes:
> On 2024-07-21 Su 1:34 PM, Tom Lane wrote:
>> Locally, I've not managed to reproduce the failure yet; so perhaps
>> there is some platform dependency.  What are you testing on?

> Linux ub22arm 5.15.0-116-generic #126-Ubuntu SMP Mon Jul 1 10:08:40 UTC
> 2024 aarch64 aarch64 aarch64 GNU/Linux
> It's a VM running on UTM/Apple Silicon

Hmm, doesn't sound like that ought to be slow.

I did manage to reproduce dodo's failures by running xid_wraparound
manually on mamba's very slow host:

$ time make -s installcheck PROVE_FLAGS=--timer
# +++ tap install-check in src/test/modules/xid_wraparound +++
[13:37:49] t/001_emergency_vacuum.pl .. 1/? # poll_query_until timed out executing this query:
#
# SELECT NOT EXISTS (
#       SELECT *
#       FROM pg_database
#       WHERE age(datfrozenxid) > current_setting('autovacuum_freeze_max_age')::int)
#
# expecting this output:
# t
# last actual query output:
# f
# with stderr:
# Tests were run but no plan was declared and done_testing() was not seen.
# Looks like your test exited with 4 just after 1.
[13:37:49] t/001_emergency_vacuum.pl .. Dubious, test returned 4 (wstat 1024, 0x400)
All 1 subtests passed
[14:06:51] t/002_limits.pl ............ 2/? # Tests were run but no plan was declared and done_testing() was not seen.
# Looks like your test exited with 29 just after 2.
[14:06:51] t/002_limits.pl ............ Dubious, test returned 29 (wstat 7424, 0x1d00)
All 2 subtests passed
[14:31:16] t/003_wraparounds.pl ....... ok  7564763 ms ( 0.00 usr  0.01 sys + 13.82 cusr  9.26 csys = 23.09 CPU)
[16:37:21]

Test Summary Report
-------------------
t/001_emergency_vacuum.pl (Wstat: 1024 (exited 4) Tests: 1 Failed: 0)
  Non-zero exit status: 4
  Parse errors: No plan found in TAP output
t/002_limits.pl          (Wstat: 7424 (exited 29) Tests: 2 Failed: 0)
  Non-zero exit status: 29
  Parse errors: No plan found in TAP output
Files=3, Tests=4, 10772 wallclock secs ( 0.15 usr  0.06 sys + 58.50 cusr 59.88 csys = 118.59 CPU)
Result: FAIL
make: *** [../../../../src/makefiles/pgxs.mk:442: installcheck] Error 1
    10772.99 real        59.34 user        60.14 sys
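
For readers decoding the "Wstat" figures in the report above: on POSIX systems the wait status carries the exit code in its high byte, which is how 1024 (0x400) corresponds to exit 4 and 7424 (0x1d00) to exit 29. A minimal sketch of that arithmetic follows; the helper name decode_wstat is made up for illustration and is not part of prove or the PostgreSQL test harness.

```python
def decode_wstat(wstat: int) -> tuple[int, int]:
    """Split a POSIX wait status into (exit_code, terminating_signal).

    Hypothetical helper for illustration only. The exit code lives in
    bits 8-15; the low 7 bits hold the terminating signal (0 if none).
    """
    exit_code = (wstat >> 8) & 0xFF
    signal = wstat & 0x7F
    return exit_code, signal


# Wstat values copied from the Test Summary Report above.
for name, wstat in [("t/001_emergency_vacuum.pl", 1024),
                    ("t/002_limits.pl", 7424)]:
    code, sig = decode_wstat(wstat)
    print(f"{name}: wstat={wstat} (0x{wstat:x}) -> exit {code}, signal {sig}")
```

Both scripts exited on their own (signal 0) with a non-zero status rather than being killed.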

Each of those two failures looks just like something that dodo has
shown at one time or another.  So it's at least plausible that
"slow machine" is the whole explanation.  I'm still wondering
though if there's some effect that causes the test's runtime to
be unstable in itself, sometimes leading to timeouts.
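
To put numbers on "slow", the --timer stamps in the log can be turned into rough per-script durations. This is only a sketch: it assumes each bracketed stamp marks the moment that script started and that the final bare stamp marks the end of the run, with no midnight crossing. That reading is at least consistent with the log, since the last interval comes out close to the 7564763 ms reported for 003 and the sum matches the 10772 wallclock seconds.

```python
from datetime import datetime

# Timestamps copied from the prove --timer output above.
# Assumption: each stamp is the start of that script; the last is end of run.
STAMPS = [
    ("t/001_emergency_vacuum.pl", "13:37:49"),
    ("t/002_limits.pl", "14:06:51"),
    ("t/003_wraparounds.pl", "14:31:16"),
    ("(end of run)", "16:37:21"),
]


def elapsed_seconds(stamps):
    """Return [(name, seconds)] for each interval between consecutive stamps."""
    times = [datetime.strptime(t, "%H:%M:%S") for _, t in stamps]
    return [
        (stamps[i][0], int((times[i + 1] - times[i]).total_seconds()))
        for i in range(len(times) - 1)
    ]


durations = elapsed_seconds(STAMPS)
for name, secs in durations:
    print(f"{name}: {secs} s ({secs / 60:.1f} min)")

total = sum(secs for _, secs in durations)
print(f"total: {total} s")
```

Under those assumptions the two failing scripts each ran for roughly 25 to 30 minutes before giving up, and 003 alone accounts for about two hours of the run.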

            regards, tom lane


