On Sun, Jan 07, 2024 at 05:00:00PM +0300, Alexander Lakhin wrote:
> Yes, I wrote exactly about that upthread and referenced my previous
> investigation. But what I'm observing now, is that the failure probability
> greatly increased with c161ab74f, so something really changed in the test
> behaviour. (I need a couple of days to investigate this.)

As far as I've cross-checked the logs between successful and failed
runs on skink and my own machines (I have not reproduced the failure
locally, unfortunately), I did not notice any correlation with
autovacuum running while VACUUM (with or without FULL) is executed on
the catalogs.

Perhaps a sensible next step would be to plug in pg_waldump or
pg_walinspect and get some sense from the WAL records when we fail to
detect an invalidation in the log contents, starting from an LSN
retrieved at the beginning of each scenario.

I would be tempted to add more increments of $Test::Builder::Level as
well in the subroutines of the test, because it is currently kind of
hard to find out where a failure comes from.  One needs to grep for
the slot names, whose strings are built from prefixes and suffixes
defined as arguments of these subroutines...
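
For illustration, a minimal sketch of the $Test::Builder::Level idiom,
assuming a helper that builds a slot name from a prefix and a suffix.
The subroutine check_slot_name() and the slot names used here are
hypothetical and are not taken from the test itself:

```perl
use strict;
use warnings;
use Test::More;

# Hypothetical helper (not from the actual test): build a slot name from
# a prefix and a suffix, then compare it with the expected string.
sub check_slot_name
{
	my ($prefix, $suffix, $expected) = @_;

	# Make a failure be reported at the caller's line, not at the is()
	# call inside this subroutine.
	local $Test::Builder::Level = $Test::Builder::Level + 1;

	return is($prefix . $suffix, $expected,
		"slot name built from prefix $prefix");
}

# Hypothetical slot name, for illustration only.
my $passed =
  check_slot_name('vacuum_full_', 'activeslot', 'vacuum_full_activeslot');

done_testing();
```

With the increment in place, a failure in such a helper points at the
line of the calling scenario, so there is less need to grep for the
generated slot names.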
--
Michael