On Thu, 14 Apr 2022 at 10:54, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> After a bit more navel-contemplation I see a way that the pgstats
> work could have changed timing in this area. We used to have a
> rate limit on how often stats reports would be sent to the
> collector, which'd ensure half a second or so delay before a
> transaction's change counts became visible to the autovac daemon.
> I've not looked at the new code, but I'm betting that that's gone
> and the autovac launcher might start a worker nearly immediately
> after some foreground process finishes inserting some rows.
> So that could result in autovac activity occurring concurrently
> with test_setup where it didn't before.
It's not quite clear to me why the manual vacuum wouldn't just cancel
the autovacuum and complete the job. I can't quite see how there's
room for competing page locks here. Also, see [1]. One of the
reported failing tests there is the same as one of the failing tests
on wrasse. My investigation for the AIO branch found that
relallvisible was not equal to relpages. I no longer recall why that
was happening.
> As to what to do about it ... maybe apply the FREEZE and
> DISABLE_PAGE_SKIPPING options in test_setup's vacuums?
> It seems like DISABLE_PAGE_SKIPPING is necessary but perhaps
> not sufficient.
We should probably first confirm that it's due to relallvisible.
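
As a rough sketch of that check: compare relallvisible with relpages in
pg_class for the tables test_setup creates. The table list (tenk1, tenk2,
onek) and the helper name not_all_visible are assumptions for
illustration, and the sample rows are made up rather than taken from any
buildfarm run.

```python
# Hypothetical helper, not part of the regression suite.
# CHECK_SQL reads the page counters that VACUUM stores in pg_class;
# the table list is an assumption about which relations matter here.
CHECK_SQL = """
SELECT relname, relpages, relallvisible
FROM pg_class
WHERE relname IN ('tenk1', 'tenk2', 'onek')
ORDER BY relname;
"""


def not_all_visible(rows):
    """Return the (relname, relpages, relallvisible) rows in which
    relallvisible differs from relpages."""
    return [row for row in rows if row[2] != row[1]]


if __name__ == "__main__":
    # Made-up rows standing in for the result of CHECK_SQL.
    sample = [("table_a", 10, 10), ("table_b", 10, 7)]
    for relname, relpages, relallvisible in not_all_visible(sample):
        print(f"{relname}: relallvisible={relallvisible}, relpages={relpages}")
```

Running CHECK_SQL right after test_setup on an affected animal and
passing the rows through not_all_visible() would show whether any table
has pages not marked all-visible.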
David
[1] https://www.postgresql.org/message-id/20220224153339.pqn64kseb5gpgl74@alap3.anarazel.de