Mihail Nikalayeu <mihailnikalayeu@gmail.com> wrote:
> > Indeed, the server log seems to indicate relationship to
> > VACUUM:
> > 2026-02-01 16:44:58.878 UTC autovacuum worker[22589] LOG: automatic vacuum of table "postgres.pg_catalog.pg_class": index scans: 1
>
> Oh, it's a good clue!
>
> I have added some vacuum calls for pg_class in a stress test - and now it fails much more often (check attachment).
>
> It is "ERROR: cache lookup failed for relation" - but I think it may share the cause with "attempted to overwrite invisible tuple".
I've just reported one issue [1] that causes this, but that does not seem to
be related to the "attempted to overwrite invisible tuple" error.
> See:
> https://cirrus-ci.com/build/4852126532239360 - with "Use multiple snapshots to copy the data."
> https://cirrus-ci.com/build/6429084491710464 - with "Use background worker to do logical decoding."
>
> But I am unable to reproduce the issue with only "Add CONCURRENTLY option to REPACK command."
> https://cirrus-ci.com/build/6467070524653568
No idea why VACUUM makes the issue happen so much more often. Maybe it's
related to the PD_ALL_VISIBLE flag, but I have no detailed explanation. I also
don't know why it does not reproduce without the logical decoding worker.
Thanks again for your testing!
[1] https://www.postgresql.org/message-id/61812.1770637345%40localhost
--
Antonin Houska
Web: https://www.cybertec-postgresql.com