On Thu, 5 Jan 2023 at 14:12, Michail Nikolaev <michail.nikolaev@gmail.com> wrote:
>
> Hello, hackers.
>
> It seems like PG 14 works incorrectly with vacuum_defer_cleanup_age
> (or just not cleared rows, not sure) and SELECT FOR UPDATE + UPDATE.
> I am not certain, but hot_standby_feedback probably able to cause the
> same issues.
>
> Steps to reproduce:
>
> [steps]
>
> I was able to see such a set of errors (looks scary):
>
> ERROR: MultiXactId 30818104 has not been created yet -- apparent wraparound
> ERROR: could not open file "base/13757/16385.1" (target block
> 39591744): previous segment is only 24 blocks

This looks quite suspicious too: it wants to access a block roughly 302GB into the relation (block 39591744, assuming the default 8kB block size), where only 192kB (24 blocks) exist.

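For reference, a rough sketch of the arithmetic behind that error message. It assumes a default build (8kB blocks, 1GB segment files), which I have not verified for this image; the variable names are only illustrative.

```python
# Assumed PostgreSQL build defaults (not confirmed for the image in question).
BLCKSZ = 8192          # bytes per heap block
RELSEG_SIZE = 131072   # blocks per 1 GiB segment file

target_block = 39591744   # "target block" from the error message
existing_blocks = 24      # "previous segment is only 24 blocks"

# Byte offset the backend tried to reach, and how much data actually exists.
target_offset_bytes = target_block * BLCKSZ
existing_bytes = existing_blocks * BLCKSZ

# Segment file that would hold the target block if the relation were that large.
target_segment = target_block // RELSEG_SIZE

print(f"target offset:  {target_offset_bytes / 2**30:.1f} GiB")
print(f"existing data:  {existing_bytes // 1024} kiB")
print(f"target segment: {target_segment}")
```

Under those assumptions the requested block would live several hundred segments past the end of the relation, which is why the open already fails at segment .1.
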
> ERROR: attempted to lock invisible tuple
> ERROR: could not access status of transaction 38195704
> DETAIL: Could not open file "pg_subtrans/0246": No such file or directory.

I just saw two instances of this "attempted to lock invisible tuple" error on the 15.1 image (running on Docker in Ubuntu in WSL) with your reproducer script, so this does not seem to be specific to PG14 (or 14.6).

And, after some vacuuming and restarting the process, I got the following:

client 29 script 0 aborted in command 2 query 0: ERROR: heap tid from index tuple (111,1) points past end of heap page line pointer array at offset 262 of block 1 in index "something_is_wrong_here_pkey"
There is indeed something wrong there; the page can't be read by pageinspect:
$ select get_raw_page('public.something_is_wrong_here', 111)::bytea;
ERROR: invalid page in block 111 of relation base/5/16385

I no longer have access to the v14 data (I tried a restart, which dropped the data :-( ), but I will keep my v15 instance around for some time to help with debugging.

Kind regards,
Matthias van de Meent