> On 28 May 2022, at 12:02, Andres Freund <andres@anarazel.de> wrote:
>
> I think you basically need to force some, but not all, of the modifying
> transactions to be open for a bit longer, so that it's more likely that
> there's a chance to prune vs CIC waiting. Might also be helpful to update rows
> multiple times within an xact.
Now I've got two different versions of the test for the master branch. Both fail in 50% of cases on my machine. Both take
approximately 4 seconds of wall-clock time and 1 second of CPU time.
v3: only a fraction of the transactions wait.
This test fails with the following backtrace:
0 postgres 0x00000001049ec508 ExceptionalCondition + 124
1 postgres 0x00000001045ea284 heap_page_prune + 2992
2 postgres 0x00000001045e9670 heap_page_prune_opt + 424
3 postgres 0x00000001045e25c0 heapam_index_fetch_tuple + 140
4 postgres 0x0000000100272d60 index_fetch_heap + 104
5 postgres 0x0000000100272e18 index_getnext_slot + 88
6 postgres 0x00000001003bbf4c check_exclusion_or_unique_constraint + 440
7 postgres 0x00000001003bc360 ExecCheckIndexConstraints + 232
8 postgres 0x00000001003ea30c ExecInsert + 1024
9 postgres 0x00000001003e90cc ExecModifyTable + 1536
10 postgres 0x00000001003bd0cc standard_ExecutorRun + 268
11 postgres 0x0000000100542d94 ProcessQuery + 160
12 postgres 0x00000001005423c8 PortalRunMulti + 396
13 postgres 0x0000000100541cfc PortalRun + 476
And reverting d9d0762 does not fix the issue. I'm not sure if I'm observing some other problem here.
v4 of the test does not use pg_sleep() and fails with a regular amcheck failure. Reverting d9d0762 fixes the test, unless I
execute the test for 1 million transactions; then it fails even with the revert...
I suspect that v3 and v4 trigger different problems.
Best regards, Andrey Borodin.