Hi,
On 2022-02-11 16:41:24 -0800, Andres Freund wrote:
> FWIW, I've indeed reproduced this fairly easily with such a setup. A pgbench
> r/w workload that's been modified to start 70 savepoints at the start shows
>
> pgbench: error: client 22 script 0 aborted in command 12 query 0: ERROR: t_xmin 3853739 is uncommitted in tuple (2,159) to be updated in table "pgbench_branches"
> pgbench: error: client 13 script 0 aborted in command 12 query 0: ERROR: t_xmin 3954305 is uncommitted in tuple (2,58) to be updated in table "pgbench_branches"
> pgbench: error: client 7 script 0 aborted in command 12 query 0: ERROR: t_xmin 4017908 is uncommitted in tuple (3,44) to be updated in table "pgbench_branches"
>
> after a few minutes of running with a local, not slowed down, syncrep. Without
> any other artificial slowdowns or such.
And this can easily be triggered even without subtransactions, in a completely
reliable way.
The only reason I'm so far not succeeding in turning it into an
isolationtester spec is that a transaction waiting for SyncRep doesn't count
as waiting for isolationtester.
Basically
S1: BEGIN; $xid = txid_current(); UPDATE; COMMIT; <commit wait for syncrep>
S2: SELECT pg_xact_status($xid);
S2: UPDATE;
suffices, because the pg_xact_status() causes a clog fetch, priming the xid
cache, which then causes TransactionIdIsInProgress() to take the early
return path, despite the transaction still being in progress. That in turn
allows the update to proceed, despite S1 not having "properly committed"
yet.
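
To make the sequence easier to follow, here is a rough toy model of the
mechanism in Python. It is not PostgreSQL code: ToyCluster and all of its
methods and fields are made-up names, and it assumes that the commit status
is already visible in the clog while the backend is still waiting for syncrep
and hence still listed as running.

```python
# Toy model only; every name here is hypothetical, not a PostgreSQL API.
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class ToyCluster:
    clog: Dict[int, str] = field(default_factory=dict)  # xid -> commit status
    running: Set[int] = field(default_factory=set)      # xids still advertised as running
    cached_xid: Optional[int] = None                    # single-entry "known completed" cache

    def begin(self, xid: int) -> None:
        self.running.add(xid)

    def commit_local(self, xid: int) -> None:
        # Assumption: status is marked committed before the syncrep wait,
        # while the xid is still advertised as running.
        self.clog[xid] = "committed"

    def finish_commit(self, xid: int) -> None:
        # Syncrep wait is over; the xid stops being advertised as running.
        self.running.discard(xid)

    def clog_fetch(self, xid: int) -> str:
        status = self.clog.get(xid, "in progress")
        if status != "in progress":
            self.cached_xid = xid  # prime the cache with a "completed" xid
        return status

    def is_in_progress(self, xid: int) -> bool:
        if self.cached_xid == xid:
            return False           # early return path: trust the cache
        return xid in self.running


cluster = ToyCluster()
cluster.begin(100)            # S1: BEGIN; UPDATE
cluster.commit_local(100)     # S1: COMMIT, now waiting for syncrep

before = cluster.is_in_progress(100)  # cache not primed yet
status = cluster.clog_fetch(100)      # S2: status lookup primes the cache
after = cluster.is_in_progress(100)   # S2: UPDATE now sees "not in progress"
```

In this model the in-progress check gives the right answer until the status
lookup has run once, after which the cached entry short-circuits the check
even though S1 is still listed as running.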
Greetings,
Andres Freund