Hi, Antonin!
> I assume you are concerned with the patch part 0005 of the v12 patch
> ("Preserve visibility information of the concurrent data changes."), aren't
> you?
Yes, of course. I got an idea while trying to find a way to optimize it.
> Not sure I understand in all details, but I don't think SnapshotSelf is the
> correct snapshot. Note that HeapTupleSatisfiesSelf() does not use its
> 'snapshot' argument at all. Instead, it considers the set of running
> transactions as it is at the time the function is called.
Yes, and that is almost the same behavior as when a typical MVCC snapshot
encounters a tuple created by its own transaction.
So, here is how it works in the non-MVCC-safe case (current patch behaviour):
1) we have a whole initial table snapshot in which every tuple has
xmin = repack XID
2) the applying transaction sees ALL the self-alive (no xmax) tuples in it,
because all tuples were created/deleted by the transaction itself
3) each update/delete during the replay selects the latest existing
tuple version, sets its xmax and inserts a new one
4) so, there is no real MVCC involved - just find the latest
version and create a new version
5) and it works correctly because all ordering issues were resolved by
locking mechanisms on the original table or by the reorder buffer
How it maps to the MVCC-safe case (SnapshotSelf):
1) we have a whole initial table snapshot with all xmin values copied from
the original table. All such xmin are committed.
2) the applying transaction sees ALL the self-alive (no xmax) tuples in it,
because their xmin/xmax are committed and SnapshotSelf is happy with that
3) each update/delete during the replay selects the latest existing
tuple version, sets its xmax = original xid and inserts a new one
with xmin = original xid
4) same as in the non-MVCC-safe case
5) same as in the non-MVCC-safe case
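To make step 3 of the MVCC-safe case more concrete, here is a toy
sketch in plain C. It is not code from the patch: ToyTuple, ToyChain,
toy_find_live() and toy_replay_update() are names I made up for
illustration, and a real heap has no such array of versions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t Xid;
#define INVALID_XID 0u          /* toy stand-in for "no xmax set" */
#define MAX_VERSIONS 8

/* Hypothetical, heavily simplified tuple version. */
typedef struct ToyTuple { Xid xmin; Xid xmax; int value; } ToyTuple;

/* Hypothetical version chain of one row in the new (repacked) table. */
typedef struct ToyChain { ToyTuple v[MAX_VERSIONS]; size_t n; } ToyChain;

/* Return the index of the version that has no xmax, or -1 if none. */
static int toy_find_live(const ToyChain *c)
{
    assert(c != NULL);
    for (size_t i = 0; i < c->n; i++)
        if (c->v[i].xmax == INVALID_XID)
            return (int) i;
    return -1;
}

/*
 * Replay one UPDATE: stamp the latest version with xmax = original xid
 * and append a new version with xmin = original xid.
 * Returns the index of the new version, or -1 on failure.
 */
static int toy_replay_update(ToyChain *c, Xid orig_xid, int new_value)
{
    int live = toy_find_live(c);

    if (live < 0 || c->n >= MAX_VERSIONS)
        return -1;
    c->v[live].xmax = orig_xid;
    c->v[c->n] = (ToyTuple){ orig_xid, INVALID_XID, new_value };
    return (int) c->n++;
}
```

The only point of the sketch is that the replay never consults a
snapshot: it picks the version without xmax and carries the original
xid over into both xmax and xmin.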
> However, at the time we're replaying the UPDATE in the new table, the tuple
> may have been already deleted from the old table, and the deleting transaction
> may already have committed. In such a case, HeapTupleSatisfiesSelf() will
> conclude the old version invisible and we'll fail to replay the UPDATE.
No, HeapTupleSatisfiesSelf() will still see the old version, because its
xmax will be empty in the repacked version of the table.
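As a rough illustration, here is a toy model of the check in plain C.
It is NOT the real HeapTupleSatisfiesSelf(): the struct, the
*_committed flags and toy_satisfies_self() are my own simplifications
(no hint bits, no multixacts, no lockers, no aborted-vs-in-progress
distinction), assumed only for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t Xid;
#define INVALID_XID 0u          /* toy stand-in for "no xmax set" */

/* Hypothetical, heavily simplified tuple header. */
typedef struct ToyTuple
{
    Xid  xmin;                  /* inserting transaction */
    bool xmin_committed;
    Xid  xmax;                  /* deleting transaction, or INVALID_XID */
    bool xmax_committed;
} ToyTuple;

/*
 * Toy "self" visibility: look only at the current state of xmin/xmax,
 * no snapshot is consulted at all.
 */
static bool toy_satisfies_self(const ToyTuple *t, Xid my_xid)
{
    assert(t != NULL);

    /* The inserter must be us or a committed transaction. */
    if (t->xmin != my_xid && !t->xmin_committed)
        return false;

    /* Nobody has deleted it: visible. */
    if (t->xmax == INVALID_XID)
        return true;

    /* Deleted by us or by a committed transaction: invisible. */
    if (t->xmax == my_xid || t->xmax_committed)
        return false;

    /* Deleter has not committed: still visible. */
    return true;
}
```

With xmin = 100 (committed), the copy in the old table that carries
xmax = 200 (committed) is invisible under this check, while the copy in
the new table, where xmax is still empty, is visible, so the UPDATE can
be replayed against it.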
From your question I feel you have understood the concept, but feel free
to ask for an additional explanation or a diagram.
Best regards,
Mikhail.