On Wed, 27 Aug 2025 14:44:55 +0100
Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
> On Sun, 24 Aug 2025 at 18:34, Yugo Nagata <nagata@sraoss.co.jp> wrote:
> >
> > I confirmed this issue by executing the following query concurrently
> > in three transactions. (With only two transactions, the issue does not occur.)
>
> Yes, I think 3 transactions are required to reproduce this (2 separate
> concurrent updates).
>
> > I don't completely understand how this race condition occurs,
> > but I believe the bug is due to the misuse of TM_FailureData
> > returned by table_tuple_lock in ExecMergeMatched().
> >
> > Currently, TM_FailureData.ctid is used as a reference to the
> > latest version of oldtuple, but this is not always correct.
> > Instead, the tupleid passed to table_tuple_lock should be used.
> >
> > I've attached a patch to fix this.
>
> Thanks. That makes sense.
>
> I think we also should update the isolation tests to test this.
> Attached is an update to the merge-match-recheck isolation test, doing
> so. As you found, it doesn't always seem to fail with the unpatched
> code (though I didn't look to see why), but with your patch, it always
> passes.

Thank you for your suggestion and the test patch. The test looks good
to me, so I’ve attached an updated patch including it.

Regards,
Yugo Nagata

--
Yugo Nagata <nagata@sraoss.co.jp>