Re: Conflict detection for update_deleted in logical replication - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Conflict detection for update_deleted in logical replication
Msg-id CAA4eK1+2tZ0rGowwpfmPQA03KdBOaeaK6D5omBN76UTP2EPx6w@mail.gmail.com
In response to Re: Conflict detection for update_deleted in logical replication  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: Conflict detection for update_deleted in logical replication
List pgsql-hackers
On Fri, Aug 1, 2025 at 4:20 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Aug 1, 2025 at 3:58 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > 4.
> > +   /*
> > +    * Instead of invoking GetOldestNonRemovableTransactionId() for conflict
> > +    * detection, we use the conflict detection slot.xmin. This value will be
> > +    * greater than or equal to the other threshold and provides a more direct
> > +    * and efficient way to identify recently deleted dead tuples relevant to
> > +    * the conflict detection. The oldest_nonremovable_xid is not used here,
> > +    * as it is maintained only by the leader apply worker and unavailable to
> > +    * table sync and parallel apply workers.
> > +    */
> > +   slot = SearchNamedReplicationSlot(CONFLICT_DETECTION_SLOT, true);
> >
> > This comment seems a bit confusing to me. Isn't it actually correct to
> > just use the conflict detection slot's xmin, even without any other
> > reasoning?
> >
>
> But it is *not* wrong to use GetOldestNonRemovableTransactionId()
> either, because it takes the conflict detection slot's xmin into
> account anyway. However, the value returned by that function could be
> much older, so the slot's xmin is a better choice. Similarly, it would
> be sufficient to use the apply worker's oldest_nonremovable_xid, and
> ideally that would be better than the slot's xmin because it would
> report update_deleted in fewer cases. However, we can't use it for the
> reasons mentioned in the comment. Do you think this comment needs
> improvement for clarity, and if so, do you have a proposal?
>

How about something like:
/*
 * For conflict detection, we use the conflict slot's xmin value instead of
 * invoking GetOldestNonRemovableTransactionId(). The slot.xmin acts as a
 * threshold to identify tuples that were recently deleted. These tuples are
 * not visible to concurrent transactions, but we log an update_deleted conflict
 * if such a tuple matches the remote update being applied.
 *
 * Although GetOldestNonRemovableTransactionId() can return a value older than
 * the slot's xmin, for our current purpose it is acceptable to treat tuples
 * deleted by transactions prior to slot.xmin as update_missing conflicts.
 *
 * Ideally, we would use oldest_nonremovable_xid, which is directly maintained
 * by the leader apply worker. However, this value is not available to table
 * synchronization or parallel apply workers, making slot.xmin a practical
 * alternative in those contexts.
 */
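
To make the effect of the threshold choice concrete, here is a rough
standalone sketch. It is only a model, not the patch code: the names
xid_precedes() and classify_missing_row() and the ConflictKind enum are
made up for illustration, and the comparison is a simplified version of
the usual wraparound-aware XID ordering that assumes normal XIDs only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative stand-in for the server's 32-bit transaction id type. */
typedef uint32_t TransactionId;

/* XIDs below this value are reserved; the sketch only handles normal XIDs. */
#define FIRST_NORMAL_XID ((TransactionId) 3)

/* Hypothetical result type, used only by this sketch. */
typedef enum
{
    CONFLICT_UPDATE_MISSING,
    CONFLICT_UPDATE_DELETED
} ConflictKind;

/*
 * Wraparound-aware "a is older than b" for normal XIDs: compare the
 * modulo-2^32 difference as a signed value.
 */
static bool
xid_precedes(TransactionId a, TransactionId b)
{
    return (int32_t) (a - b) < 0;
}

/*
 * Classify a remote UPDATE whose target row was not found locally.
 *
 * found_dead_tuple: a matching deleted tuple is still physically present.
 * deleter_xid:      the transaction that deleted that tuple.
 * threshold:        the xid used as the "recently deleted" cutoff.
 *
 * Deletions at or after the threshold count as recent and are reported as
 * update_deleted; anything older is treated as update_missing.
 */
static ConflictKind
classify_missing_row(bool found_dead_tuple, TransactionId deleter_xid,
                     TransactionId threshold)
{
    assert(threshold >= FIRST_NORMAL_XID);
    assert(deleter_xid >= FIRST_NORMAL_XID);

    if (found_dead_tuple && !xid_precedes(deleter_xid, threshold))
        return CONFLICT_UPDATE_DELETED;

    return CONFLICT_UPDATE_MISSING;
}
```

With made-up values, say the global horizon is 700, slot.xmin is 900 and
the leader's oldest_nonremovable_xid is 950. A tuple deleted by xid 800
would be update_deleted under the global horizon but update_missing under
slot.xmin, and one deleted by xid 920 would be update_deleted under
slot.xmin but update_missing under oldest_nonremovable_xid. So a newer
threshold reports update_deleted in fewer cases.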

--
With Regards,
Amit Kapila.


