Re: Conflict detection for update_deleted in logical replication - Mailing list pgsql-hackers

From Dilip Kumar
Subject Re: Conflict detection for update_deleted in logical replication
Date
Msg-id CAFiTN-tk_fFUcCKDsgR7RUFGf=g=FPCbfxG=VdRv9n04+QK_tg@mail.gmail.com
Whole thread Raw
In response to Re: Conflict detection for update_deleted in logical replication  (Dilip Kumar <dilipbalaut@gmail.com>)
List pgsql-hackers
On Fri, Jul 4, 2025 at 4:48 PM Zhijie Hou (Fujitsu)
<houzj.fnst@fujitsu.com> wrote:
>
> On Wed, Jul 2, 2025 at 3:28 PM Hou, Zhijie wrote:
> > Kindly use the latest patch set for performance testing.
>
> During testing, we observed a limitation in cascading logical replication
> setups, such as (A -> B -> C). When retain_conflict_info is enabled on Node C,
> it may not retain information necessary for conflict detection when applying
> changes originally replicated from Node A. This happens because Node C only
> waits for locally originated changes on Node B to be applied before advancing
> the non-removable transaction ID.
>
> For example, Consider a logical replication setup as mentioned above : A -> B -> C.
>  - All three nodes have a table t1 with two tuples (1,1) (2,2).
>  - Node B subscribed to all changes of t1 from Node A
>  - Node-C subscribed to all changes from Node B.
>  - Subscriptions use the default origin=ANY, as this is not a bidirectional
>    setup.
>
> Now, consider two concurrent operations:
>   - @9:00 Node A - UPDATE (1,1) -> (1,11)
>
>   - @9:02 Node C - DELETE (1,1)
>
> Assume a slight delay at Node B before it applies the update from Node A.
>
>  @9:03 Node C - advances the non-removable XID because it sees no concurrent
>  transactions from Node B. It is unaware of Node A’s concurrent update.
>
>   @9:04 Node B - receives Node A's UPDATE and applies (1,1) -> (1,11)
>   t1 has tuples : (1,11), (2,2)
>
>   @9:05 Node C - receives the UPDATE (1,1) -> (1,11)
>     - As conflict slot’s xmin is advanced, the deleted tuple may already have
>       been removed.
>     - Conflict resolution fails to detect update_deleted and instead raises
>       update_missing.
>
> Note that, as per decoding logic Node C sees the commit timestamp of the update
> as 9:00 (origin commit_ts from Node A), not 9:04 (commit time on Node B). In
> this case, since the UPDATE's timestamp is earlier than the DELETE, Node C
> should ideally detect an update_deleted conflict. However, it cannot, because
> it no longer retains the deleted tuple.
>
> Even if Node C attempts to retrieve the latest WAL position from Node A, Node C
> doesn't maintain any LSN which we could use to compare with it.
>
> This scenario is similar to another restriction in the patch where
> retain_conflict_info is not supported if the publisher is also a physical
> standby, as the required transaction information from the original primary is
> unavailable. Moreover, this limitation is relevant only when the subscription
> origin option is set to ANY, as only in that case changes from other origins
> can be replicated. Since retain_conflict_info is primarily useful for conflict
> detection in bidirectional clusters where the origin option is set to NONE,
> this limitation appears acceptable.
>
> Given these findings, to help users avoid unintended configurations, we plan to
> issue a warning in scenarios where replicated changes may include origins other
> than the direct publisher, similar to the existing checks in the
> check_publications_origin() function.
>
> Here is the latest patch that implements the warning and documents
> this case. Only 0001 is modified for this.
>
> A big thanks to Nisha for invaluable assistance in identifying this
> case and preparing the analysis for it.

In this setup if we have A->B->C->A then after we implement conflict
resolution is it possible that node A will just left with (2,2),
because (1,11) will be deleted while applying the changes from Node C
whereas node C has detected the indirect conflicting update from Node
A as update missing and has inserted the row and it will left with
(1,11) and (2,2).  So can it cause divergence as I explained here, or
it will not?  If not then can you explain how?

--
Regards,
Dilip Kumar
Google



pgsql-hackers by date:

Previous
From: Alvaro Herrera
Date:
Subject: Re: [HACKERS] pg_upgrade failed with error - ERROR: column "a" in child table must be marked NOT NULL
Next
From: Peter Eisentraut
Date:
Subject: Re: Adding support for SSLKEYLOGFILE in the frontend