Re: Conflict Detection and Resolution - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Conflict Detection and Resolution
Date
Msg-id CAA4eK1JTMiBOoGqkt=aLPLU8Rs45ihbLhXaGHsz8XC76+OG3+Q@mail.gmail.com
Whole thread Raw
In response to Re: Conflict Detection and Resolution  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
Responses Re: Conflict Detection and Resolution
List pgsql-hackers
On Sat, May 25, 2024 at 2:39 AM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
>
> On 5/23/24 08:36, shveta malik wrote:
> >
> > Conflict Resolution
> > ----------------
> > a) latest_timestamp_wins:    The change with later commit timestamp wins.
> > b) earliest_timestamp_wins:   The change with earlier commit timestamp wins.
> > c) apply:   Always apply the remote change.
> > d) skip:    Remote change is skipped.
> > e) error:   Error out on conflict. Replication is stopped, manual
> > action is needed.
> >
>
> Why not to have some support for user-defined conflict resolution
> methods, allowing to do more complex stuff (e.g. merging the rows in
> some way, perhaps even with datatype-specific behavior)?
>
> > The change will be converted to 'UPDATE' and applied if the decision
> > is in favor of applying remote change.
> >
> > It is important to have commit timestamp info available on subscriber
> > when latest_timestamp_wins or earliest_timestamp_wins method is chosen
> > as resolution method.  Thus ‘track_commit_timestamp’ must be enabled
> > on subscriber, in absence of which, configuring the said
> > timestamp-based resolution methods will result in error.
> >
> > Note: If the user has chosen the latest or earliest_timestamp_wins,
> > and the remote and local timestamps are the same, then it will go by
> > system identifier. The change with a higher system identifier will
> > win. This will ensure that the same change is picked on all the nodes.
>
> How is this going to deal with the fact that commit LSN and timestamps
> may not correlate perfectly? That is, commits may happen with LSN1 <
> LSN2 but with T1 > T2.
>

One of the possible scenarios discussed at pgconf.dev with Tomas for
this was as follows:

Say there are two publisher nodes PN1, PN2, and subscriber node SN3.
The logical replication is configured such that a subscription on SN3
has publications from both PN1 and PN2. For example, SN3 (sub) -> PN1,
PN2 (p1, p2)

Now, on PN1, we have the following operations that update the same row:

T1
Update-1 on table t1 at LSN1 (1000) on time (200)

T2
Update-2 on table t1 at LSN2 (2000) on time (100)

Then in parallel, we have the following operation on node PN2 that
updates the same row as Update-1, and Update-2 on node PN1.

T3
Update-3 on table t1 at LSN(1500) on time (150)

By theory, we can have a different state on subscribers depending on
the order of updates arriving at SN3 which shouldn't happen. Say, the
order in which they reach SN3 is: Update-1, Update-2, Update-3 then
the final row we have is by Update-3 considering we have configured
last_update_wins as a conflict resolution method. Now, consider the
other order:  Update-1, Update-3, Update-2, in this case, the final
row will be by Update-2 because when we try to apply Update-3, it will
generate a conflict and as per the resolution method
(last_update_wins) we need to retain Update-1.

On further thinking, the operations on node-1 PN-1 as defined above
seem impossible because one of the Updates needs to wait for the other
to write a commit record. So the commits may happen with LSN1 < LSN2
but with T1 > T2 but they can't be on the same row due to locks. So,
the order of apply should still be consistent. Am, I missing
something?

--
With Regards,
Amit Kapila.



pgsql-hackers by date:

Previous
From: Aleksander Alekseev
Date:
Subject: Re: meson "experimental"?
Next
From: Shubham Khanna
Date:
Subject: Re: Pgoutput not capturing the generated columns