Re: Conflict Detection and Resolution - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Conflict Detection and Resolution
Date
Msg-id CAA4eK1Lp+EDApVBVq6cDLPFpdX8rV1LehXgTv0Et9MX_8fK8BQ@mail.gmail.com
Whole thread Raw
In response to Re: Conflict Detection and Resolution  (Masahiko Sawada <sawada.mshk@gmail.com>)
List pgsql-hackers
On Mon, Jul 1, 2024 at 11:47 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Thu, May 23, 2024 at 3:37 PM shveta malik <shveta.malik@gmail.com> wrote:
> >
> > DELETE
> > ================
> > Conflict Type:
> > ----------------
> > delete_missing: An incoming delete is trying to delete a row on a
> > target node which does not exist.
>
> IIUC the 'delete_missing' conflict doesn't cover the case where an
> incoming delete message is trying to delete a row that has already
> been updated locally or by another node. I think in update/delete
> conflict situations, we need to resolve the conflicts based on commit
> timestamps like we do for update/update and insert/update conflicts.
>
> For example, suppose there are two node-A and node-B and setup
> bi-directional replication, and suppose further that both have the row
> with id = 1, consider the following sequences:
>
> 09:00:00  DELETE ... WHERE id = 1 on node-A.
> 09:00:05  UPDATE ... WHERE id = 1 on node-B.
> 09:00:10  node-A received the update message from node-B.
> 09:00:15  node-B received the delete message from node-A.
>
> At 09:00:10 on node-A, an update_deleted conflict is generated since
> the row on node-A is already deleted locally. Suppose that we use
> 'apply_or_skip' resolution for this conflict, we convert the update
> message into an insertion, so node-A now has the row with id = 1. At
> 09:00:15 on node-B, the incoming delete message is applied and deletes
> the row with id = 1, even though the row has already been modified
> locally. The node-A and node-B are now inconsistent. This
> inconsistency can be avoided by using 'skip' resolution for the
> 'update_deleted' conflict on node-A, and 'skip' resolution is the
> default method for that actually. However, if we handle it as
> 'update_missing', the 'apply_or_skip' resolution is used by default.
>
> IIUC with the proposed architecture, DELETE always takes precedence
> over UPDATE since both 'update_deleted' and 'update_missing' don't use
> commit timestamps to resolve the conflicts. As long as that is true, I
> think there is no use case for 'apply_or_skip' and 'apply_or_error'
> resolutions in update/delete conflict cases. In short, I think we need
> something like 'delete_differ' conflict type as well. FYI PGD and
> Oracle GoldenGate seem to have this conflict type[1][2].
>

Your explanation makes sense to me and I agree that we should
implement 'delete_differ' conflict type.

> The 'delete'_differ' conflict type would have at least
> 'latest_timestamp_wins' resolution. With the timestamp based
> resolution method, we would deal with update/delete conflicts as
> follows:
>
> 09:00:00: DELETE ... WHERE id = 1 on node-A.
> 09:00:05: UPDATE ... WHERE id = 1 on node-B.
>     - the updated row doesn't have the origin since it's a local change.
> 09:00:10: node-A received the update message from node-B.
>     - the incoming update message has the origin of node-B whereas the
> local row is already removed locally.
>     - 'update_deleted' conflict is generated.
>

FYI, as of now, we don't have a reliable way to detect
'update_deleted' type of conflicts but we had some discussion about
the same [1].

>     - do the insert of the new row instead, because the commit
> timestamp of UPDATE is newer than DELETE's one.
> 09:00:15: node-B received the delete message from node-A.
>     - the incoming delete message has the origin of node-B whereas the
> (updated) row doesn't have the origin.
>     - 'update_differ' conflict is generated.
>     - discard DELETE, because the commit timestamp of UPDATE is newer
> than DELETE' one.ard DELETE, because the commit timestamp of UPDATE is
> newer than DELETE' one.
>
> As a result, both nodes have the new version row.
>

Right, it seems to me that we should implement 'latest_time_wins' if
we want consistency in such cases.

[1] - https://www.postgresql.org/message-id/CAA4eK1Lj-PWrP789KnKxZydisHajd38rSihWXO8MVBLDwxG1Kg%40mail.gmail.com

--
With Regards,
Amit Kapila.



pgsql-hackers by date:

Previous
From: Daniel Gustafsson
Date:
Subject: Re: Converting README documentation to Markdown
Next
From: Amit Kapila
Date:
Subject: Re: Conflict Detection and Resolution