Re: Conflict Detection and Resolution - Mailing list pgsql-hackers

From shveta malik
Subject Re: Conflict Detection and Resolution
Date
Msg-id CAJpy0uC6Zs5WwwiyuvG_kEB6Q3wyDWpya7PXm3SMT_YG=XJJ1w@mail.gmail.com
Whole thread Raw
In response to Re: Conflict Detection and Resolution  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
Responses Re: Conflict Detection and Resolution
List pgsql-hackers
On Fri, Jun 7, 2024 at 6:08 PM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
>
> >>>
> >>>  UPDATE
> >>> ================
> >>>
> >>> Conflict Detection Method:
> >>> --------------------------------
> >>> Origin conflict detection: The ‘origin’ info is used to detect
> >>> conflict which can be obtained from commit-timestamp generated for
> >>> incoming txn at the source node. To compare remote’s origin with the
> >>> local’s origin, we must have origin information for local txns as well
> >>> which can be obtained from commit-timestamp after enabling
> >>> ‘track_commit_timestamp’ locally.
> >>> The one drawback here is the ‘origin’ information cannot be obtained
> >>> once the row is frozen and the commit-timestamp info is removed by
> >>> vacuum. For a frozen row, conflicts cannot be raised, and thus the
> >>> incoming changes will be applied in all the cases.
> >>>
> >>> Conflict Types:
> >>> ----------------
> >>> a) update_differ: The origin of an incoming update's key row differs
> >>> from the local row i.e.; the row has already been updated locally or
> >>> by different nodes.
> >>> b) update_missing: The row with the same value as that incoming
> >>> update's key does not exist. Remote is trying to update a row which
> >>> does not exist locally.
> >>> c) update_deleted: The row with the same value as that incoming
> >>> update's key does not exist. The row is already deleted. This conflict
> >>> type is generated only if the deleted row is still detectable i.e., it
> >>> is not removed by VACUUM yet. If the row is removed by VACUUM already,
> >>> it cannot detect this conflict. It will detect it as update_missing
> >>> and will follow the default or configured resolver of update_missing
> >>> itself.
> >>>
> >>
> >> I don't understand the why should update_missing or update_deleted be
> >> different, especially considering it's not detected reliably. And also
> >> that even if we happen to find the row the associated TOAST data may
> >> have already been removed. So why would this matter?
> >
> > Here, we are trying to tackle the case where the row is 'recently'
> > deleted i.e. concurrent UPDATE and DELETE on pub and sub. User may
> > want to opt for a different resolution in such a case as against the
> > one where the corresponding row was not even present in the first
> > place. The case where the row was deleted long back may not fall into
> > this category as there are higher chances that they have been removed
> > by vacuum and can be considered equivalent to the update_ missing
> > case.
> >
>
> My point is that if we can't detect the difference reliably, it's not
> very useful. Consider this example:
>
> Node A:
>
>   T1: INSERT INTO t (id, value) VALUES (1,1);
>
>   T2: DELETE FROM t WHERE id = 1;
>
> Node B:
>
>   T3: UPDATE t SET value = 2 WHERE id = 1;
>
> The "correct" order of received messages on a third node is T1-T3-T2.
> But we may also see T1-T2-T3 and T3-T1-T2, e.g. due to network issues
> and so on. For T1-T2-T3 the right decision is to discard the update,
> while for T3-T1-T2 it's to either wait for the INSERT or wait for the
> insert to arrive.
>
> But if we misdetect the situation, we either end up with a row that
> shouldn't be there, or losing an update.

Doesn't the above example indicate that 'update_deleted' should also
be considered a necessary conflict type?  Please see the possibilities
of conflicts in all three cases:


The "correct" order of receiving messages on node C (as suggested
above) is T1-T3-T2   (case1)
----------
T1 will insert the row.
T3 will have update_differ conflict; latest_timestamp wins or apply
will apply it. earliest_timestamp_wins or skip will skip it.
T2 will delete the row (irrespective of whether the update happened or not).
End Result: No Data.

T1-T2-T3
----------
T1 will insert the row.
T2 will delete the row.
T3 will have conflict update_deleted. If it is 'update_deleted', the
chances are that the resolver set here is to 'skip' (default is also
'skip' in this case).

If vacuum has deleted that row (or if we don't support
'update_deleted' conflict), it will be 'update_missing' conflict. In
that case, the user may end up inserting the row if resolver chosen is
in favor of apply (which seems an obvious choice for 'update_missing'
conflict; default is also 'apply_or_skip').

End result:
Row inserted with 'update_missing'.
Row correctly skipped with 'update_deleted' (assuming the obvious
choice seems to be 'skip' for update_deleted case).

So it seems that with 'update_deleted' conflict, there are higher
chances of opting for right decision here (which is to discard the
update), as 'update_deleted' conveys correct info to the user.  The
'update_missing' OTOH does not convey correct info and user may end up
inserting the data by choosing apply favoring resolvers for
'update_missing'. Again, we get benefit of 'update_deleted' for
*recently* deleted rows only.

T3-T1-T2
----------
T3 may end up inserting the record if the resolver is in favor of
'apply' and all the columns are received from remote.
T1 will have' insert_exists' conflict and thus may either overwrite
'updated' values or may leave the data as is (based on whether
resolver is in favor of apply or not)
T2 will end up deleting it.
End Result: No Data.

I feel for second case (and similar cases), 'update_deleted' serves a
better conflict type.

thanks
Shveta



pgsql-hackers by date:

Previous
From: Ilia Evdokimov
Date:
Subject: list_free in addRangeTableEntryForJoin
Next
From: shveta malik
Date:
Subject: Re: Conflict Detection and Resolution