Re: Conflict Detection and Resolution - Mailing list pgsql-hackers

From Nisha Moond
Subject Re: Conflict Detection and Resolution
Date
Msg-id CABdArM43epV60iedut8Ee7CoM1nYq+AM0ZCejb_0ne3YV2JobQ@mail.gmail.com
Whole thread Raw
In response to Re: Conflict Detection and Resolution  (shveta malik <shveta.malik@gmail.com>)
List pgsql-hackers
On Fri, Aug 23, 2024 at 10:39 AM shveta malik <shveta.malik@gmail.com> wrote:
>
> On Thu, Aug 22, 2024 at 3:44 PM shveta malik <shveta.malik@gmail.com> wrote:
> >
> >
> > For clock-skew and timestamp based resolution, if needed, I will post
> > another email for the design items where suggestions are needed.
> >
>
> Please find issues which need some thoughts and approval for
> time-based resolution and clock-skew.
>
> 1)
> Time based conflict resolution and two phase transactions:
>
> Time based conflict resolution (last_update_wins) is the one
> resolution which will not result in data-divergence considering
> clock-skew is taken care of. But when it comes to two-phase
> transactions, it might not be the case. For two-phase transaction, we
> do not have commit timestamp when the changes are being applied. Thus
> for time-based comparison, initially it was decided to user prepare
> timestamp but it may result in data-divergence.  Please see the
> example at [1].
>
> Example at [1] is a tricky situation, and thus in the initial draft,
> we decided to restrict usage of 2pc and CDR together. The plan is:
>
> a) During Create subscription, if the user has given last_update_wins
> resolver for any conflict_type and 'two_phase' is also enabled, we
> ERROR out.
> b) During Alter subscription, if the user tries to update resolver to
> 'last_update_wins' but 'two_phase' is enabled, we error out.
>
> Another solution could be to save both prepare_ts and commit_ts. And
> when any txn comes for conflict resolution, we first check if
> prepare_ts is available, use that else use commit_ts.  Availability of
> prepare_ts would indicate it was a prepared txn and thus even if it is
> committed, we should use prepare_ts for comparison for consistency.
> This will have some overhead of storing prepare_ts along with
> commit_ts. But if the number of prepared txns are reasonably small,
> this overhead should be less.
>
> We currently plan to go with restricting 2pc and last_update_wins
> together, unless others have different opinions.
>
> ~~
>
> 2)
> parallel apply worker and conflict-resolution:
> As discussed in [2] (see last paragraph in [2]), for streaming of
> in-progress transactions by parallel worker, we do not have
> commit-timestamp with each change and thus it makes sense to disable
> parallel apply worker with CDR. The plan is to not start parallel
> apply worker if 'last_update_wins' is configured for any
> conflict_type.
>
>  ~~
>
> 3)
> parallel apply worker and clock skew management:
> Regarding clock-skew management as discussed in [3], we will wait for
> the local clock to come within tolerable range during 'begin' rather
> than before 'commit'. And this wait needs commit-timestamp in the
> beginning, thus we plan to restrict starting pa-worker even when
> clock-skew related GUCs are configured.
>
> Earlier we had restricted both 2pc and parallel worker worker start
> when detect_conflict was enabled, but now since detect_conflict
> parameter is removed, we will change the implementation to restrict
> all 3 above cases when last_update_wins is configured. When the
> changes are done, we will post the patch.
>
> ~~
>
> 4)
> <not related to timestamp and clock skew>
> Earlier when 'detect_conflict' was enabled, we were giving WARNING if
> 'track_commit_timestamp' was not enabled. This was during CREATE and
> ALTER subscription. Now with this parameter removed, this WARNING has
> also been removed. But I think we need to bring back this WARNING.
> Currently default resolvers set may work without
> 'track_commit_timestamp' but when user gives CONFLICT RESOLVER in
> create-sub or alter-sub explicitly making them configured to
> non-default values (or say any values, does not matter if few are
> defaults), we may still emit this warning to alert user:
>
> 2024-07-26 09:14:03.152 IST [195415] WARNING:  conflict detection
> could be incomplete due to disabled track_commit_timestamp
> 2024-07-26 09:14:03.152 IST [195415] DETAIL:  Conflicts update_differ
> and delete_differ cannot be detected, and the origin and commit
> timestamp for the local row will not be logged.
>
> Thoughts?
>
> If we emit this WARNING during each resolution, then it may flood our
> log files, thus it seems better to emit it during create or alter
> subscription instead of during resolution.
>
>
Done.
v10 has implemented the suggested warning when a user gives CONFLICT
RESOLVER in create-sub or alter-sub explicitly.

Thanks,
Nisha



pgsql-hackers by date:

Previous
From: Nisha Moond
Date:
Subject: Re: Conflict Detection and Resolution
Next
From: Amit Kapila
Date:
Subject: Re: Doc: fix the note related to the GUC "synchronized_standby_slots"