Conflict Detection and Resolution - Mailing list pgsql-hackers

From shveta malik
Subject Conflict Detection and Resolution
Date
Msg-id CAJpy0uD0-DpYVMtsxK5R=zszXauZBayQMAYET9sWr_w0CNWXxQ@mail.gmail.com
Whole thread Raw
Responses Re: Conflict Detection and Resolution
Re: Conflict Detection and Resolution
Re: Conflict Detection and Resolution
List pgsql-hackers
Hello hackers,

Please find the proposal for Conflict Detection and Resolution (CDR)
for Logical replication.
<Thanks to Nisha, Hou-San, and Amit who helped in figuring out the
below details.>

Introduction
================
In case the node is subscribed to multiple providers, or when local
writes happen on a subscriber, conflicts can arise for the incoming
changes.  CDR is the mechanism to automatically detect and resolve
these conflicts depending on the application and configurations.
CDR is not applicable for the initial table sync. If locally, there
exists conflicting data on the table, the table sync worker will fail.
Please find the details on CDR in apply worker for INSERT, UPDATE and
DELETE operations:

INSERT
================
To resolve INSERT conflict on subscriber, it is important to find out
the conflicting row (if any) before we attempt an insertion. The
indexes or search preference for the same will be:
First check for replica identity (RI) index.
  - if not found, check for the primary key (PK) index.
    - if not found, then check for unique indexes (individual ones or
added by unique constraints)
         - if unique index also not found, skip CDR

Note: if no RI index, PK, or unique index is found but
REPLICA_IDENTITY_FULL is defined, CDR will still be skipped.
The reason being that even though a row can be identified with
REPLICAT_IDENTITY_FULL, such tables are allowed to have duplicate
rows. Hence, we should not go for conflict detection in such a case.

In case of replica identity ‘nothing’ and in absence of any suitable
index (as defined above), CDR will be skipped for INSERT.

Conflict Type:
----------------
insert_exists: A conflict is detected when the table has the same
value for a key column as the new value in the incoming row.

Conflict Resolution
----------------
a) latest_timestamp_wins:    The change with later commit timestamp wins.
b) earliest_timestamp_wins:   The change with earlier commit timestamp wins.
c) apply:   Always apply the remote change.
d) skip:    Remote change is skipped.
e) error:   Error out on conflict. Replication is stopped, manual
action is needed.

The change will be converted to 'UPDATE' and applied if the decision
is in favor of applying remote change.

It is important to have commit timestamp info available on subscriber
when latest_timestamp_wins or earliest_timestamp_wins method is chosen
as resolution method.  Thus ‘track_commit_timestamp’ must be enabled
on subscriber, in absence of which, configuring the said
timestamp-based resolution methods will result in error.

Note: If the user has chosen the latest or earliest_timestamp_wins,
and the remote and local timestamps are the same, then it will go by
system identifier. The change with a higher system identifier will
win. This will ensure that the same change is picked on all the nodes.

 UPDATE
================

Conflict Detection Method:
--------------------------------
Origin conflict detection: The ‘origin’ info is used to detect
conflict which can be obtained from commit-timestamp generated for
incoming txn at the source node. To compare remote’s origin with the
local’s origin, we must have origin information for local txns as well
which can be obtained from commit-timestamp after enabling
‘track_commit_timestamp’ locally.
The one drawback here is the ‘origin’ information cannot be obtained
once the row is frozen and the commit-timestamp info is removed by
vacuum. For a frozen row, conflicts cannot be raised, and thus the
incoming changes will be applied in all the cases.

Conflict Types:
----------------
a) update_differ: The origin of an incoming update's key row differs
from the local row i.e.; the row has already been updated locally or
by different nodes.
b) update_missing: The row with the same value as that incoming
update's key does not exist. Remote is trying to update a row which
does not exist locally.
c) update_deleted: The row with the same value as that incoming
update's key does not exist. The row is already deleted. This conflict
type is generated only if the deleted row is still detectable i.e., it
is not removed by VACUUM yet. If the row is removed by VACUUM already,
it cannot detect this conflict. It will detect it as update_missing
and will follow the default or configured resolver of update_missing
itself.

 Conflict Resolutions:
----------------
a)   latest_timestamp_wins:    The change with later commit timestamp
wins. Can be used for ‘update_differ’.
b)   earliest_timestamp_wins:   The change with earlier commit
timestamp wins. Can be used for ‘update_differ’.
c)   apply:   The remote change is always applied.  Can be used for
‘update_differ’.
d)   apply_or_skip: Remote change is converted to INSERT and is
applied. If the complete row cannot be constructed from the info
provided by the publisher, then the change is skipped. Can be used for
‘update_missing’ or ‘update_deleted’.
e)   apply_or_error: Remote change is converted to INSERT and is
applied. If the complete row cannot be constructed from the info
provided by the publisher, then error is raised. Can be used for
‘update_missing’ or ‘update_deleted’.
f) skip: Remote change is skipped and local one is retained. Can be
used for any conflict type.
g)   error: Error out of conflict. Replication is stopped, manual
action is needed.  Can be used for any conflict type.

 To support UPDATE CDR, the presence of either replica identity Index
or primary key is required on target node.  Update CDR will not be
supported in absence of replica identity index or primary key even
though REPLICA IDENTITY FULL is set.  Please refer to "UPDATE" in
"Noteworthey Scenarios" section in [1] for further details.

DELETE
================
Conflict Type:
----------------
delete_missing: An incoming delete is trying to delete a row on a
target node which does not exist.

Conflict Resolutions:
----------------
a)   error : Error out on conflict. Replication is stopped, manual
action is needed.
b)   skip  : The remote change is skipped.

Configuring Conflict Resolution:
------------------------------------------------
There are two parts when it comes to configuring CDR:

a) Enabling/Disabling conflict detection.
b) Configuring conflict resolvers for different conflict types.

 Users can sometimes create multiple subscriptions on the same node,
subscribing to different tables to improve replication performance by
starting multiple apply workers. If the tables in one subscription are
less likely to cause conflict, then it is possible that user may want
conflict detection disabled for that subscription to avoid detection
latency while enabling it for other subscriptions.  This generates a
requirement to make ‘conflict detection’ configurable per
subscription. While the conflict resolver configuration can remain
global. All the subscriptions which opt for ‘conflict detection’ will
follow global conflict resolver configuration.

To implement the above, subscription commands will be changed to have
one more parameter 'conflict_resolution=on/off', default will be OFF.

To configure global resolvers, new DDL command will be introduced:

CONFLICT RESOLVER ON <conflict_type> IS <conflict_resolver>

-------------------------

Apart from the above three main operations and resolver configuration,
there are more conflict types like primary-key updates, multiple
unique constraints etc and some special scenarios to be considered.
Complete design details can be found in [1].

[1]: https://wiki.postgresql.org/wiki/Conflict_Detection_and_Resolution

thanks
Shveta



pgsql-hackers by date:

Previous
From: Pavel Stehule
Date:
Subject: Re: Schema variables - new implementation for Postgres 15
Next
From: Amit Kapila
Date:
Subject: Re: State of pg_createsubscriber