Re: Conflict detection for multiple_unique_conflicts in logical replication - Mailing list pgsql-hackers

From Dilip Kumar
Subject Re: Conflict detection for multiple_unique_conflicts in logical replication
Date
Msg-id CAFiTN-vxAo543mMG_xNr4rwPbydRN0T4UsfExGeTDjmX1Rq0Ww@mail.gmail.com
Whole thread Raw
In response to Conflict detection for multiple_unique_conflicts in logical replication  (Nisha Moond <nisha.moond412@gmail.com>)
Responses Re: Conflict detection for multiple_unique_conflicts in logical replication
List pgsql-hackers
On Thu, Feb 20, 2025 at 5:01 PM Nisha Moond <nisha.moond412@gmail.com> wrote:
>
> Hi Hackers,
> (CCing people involved in related discussions)
>
> I am starting this thread to propose a new conflict detection type
> "multiple_unique_conflicts" that identifies when an incoming row
> during logical replication violates more than one UNIQUE constraint.
> If multiple constraints (such as the primary key/replica identity
> along with additional unique indexes) are violated for the same
> incoming tuple, the apply_worker will trigger a dedicated
> multiple_unique_conflicts conflict.
>
> ----
> ISSUE:
> ----
> Currently, when the apply_worker encounters an 'insert_exists'
> conflict (where the incoming row conflicts with an existing row based
> on the replica identity), it logs the conflict and errors out
> immediately. If the user tries to resolve this manually by deleting
> the conflicting row, the apply worker may fail again due to another
> unique constraint violation on a different column. This forces users
> to fix conflicts one by one, making resolution tedious and
> inefficient.
>
> Example:
> Schema:
> CREATE TABLE tab1 (col1 integer PRIMARY KEY, col2 integer UNIQUE, col3
> integer UNIQUE);
>   - col1 is Replica Identity.
>
> Data:
>  - on pub: (1, 11, 111)
>  - on sub: 3 additional local Inserts: (2, 22, 222); (3, 33, 333); (4, 44, 444)
>  - Concurrently on pub, new insert: (2, 33, 444)
>
> When the new incoming tuple (2, 33, 444) is applied on the subscriber:
>  - The apply worker first detects an 'insert_exists' conflict on col1
> (primary key) and errors out.
>  - If the user deletes the conflicting row (key col1=2) : (2, 22,
> 222), the apply worker fails again because col2=33 violates another
> unique constraint for tuple (3, 33, 333);
>  - If the user deletes col2=33, the apply worker fails yet again due
> to tuple (4, 44, 444) because col3=444 is also unique.
> Conflicts on both col2 and col3 (which are independent of each other)
> are an example of a 'Multiple Unique Constraints' violation.
>
> In such cases, users are forced to resolve conflicts one by one,
> making the process slow and error-prone.
>
> ---
> SOLUTION:
> ---
> During an INSERT or UPDATE conflict check, instead of stopping at the
> first encountered conflict, the apply_worker will now check all unique
> indexes before reporting a conflict. If multiple unique key violations
> are found, it will report a 'multiple_unique_conflicts' conflict,
> listing all conflicting tuples in the logs. If only a single key
> conflict is detected, the existing 'insert_exists' conflict will be
> raised as it is now.

I think it makes sense to report all the unique key conflicts for a
single row at once, rather than stopping after the first one. However,
I don't understand the need to create a new conflict type. Can't we
report multiple conflicts for the row using the existing conflict
type, if different columns in the incoming row conflict with different
existing rows?

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com



pgsql-hackers by date:

Previous
From: Naga Appani
Date:
Subject: Re: [Proposal] Expose internal MultiXact member count function for efficient monitoring
Next
From: Junwang Zhao
Date:
Subject: Re: SQL Property Graph Queries (SQL/PGQ)