Re: Conflict detection and logging in logical replication - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Conflict detection and logging in logical replication
Date
Msg-id CAA4eK1Lmu=oVySfGjxEUykCT3FPnL1YFDHKr1ZMwFy7WUgfc6g@mail.gmail.com
Whole thread Raw
In response to Re: Conflict detection and logging in logical replication  (shveta malik <shveta.malik@gmail.com>)
Responses Re: Conflict detection and logging in logical replication
List pgsql-hackers
On Fri, Jul 26, 2024 at 3:37 PM shveta malik <shveta.malik@gmail.com> wrote:
>
> On Fri, Jul 26, 2024 at 3:03 PM Nisha Moond <nisha.moond412@gmail.com> wrote:
> >
> > On Thu, Jul 25, 2024 at 12:04 PM Zhijie Hou (Fujitsu)
> > <houzj.fnst@fujitsu.com> wrote:
> > > Here is the V6 patch set which addressed Shveta and Nisha's comments
> > > in [1][2][3][4].
> >
> > Thanks for the patch.
> > I tested the v6-0001 patch with partition table scenarios. Please
> > review the following scenario where Pub updates a tuple, causing it to
> > move from one partition to another on Sub.
> >
> > Setup:
> > Pub:
> >  create table tab (a int not null, b int not null);
> >  alter table tab add constraint tab_pk primary key (a,b);
> > Sub:
> >  create table tab (a int not null, b int not null) partition by range (b);
> >  alter table tab add constraint tab_pk primary key (a,b);
> >  create table tab_1 partition of tab FOR values from (MINVALUE) TO (100);
> >  create table tab_2 partition of tab FOR values from (101) TO (MAXVALUE);
> >
> > Test:
> >  Pub: insert into tab values (1,1);
> >  Sub: update tab set a=1 where a=1;  > just to make it Sub's origin
> >  Sub: insert into tab values (1,101);
> >  Pub: update b=101 where b=1; --> Both 'update_differ' and
> > 'insert_exists' are detected.
> >
> > For non-partitioned tables, a similar update results in
> > 'update_differ' and 'update_exists' conflicts. After detecting
> > 'update_differ', the apply worker proceeds to apply the remote update
> > and if a tuple with the updated key already exists, it raises
> > 'update_exists'.
> > This same behavior is expected for partitioned tables too.
>
> Good catch. Yes, from the user's perspective, an update_* conflict
> should be raised when performing an update operation. But internally
> since we are deleting from one partition and inserting to another, we
> are hitting insert_exist. To convert this insert_exist to udpate_exist
> conflict, perhaps we need to change insert-operation to
> update-operation as the default resolver is 'always apply update' in
> case of update_differ.
>

But we already document that behind the scenes such an update is a
DELETE+INSERT operation [1]. Also, all the privilege checks or before
row triggers are of type insert, so, I think it is okay to consider
this as insert_exists conflict and document it. Later, resolver should
also fire for insert_exists conflict.

One more thing we need to consider is whether we should LOG or ERROR
for update/delete_differ conflicts. If we LOG as the patch is doing
then we are intentionally overwriting the row when the user may not
expect it. OTOH, without a patch anyway we are overwriting, so there
is an argument that logging by default is what the user will expect.
What do you think?

[1] - https://www.postgresql.org/docs/devel/sql-update.html (See ...
Behind the scenes, the row movement is actually a DELETE and INSERT
operation.)

--
With Regards,
Amit Kapila.



pgsql-hackers by date:

Previous
From: shveta malik
Date:
Subject: Re: Conflict detection and logging in logical replication
Next
From: shveta malik
Date:
Subject: Re: Conflict detection and logging in logical replication