Re: Conflict detection and logging in logical replication - Mailing list pgsql-hackers

From shveta malik
Subject Re: Conflict detection and logging in logical replication
Date
Msg-id CAJpy0uBP8ubVix9vHhJiHOt37mbc2fsET6AdiRHq_nH_TGP1mA@mail.gmail.com
Whole thread Raw
In response to RE: Conflict detection and logging in logical replication  ("Zhijie Hou (Fujitsu)" <houzj.fnst@fujitsu.com>)
Responses RE: Conflict detection and logging in logical replication
List pgsql-hackers
On Thu, Jul 11, 2024 at 7:47 AM Zhijie Hou (Fujitsu)
<houzj.fnst@fujitsu.com> wrote:
>
> On Wednesday, July 10, 2024 5:39 PM shveta malik <shveta.malik@gmail.com> wrote:
> >
> > On Wed, Jul 3, 2024 at 8:31 AM Zhijie Hou (Fujitsu) <houzj.fnst@fujitsu.com>
> > wrote:
> > >
> > > On Wednesday, June 26, 2024 10:58 AM Zhijie Hou (Fujitsu)
> > <houzj.fnst@fujitsu.com> wrote:
> > > >
> > >
> > > Hi,
> > >
> > > As suggested by Sawada-san in another thread[1].
> > >
> > > I am attaching the V4 patch set which tracks the delete_differ
> > > conflict in logical replication.
> > >
> > > delete_differ means that the replicated DELETE is deleting a row that
> > > was modified by a different origin.
> > >
> >
> > Thanks for the patch. I am still in process of review but please find few
> > comments:
>
> Thanks for the comments!
>
> > 1) When I try to *insert* primary/unique key on pub, which already exists on
> > sub, conflict gets detected. But when I try to *update* primary/unique key to a
> > value on pub which already exists on sub, conflict is not detected. I get the
> > error:
> >
> > 2024-07-10 14:21:09.976 IST [647678] ERROR:  duplicate key value violates
> > unique constraint "t1_pkey"
> > 2024-07-10 14:21:09.976 IST [647678] DETAIL:  Key (pk)=(4) already exists.
>
> Yes, I think the detection of this conflict is not added with the
> intention to control the size of the patch in the first version.
>
> >
> > This is because such conflict detection needs detection of constraint violation
> > using the *new value* rather than *existing* value during UPDATE. INSERT
> > conflict detection takes care of this case i.e. the columns of incoming row are
> > considered as new values and it tries to see if all unique indexes are okay to
> > digest such new values (all incoming columns) but update's logic is different.
> > It searches based on oldTuple *only* and thus above detection is missing.
>
> I think the logic is the same if we want to detect the unique violation
> for UDPATE, we need to check if the new value of the UPDATE violates any
> unique constraints same as the detection of insert_exists (e.g. check
> the conflict around ExecInsertIndexTuples())
>
> >
> > Shall we support such detection? If not, is it worth docuementing?
>
> I am personally OK to support this detection.

+1. I think it should not be a complex or too big change.

> And
> I think it's already documented that we only detect unique violation for
> insert which mean update conflict is not detected.
>
> > 2)
> > Another case which might confuse user:
> >
> > CREATE TABLE t1 (pk integer primary key, val1 integer, val2 integer);
> >
> > On PUB: insert into t1 values(1,10,10); insert into t1 values(2,20,20);
> >
> > On SUB: update t1 set pk=3 where pk=2;
> >
> > Data on PUB: {1,10,10}, {2,20,20}
> > Data on SUB: {1,10,10}, {3,20,20}
> >
> > Now on PUB: update t1 set val1=200 where val1=20;
> >
> > On Sub, I get this:
> > 2024-07-10 14:44:00.160 IST [648287] LOG:  conflict update_missing detected
> > on relation "public.t1"
> > 2024-07-10 14:44:00.160 IST [648287] DETAIL:  Did not find the row to be
> > updated.
> > 2024-07-10 14:44:00.160 IST [648287] CONTEXT:  processing remote data for
> > replication origin "pg_16389" during message type "UPDATE" for replication
> > target relation "public.t1" in transaction 760, finished at 0/156D658
> >
> > To user, it could be quite confusing, as val1=20 exists on sub but still he gets
> > update_missing conflict and the 'DETAIL' is not sufficient to give the clarity. I
> > think on HEAD as well (have not tested), we will get same behavior i.e. update
> > will be ignored as we make search based on RI (pk in this case). So we are not
> > worsening the situation, but now since we are detecting conflict, is it possible
> > to give better details in 'DETAIL' section indicating what is actually missing?
>
> I think It's doable to report the row value that cannot be found in the local
> relation, but the concern is the potential risk of exposing some
> sensitive data in the log. This may be OK, as we are already reporting the
> key value for constraints violation, so if others also agree, we can add
> the row value in the DETAIL as well.

Okay, let's see what others say. JFYI, the same situation holds valid
for delete_missing case.

I have one concern about how we deal with conflicts. As for
insert_exists, we keep on erroring out while raising conflict, until
it is manually resolved:
ERROR:  conflict insert_exists detected

But for other cases, we just log conflict and either skip or apply the
operation. I
LOG:  conflict update_differ detected
DETAIL:  Updating a row that was modified by a different origin

I know that it is no different than HEAD. But now since we are logging
conflicts explicitly, we should call out default behavior on each
conflict. I see some incomplete and scattered info in '31.5.
Conflicts' section saying that:
 "When replicating UPDATE or DELETE operations, missing data will not
produce a conflict and such operations will simply be skipped."
(lets say it as pt a)

Also some more info in a later section saying (pt b):
:A conflict will produce an error and will stop the replication; it
must be resolved manually by the user."

My suggestions:
1) in point a above, shall we have:
missing data or differing data (i.e. somehow reword to accommodate
update_differ and delete_differ cases)

2) Now since we have a section explaining conflicts detected and
logged with detect_conflict=true, shall we mention default behaviour
with each?

insert_exists: error will be raised until resolved manually.
update_differ: update will be applied
update_missing: update will be skipped
delete_missing: delete will be skipped
delete_differ: delete will be applied.

thanks
Shveta



pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: relfilenode statistics
Next
From: "cca5507"
Date:
Subject: Redundant syscache access in get_rel_sync_entry()