Re: Conflict detection and logging in logical replication - Mailing list pgsql-hackers
From | shveta malik |
---|---|
Subject | Re: Conflict detection and logging in logical replication |
Date | |
Msg-id | CAJpy0uBtGwj+_8QTQZDi8ZW=eYTbRbK3Pt2MCMX2hzCxf==4Vw@mail.gmail.com |
In response to | RE: Conflict detection and logging in logical replication ("Zhijie Hou (Fujitsu)" <houzj.fnst@fujitsu.com>) |
Responses |
RE: Conflict detection and logging in logical replication
Re: Conflict detection and logging in logical replication |
List | pgsql-hackers |
On Wed, Jul 3, 2024 at 8:31 AM Zhijie Hou (Fujitsu) <houzj.fnst@fujitsu.com> wrote:
>
> On Wednesday, June 26, 2024 10:58 AM Zhijie Hou (Fujitsu) <houzj.fnst@fujitsu.com> wrote:
> >
>
> Hi,
>
> As suggested by Sawada-san in another thread[1].
>
> I am attaching the V4 patch set which tracks the delete_differ
> conflict in logical replication.
>
> delete_differ means that the replicated DELETE is deleting a row
> that was modified by a different origin.
>

Thanks for the patch. I am still in the process of reviewing it, but please find a few comments:

1) When I try to *insert* a primary/unique key on pub which already exists on sub, the conflict gets detected. But when I try to *update* a primary/unique key on pub to a value which already exists on sub, the conflict is not detected. I get the error:

    2024-07-10 14:21:09.976 IST [647678] ERROR: duplicate key value violates unique constraint "t1_pkey"
    2024-07-10 14:21:09.976 IST [647678] DETAIL: Key (pk)=(4) already exists.

This is because detecting such a conflict requires checking for a constraint violation using the *new* value rather than the *existing* value during UPDATE. INSERT conflict detection takes care of this case, i.e. the columns of the incoming row are considered as new values and it checks whether all unique indexes can accept such new values (all incoming columns). The update logic is different: it searches based on oldTuple *only*, and thus the above detection is missing.

Shall we support such detection? If not, is it worth documenting? It basically falls in the 'pkey_exists' conflict category, but to the user it might look like an ordinary update leading to a 'unique key constraint violation'.
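To make the asymmetry in 1) concrete, here is a minimal toy model in Python. It is only a sketch and not the apply-worker code: the class and method names (`SubscriberTable`, `apply_insert`, `apply_update`) are hypothetical, and the conflict label is simply borrowed from the wording used in this mail.

```python
# Toy model only: this is NOT PostgreSQL's apply-worker code.
# SubscriberTable, apply_insert and apply_update are hypothetical names.

class UniqueViolation(Exception):
    """Stands in for the plain 'duplicate key value' ERROR."""


class SubscriberTable:
    def __init__(self):
        self.rows = {}       # pk -> row dict (models the unique index on pk)
        self.conflicts = []  # conflict labels reported by the model

    def apply_insert(self, new_row):
        # INSERT path: the incoming (new) values are checked against the
        # unique index up front, so a clash is reported as a conflict.
        if new_row["pk"] in self.rows:
            self.conflicts.append("pkey_exists")  # label taken from the mail
            return
        self.rows[new_row["pk"]] = dict(new_row)

    def apply_update(self, old_pk, new_row):
        # UPDATE path: the target row is located using the old tuple only.
        if old_pk not in self.rows:
            self.conflicts.append("update_missing")
            return
        # The new key is never examined for conflicts, so a clash only
        # surfaces as an ordinary constraint violation.
        if new_row["pk"] != old_pk and new_row["pk"] in self.rows:
            raise UniqueViolation(f"Key (pk)=({new_row['pk']}) already exists.")
        del self.rows[old_pk]
        self.rows[new_row["pk"]] = dict(new_row)


sub = SubscriberTable()
sub.apply_insert({"pk": 3, "val": 30})
sub.apply_insert({"pk": 4, "val": 40})

# Replicated INSERT of an existing key: reported as a conflict.
sub.apply_insert({"pk": 4, "val": 99})

# Replicated UPDATE that moves pk 3 -> 4: no conflict is recorded,
# the apply simply fails with a unique violation.
update_error = None
try:
    sub.apply_update(3, {"pk": 4, "val": 30})
except UniqueViolation as exc:
    update_error = str(exc)
```

In this model, supporting the detection would amount to running the same new-value check in the UPDATE path before applying the row; whether that is how the patch should do it is left open here.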
2) Another case which might confuse the user:

    CREATE TABLE t1 (pk integer primary key, val1 integer, val2 integer);

On PUB:

    insert into t1 values(1,10,10);
    insert into t1 values(2,20,20);

On SUB:

    update t1 set pk=3 where pk=2;

Data on PUB: {1,10,10}, {2,20,20}
Data on SUB: {1,10,10}, {3,20,20}

Now on PUB:

    update t1 set val1=200 where val1=20;

On SUB, I get this:

    2024-07-10 14:44:00.160 IST [648287] LOG: conflict update_missing detected on relation "public.t1"
    2024-07-10 14:44:00.160 IST [648287] DETAIL: Did not find the row to be updated.
    2024-07-10 14:44:00.160 IST [648287] CONTEXT: processing remote data for replication origin "pg_16389" during message type "UPDATE" for replication target relation "public.t1" in transaction 760, finished at 0/156D658

To the user, this could be quite confusing: val1=20 exists on sub, yet they still get an update_missing conflict, and the 'DETAIL' is not sufficient to make clear why. I think on HEAD as well (I have not tested), we will get the same behavior, i.e. the update will be ignored because the search is based on the RI (pk in this case). So we are not worsening the situation, but now that we are detecting the conflict, is it possible to give better details in the 'DETAIL' section indicating what is actually missing?

thanks
Shveta
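As an aside to comment 2), the scenario can be replayed with a small toy model in Python. This is only a sketch under stated assumptions: the function names are hypothetical, the lookup by replica identity is modelled as a dictionary lookup on pk, and the extended detail text is just one possible wording, not something the patch emits.

```python
# Toy model only: NOT PostgreSQL code. All function names are hypothetical.

def publisher_update_val1(pub_rows, old_val1, new_val1):
    """Apply 'update t1 set val1=new where val1=old' on the publisher and
    return the changes as (replica identity key, new row) pairs."""
    changes = []
    for pk, row in pub_rows.items():
        if row["val1"] == old_val1:
            row["val1"] = new_val1
            changes.append((pk, dict(row)))
    return changes


def apply_update(sub_rows, ri_key, new_row):
    """Apply one replicated UPDATE; the row is searched by RI (pk) only."""
    if ri_key not in sub_rows:
        # Assumed, more explicit detail: name the key that was searched for.
        return f"update_missing: no row with replica identity (pk)=({ri_key})"
    sub_rows[ri_key] = dict(new_row)
    return "applied"


# State after the statements in the mail.
pub = {
    1: {"pk": 1, "val1": 10, "val2": 10},
    2: {"pk": 2, "val1": 20, "val2": 20},
}
sub = {
    1: {"pk": 1, "val1": 10, "val2": 10},
    3: {"pk": 3, "val1": 20, "val2": 20},  # pk changed 2 -> 3 on the subscriber
}

# The publisher's WHERE clause (val1=20) is not what gets replicated;
# the change is identified by the old row's pk, which is 2.
results = [apply_update(sub, ri_key, row)
           for ri_key, row in publisher_update_val1(pub, 20, 200)]
```

The row with val1=20 is still present on the subscriber under pk=3, which is why a DETAIL that names the searched key would make the report easier to interpret.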