Re: Conflict detection and logging in logical replication - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Conflict detection and logging in logical replication
Msg-id CAA4eK1LHb_g-08WfhhW1XGNvmqopxU8Aecb2Z5r8ukjwOO8MSA@mail.gmail.com
In response to Re: Conflict detection and logging in logical replication  (shveta malik <shveta.malik@gmail.com>)
Responses Re: Conflict detection and logging in logical replication
RE: Conflict detection and logging in logical replication
List pgsql-hackers
On Fri, Aug 16, 2024 at 11:48 AM shveta malik <shveta.malik@gmail.com> wrote:
>
> On Fri, Aug 16, 2024 at 10:46 AM shveta malik <shveta.malik@gmail.com> wrote:
> >
> > 3)
> > For update_exists(), we dump:
> > Key (a, b)=(2, 1)
> >
> > For delete_missing, update_missing, update_differ, we dump:
> > Replica identity (a, b)=(2, 1).
> >
> > For update_exists as well, shouldn't we dump 'Replica identity'? Only
> > for the insert case should it be referred to as 'Key'.
> >
>
> On rethinking, is it because for update_exists case 'Key' dumped is
> not the one used to search the row to be updated? Instead it is the
> one used to search the conflicting row. Unlike update_differ, the row
> to be updated and the row currently conflicting will be different for
> update_exists case. I earlier thought that 'Key' and 'Existing local
> tuple' dumped always belong to the row currently being
> updated/deleted/inserted. But for 'update_exists', that is not the
> case. We are dumping 'Existing local tuple' and 'Key' for the row
> which is conflicting and not the one being updated.  Example:
>
> ERROR:  conflict detected on relation "public.tab_1": conflict=update_exists
> Key (a, b)=(2, 1); existing local tuple (2, 1); remote tuple (2, 1).
>
> Operations performed were:
> Pub: insert into tab values (1,1);
> Sub: insert into tab values (2,1);
> Pub: update tab set a=2 where a=1;
>
> Here Key and local tuple are both 2,1 instead of 1,1. While replica
> identity value (used to search original row) will be 1,1 only.
>
> It may be slightly confusing or say tricky to understand when compared
> to other conflicts' LOGs. But not sure what better we can do here.
>

update_exists behaves more like insert_exists because we detect it
only while inserting into the index. It is also not clear to me whether
we can do better than clarifying this in the docs.
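
To make the distinction concrete, here is a toy Python model of the
scenario quoted above. It is only a sketch and not the apply worker's
code: the names Conflict and apply_update are hypothetical, and the
table is modelled as a set of tuples with a unique index on (a, b).
It shows that the row is looked up by replica identity (1, 1), while
the 'Key' reported for update_exists is the key of the conflicting
row (2, 1).

```python
# Toy model only: NOT PostgreSQL code. All names here are hypothetical.

class Conflict(Exception):
    """Carries the fields that would appear in the conflict log line."""

    def __init__(self, kind, label, key, local_tuple, remote_tuple):
        super().__init__(
            f"conflict={kind} {label} {key}; "
            f"existing local tuple {local_tuple}; remote tuple {remote_tuple}"
        )
        self.kind = kind
        self.label = label          # "Key" or "Replica identity"
        self.key = key
        self.local_tuple = local_tuple
        self.remote_tuple = remote_tuple


def apply_update(rows, replica_identity, new_tuple):
    """Apply a replicated UPDATE to `rows`, a set of (a, b) tuples.

    Every tuple is also its own unique-index key in this model.
    """
    # Step 1: find the row to update using the replica identity
    # (the old key values sent by the publisher).
    if replica_identity not in rows:
        raise Conflict("update_missing", "Replica identity",
                       replica_identity, None, new_tuple)

    # Step 2: the new version is inserted into the unique index. If a
    # different row already holds that key, the conflict is found here,
    # so the reported key belongs to the conflicting row.
    if new_tuple != replica_identity and new_tuple in rows:
        raise Conflict("update_exists", "Key",
                       new_tuple, new_tuple, new_tuple)

    rows.remove(replica_identity)
    rows.add(new_tuple)


# Pub: insert (1,1); Sub: insert (2,1); Pub: update a=2 where a=1
subscriber_rows = {(1, 1), (2, 1)}
try:
    apply_update(subscriber_rows, replica_identity=(1, 1), new_tuple=(2, 1))
except Conflict as c:
    print(c.kind, c.label, c.key)   # prints: update_exists Key (2, 1)
```

In this model the replica identity (1, 1) never appears in the
update_exists report, which matches the confusion described above.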

> --------------------
>
> One more comment:
>
> 5)
> For insert/update_exists, the sequence is:
> Key .. ; existing local tuple .. ; remote tuple ...
>
> For rest of the conflicts, sequence is:
>  Existing local tuple .. ; remote tuple .. ; replica identity ..
>
> Is it intentional? Shall the 'Key' or 'Replica Identity' be the first
> one to come in all conflicts?
>

This is worth considering, but Replica Identity signifies the old tuple
values, which is probably why it is kept at the end. Let's see what
Hou-San or others think about this.

--
With Regards,
Amit Kapila.


