Re: Conflict detection and logging in logical replication - Mailing list pgsql-hackers

From shveta malik
Subject Re: Conflict detection and logging in logical replication
Date
Msg-id CAJpy0uCJQ=n9sjYZLZDJJqoRb_RdTrbg+w_PdV3YMtsWF=Qjkg@mail.gmail.com
Whole thread Raw
In response to Re: Conflict detection and logging in logical replication  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
On Fri, Aug 16, 2024 at 12:19 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Aug 16, 2024 at 11:48 AM shveta malik <shveta.malik@gmail.com> wrote:
> >
> > On Fri, Aug 16, 2024 at 10:46 AM shveta malik <shveta.malik@gmail.com> wrote:
> > >
> > > 3)
> > > For update_exists(), we dump:
> > > Key (a, b)=(2, 1)
> > >
> > > For delete_missing, update_missing, update_differ, we dump:
> > > Replica identity (a, b)=(2, 1).
> > >
> > > For update_exists as well, shouldn't we dump 'Replica identity'? Only
> > > for insert case, it should be referred as 'Key'.
> > >
> >
> > On rethinking, is it because for update_exists case 'Key' dumped is
> > not the one used to search the row to be updated? Instead it is the
> > one used to search the conflicting row. Unlike update_differ, the row
> > to be updated and the row currently conflicting will be different for
> > update_exists case. I earlier thought that 'KEY' and 'Existing local
> > tuple' dumped always belong to the row currently being
> > updated/deleted/inserted. But for 'update_eixsts', that is not the
> > case. We are dumping 'Existing local tuple' and 'Key' for the row
> > which is conflicting and not the one being updated.  Example:
> >
> > ERROR:  conflict detected on relation "public.tab_1": conflict=update_exists
> > Key (a, b)=(2, 1); existing local tuple (2, 1); remote tuple (2, 1).
> >
> > Operations performed were:
> > Pub: insert into tab values (1,1);
> > Sub: insert into tab values (2,1);
> > Pub: update tab set a=2 where a=1;
> >
> > Here Key and local tuple are both 2,1 instead of 1,1. While replica
> > identity value (used to search original row) will be 1,1 only.
> >
> > It may be slightly confusing or say tricky to understand when compared
> > to other conflicts' LOGs. But not sure what better we can do here.
> >
>
> The update_exists behaves more like insert_exists as we detect that
> only while inserting into index. It is also not clear to me if we can
> do better than to clarify this in docs.
>

Instead of 'existing local tuple', will it be slightly better to have
'conflicting local tuple'?

Few trivial comments:

1)
errdetail_apply_conflict() header says:

 * 2. Display of conflicting key, existing local tuple, remote new tuple, and
 *    replica identity columns,  if any.

We may mention that existing *conflicting* local tuple.

Looking at build_tuple_value_details(), the cases where we display
'KEY 'and the ones where we display 'replica identity' are mutually
exclusives (we have ASSERTs like that).  Shall we add this info in
header that either Key or   'replica identity' is displayed. Or if we
don't want to make it mutually exclusive then update_exists is one
such casw where we can have both Key and 'Replica Identity cols'.


2)
BuildIndexValueDescription() header comment says:

 * This is currently used
 * for building unique-constraint, exclusion-constraint and logical replication
 * tuple missing conflict error messages

Is it being used only for 'tuple missing conflict' flow? I thought, it
will be hit for other flows as well.

thanks
Shveta



pgsql-hackers by date:

Previous
From: Tomas Vondra
Date:
Subject: Re: Parallel CREATE INDEX for BRIN indexes
Next
From: Heikki Linnakangas
Date:
Subject: Re: race condition in pg_class