Re: Conflict detection and logging in logical replication - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Conflict detection and logging in logical replication
Date
Msg-id CAA4eK1JY+z0cnAknteiekJygYvOVqxryibXESTRTcnpt7u0ZoQ@mail.gmail.com
Whole thread Raw
In response to RE: Conflict detection and logging in logical replication  ("Zhijie Hou (Fujitsu)" <houzj.fnst@fujitsu.com>)
Responses RE: Conflict detection and logging in logical replication
List pgsql-hackers
On Tue, Aug 6, 2024 at 1:45 PM Zhijie Hou (Fujitsu)
<houzj.fnst@fujitsu.com> wrote:
>
> On Monday, August 5, 2024 6:52 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Sun, Aug 4, 2024 at 1:22 PM Zhijie Hou (Fujitsu) <houzj.fnst@fujitsu.com>
> > wrote:
> > >
> > > On Friday, August 2, 2024 7:03 PM Amit Kapila <amit.kapila16@gmail.com>
> > wrote:
> > > >
> > >
> > > Here is the V11 patch set which addressed above and Kuroda-san[1]
> > comments.
> > >
> >
> > A few design-level points:
> >
> > *
> > @@ -525,10 +602,33 @@ ExecSimpleRelationInsert(ResultRelInfo
> > *resultRelInfo,
> >   /* OK, store the tuple and create index entries for it */
> >   simple_table_tuple_insert(resultRelInfo->ri_RelationDesc, slot);
> >
> > + conflictindexes = resultRelInfo->ri_onConflictArbiterIndexes;
> > +
> >   if (resultRelInfo->ri_NumIndices > 0)
> >   recheckIndexes = ExecInsertIndexTuples(resultRelInfo,
> > -    slot, estate, false, false,
> > -    NULL, NIL, false);
> > +    slot, estate, false,
> > +    conflictindexes ? true : false,
> > +    &conflict,
> > +    conflictindexes, false);
> > +
> > + /*
> > + * Checks the conflict indexes to fetch the conflicting local tuple
> > + * and reports the conflict. We perform this check here, instead of
> > + * performing an additional index scan before the actual insertion and
> > + * reporting the conflict if any conflicting tuples are found. This is
> > + * to avoid the overhead of executing the extra scan for each INSERT
> > + * operation, even when no conflict arises, which could introduce
> > + * significant overhead to replication, particularly in cases where
> > + * conflicts are rare.
> > + *
> > + * XXX OTOH, this could lead to clean-up effort for dead tuples added
> > + * in heap and index in case of conflicts. But as conflicts shouldn't
> > + * be a frequent thing so we preferred to save the performance overhead
> > + * of extra scan before each insertion.
> > + */
> > + if (conflict)
> > + CheckAndReportConflict(resultRelInfo, estate, CT_INSERT_EXISTS,
> > +    recheckIndexes, slot);
> >
> > I was thinking about this case where we have some pros and cons of doing
> > additional scans only after we found the conflict. I was wondering how we will
> > handle the resolution strategy for this when we have to remote_apply the tuple
> > for insert_exists/update_exists cases.
> > We would have already inserted the remote tuple in the heap and index before
> > we found the conflict which means we have to roll back that change and then
> > start a forest transaction to perform remote_apply which probably has to
> > update the existing tuple. We may have to perform something like speculative
> > insertion and then abort it. That doesn't sound cheap either. Do you have any
> > better ideas?
>
> Since most of the codes of conflict detection can be reused in the later
> resolution patch. I am thinking we can go for re-scan after insertion approach
> for detection patch. Then in resolution patch we can probably have a check in
> the patch that if the resolver is remote_apply/last_update_win we detect
> conflict before, otherwise detect it after. This way we can save an
> subscription option in the detection patch because we are not introducing overhead
> for the detection.
>

Sounds reasonable to me.

>
>
> >
> > *
> > -ERROR:  duplicate key value violates unique constraint "test_pkey"
> > -DETAIL:  Key (c)=(1) already exists.
> > +ERROR:  conflict insert_exists detected on relation "public.test"
> > +DETAIL:  Key (c)=(1) already exists in unique index "t_pkey", which
> > was modified locally in transaction 740 at 2024-06-26 10:47:04.727375+08.
> >
> > I think the format to display conflicts is not very clear. The conflict should be
> > apparent just by seeing the LOG/ERROR message. I am thinking of something
> > like below:
> >
> > LOG: CONFLICT: <insert_exisits or whatever names we document>;
> > DESCRIPTION: If any .. ; RESOLUTION: (This one can be added later)
> > DEATAIL: remote_tuple (tuple values); local_tuple (tuple values);
> >
> > With the above, one can easily identify the conflict's reason and action taken by
> > apply worker.
>
> Thanks for the idea! I thought about few styles based on the suggested format,
> what do you think about the following ?
>
> ---
> Version 1
> ---
> LOG: CONFLICT: insert_exists; DESCRIPTION: remote INSERT violates unique constraint "uniqueindex" on relation
"public.test".
> DETAIL: Existing local tuple (a, b, c) = (2, 3, 4) xid=123,origin="pub",timestamp=xxx; remote tuple (a, b, c) = (2,
4,5). 
>

Won't this case be ERROR? If so, the error message format like the
above appears odd to me because in some cases, the user may want to
add some filter based on the error message though that is not ideal.
Also, the primary error message starts with a small case letter and
should be short.

> LOG: CONFLICT: update_differ; DESCRIPTION: updating a row with key (a, b) = (2, 4) on relation "public.test" was
modifiedby a different source. 
> DETAIL: Existing local tuple (a, b, c) = (2, 3, 4) xid=123,origin="pub",timestamp=xxx; remote tuple (a, b, c) = (2,
4,5). 
>
> LOG: CONFLICT: update_missing; DESCRIPTION: did not find the row with key (a, b) = (2, 4) on "public.test" to update.
> DETAIL: remote tuple (a, b, c) = (2, 4, 5).
>
> ---
> Version 2
> It moves most the details to the DETAIL line compared to version 1.
> ---
> LOG: CONFLICT: insert_exists on relation "public.test".
> DETAIL: Key (a)=(1) already exists in unique index "uniqueindex", which was modified by origin "pub" in transaction
123at 2024xxx; 
>                 Existing local tuple (a, b, c) = (1, 3, 4), remote tuple (a, b, c) = (1, 4, 5).
>
> LOG: CONFLICT: update_differ on relation "public.test".
> DETAIL: Updating a row with key (a, b) = (2, 4) that was modified by a different origin "pub" in transaction 123 at
2024xxx;
>                 Existing local tuple (a, b, c) = (2, 3, 4); remote tuple (a, b, c) = (2, 4, 5).
>
> LOG: CONFLICT: update_missing on relation "public.test".
> DETAIL: Did not find the row with key (a, b) = (2, 4) to update;
>                 Remote tuple (a, b, c) = (2, 4, 5).
>

I think we can combine sentences with full stop.

...
> ---
> Version 3
> It is similar to the style in the current patch, I only added the key value for
> differ and missing conflicts without outputting the complete
> remote/local tuple value.
> ---
> LOG: conflict insert_exists detected on relation "public.test".
> DETAIL: Key (a)=(1) already exists in unique index "uniqueindex", which was modified by origin "pub" in transaction
123at 2024xxx. 
>

For ERROR messages this appears suitable.

Considering all the above points, I propose yet another version:

LOG: conflict detected for relation "public.test": conflict=insert_exists
DETAIL: Key (a)=(1) already exists in unique index "uniqueindex",
which was modified by the origin "pub" in transaction 123 at 2024xxx.
Existing local tuple (a, b, c) = (1, 3, 4), remote tuple (a, b, c) =
(1, 4, 5).

LOG: conflict detected for relation "public.test": conflict=update_differ
DETAIL: Updating a row with key (a, b) = (2, 4) that was modified by a
different origin "pub" in transaction 123 at 2024xxx. Existing local
tuple (a, b, c) = (2, 3, 4); remote tuple (a, b, c) = (2, 4, 5).

LOG:  conflict detected for relation "public.test": conflict=update_missing
DETAIL: Could not find the row with key (a, b) = (2, 4) to update.
Remote tuple (a, b, c) = (2, 4, 5).


--
With Regards,
Amit Kapila.



pgsql-hackers by date:

Previous
From: shveta malik
Date:
Subject: Re: Logical Replication of sequences
Next
From: Shubham Khanna
Date:
Subject: Re: Pgoutput not capturing the generated columns