RE: Conflict detection and logging in logical replication - Mailing list pgsql-hackers

From Zhijie Hou (Fujitsu)
Subject RE: Conflict detection and logging in logical replication
Date
Msg-id OS0PR01MB57163332C43702CCA192A96694862@OS0PR01MB5716.jpnprd01.prod.outlook.com
Whole thread Raw
In response to Re: Conflict detection and logging in logical replication  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: Conflict detection and logging in logical replication
List pgsql-hackers
On Monday, August 12, 2024 7:41 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> 
> On Fri, Aug 9, 2024 at 12:29 PM Zhijie Hou (Fujitsu)
> <houzj.fnst@fujitsu.com> wrote:
> >
> > Here is the V12 patch that improved the log format as discussed.
> >
> 
> Review comments:

Thanks for the comments.

> ===============
> 1. The patch doesn't display the remote tuple for delete_differ case.
> However, it shows the remote tuple correctly for update_differ. Is
> there a reason for the same? See below messages:
> 
> update_differ:
> --------------
> LOG:  conflict detected on relation "public.t1": conflict=update_differ
> DETAIL:  Updating the row containing (c1)=(1) that was modified
> locally in transaction 806 at 2024-08-12 11:48:14.970002+05:30.
>         Existing local tuple (1, 3, arun      ); remote tuple (1, 3,
> ajay      ).
> ...
> 
> delete_differ
> --------------
> LOG:  conflict detected on relation "public.t1": conflict=delete_differ
> DETAIL:  Deleting the row containing (c1)=(1) that was modified by
> locally in transaction 809 at 2024-08-12 14:15:41.966467+05:30.
>         Existing local tuple (1, 3, arun      ).
> 
> Note this happens when the publisher table has a REPLICA IDENTITY FULL
> and the subscriber table has primary_key. It would be better to keep
> the messages consistent. One possibility is that we remove
> key/old_tuple from the first line of the DETAIL message and display it
> in the second line as Existing local tuple <local_tuple>; remote tuple
> <..>; key <...>

Agreed. I thought that in delete_differ/missing cases, the remote tuple is covered
In the key values in the first sentence. To be consistent, I have moved the column-values
from the first sentence to the second sentence including the insert_exists conflict.

The new format looks like:

LOG: xxx
DETAIL: Key %s; existing local tuple %s; remote new tuple %s; replica identity %s

The Key will include the conflicting key for xxx_exists conflicts. And the replica identity part
will include the replica identity keys or the full tuple value in replica identity FULL case.

> 
> 2. Similar to above, the remote tuple is not displayed in
> delete_missing but displayed in updated_missing type of conflict. If
> we follow the style mentioned in the previous point then the DETAIL
> message: "DETAIL:  Did not find the row containing (c1)=(1) to be
> updated." can also be changed to: "DETAIL:  Could not find the row to
> be updated." followed by other detail.

Same as above.


> 3. The detail of insert_exists is confusing.
> 
> ERROR:  conflict detected on relation "public.t1": conflict=insert_exists
> DETAIL:  Key (c1)=(1) already exists in unique index "t1_pkey", which
> was modified locally in transaction 802 at 2024-08-12
> 11:11:31.252148+05:30.
> 
> It sounds like the key value "(c1)=(1)" in the index is modified. How
> about changing slightly as: "Key (c1)=(1) already exists in unique
> index "t1_pkey", modified locally in transaction 802 at 2024-08-12
> 11:11:31.252148+05:30."? Feel free to propose if anything better comes
> to your mind.

The suggested message looks good to me.

> 
> 4.
> if (localorigin == InvalidRepOriginId)
> + appendStringInfo(&err_detail, _("Deleting the row containing %s that
> was modified by locally in transaction %u at %s."),
> + val_desc, localxmin, timestamptz_to_str(localts));
> 
> Typo in the above message. /modified by locally/modified locally

Fixed.

> 
> 5.
> @@ -2661,6 +2662,29 @@ apply_handle_update_internal(ApplyExecutionData
> *edata,
> {
> ...
> found = FindReplTupleInLocalRel(edata, localrel,
> &relmapentry->remoterel,
> localindexoid,
> remoteslot, &localslot);
> ...
> ...
> +
> + ReportApplyConflict(LOG, CT_UPDATE_DIFFER, relinfo,
> + GetRelationIdentityOrPK(localrel),
> 
> To find the tuple, we may have used an index other than Replica
> Identity or PK (see IsIndexUsableForReplicaIdentityFull), but while
> reporting conflict we don't consider such an index. I think the reason
> is that such an index scan wouldn't have resulted in a unique tuple
> and that is why we always compare the complete tuple in such cases. Is
> that the reason? Can we write a comment to make it clear?

Added comments atop of ReportApplyConflict for the 'indexoid' parameter.

> 
> 6.
> void ReportApplyConflict(int elevel, ConflictType type,
> + ResultRelInfo *relinfo, Oid indexoid,
> + TransactionId localxmin,
> + RepOriginId localorigin,
> + TimestampTz localts,
> + TupleTableSlot *searchslot,
> + TupleTableSlot *localslot,
> + TupleTableSlot *remoteslot,
> + EState *estate);
> 
> The prototype looks odd with pointers and non-pointer variables in
> mixed order. How about arranging parameters in the following order:
> Estate, ResultRelInfo, TupleTableSlot *searchslot, TupleTableSlot
> *localslot, TupleTableSlot *remoteslot, Oid indexoid, TransactionId
> localxmin, RepOriginId localorigin, TimestampTz localts?
> 
> 7. Like above, check the parameters of other functions like
> errdetail_apply_conflict, build_index_value_desc,
> build_tuple_value_details, etc.

Changed as suggested.

Here is V13 patch set which addressed above comments.

Best Regards,
Hou zj

Attachment

pgsql-hackers by date:

Previous
From: Peter Eisentraut
Date:
Subject: Re: Rename C23 keyword
Next
From: Ayush Tiwari
Date:
Subject: Drop database command will raise "wrong tuple length" if pg_database tuple contains toast attribute.