Re: Proposal: Conflict log history table for Logical Replication - Mailing list pgsql-hackers
| From | shveta malik |
|---|---|
| Subject | Re: Proposal: Conflict log history table for Logical Replication |
| Date | |
| Msg-id | CAJpy0uBF7B+rdBFEYqGAaNpFC5nuvCnnCWqkxM7pAqX02ka+cA@mail.gmail.com Whole thread Raw |
| In response to | Re: Proposal: Conflict log history table for Logical Replication (Dilip Kumar <dilipbalaut@gmail.com>) |
| List | pgsql-hackers |
On Wed, Nov 19, 2025 at 3:46 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Tue, Nov 18, 2025 at 4:47 PM shveta malik <shveta.malik@gmail.com> wrote:
> >
> > Thanks for the patch. Some feedback about the clt:
> >
> > 1)
> > local_origin is always NULL in my tests for all conflict types I tried.
>
> You need to set the replication origin as shown below
> On subscriber side:
> ---------------------------
> SELECT pg_replication_origin_create('my_remote_source_2');
> SELECT pg_replication_origin_session_setup('my_remote_source_2');
> UPDATE test SET b=200 where a=1;
>
> On remote:
> ---------------
> UPDATE test SET b=300 where a=1; -- conflicting operation with local node
>
> On subscriber
> ------------------
> postgres[1514377]=# select local_origin, remote_origin from
> myschema.conflict_log_history2 ;
> local_origin | remote_origin
> --------------------+---------------------
> my_remote_source_2 | pg_16396
Okay, I see, thanks!
>
> > 2)
> > Do we need 'key_tuple' as such or replica_identity is enough/better?
> > I see 'key_tuple' inserted as {"i":10,"j":null} for delete_missing
> > case where query was 'delete from tab1 where i=10'; here 'i' is PK;
> > which seems okay.
> > But it is '{"i":20,"j":200}' for update_origin_differ case where query
> > was 'update tab1 set j=200 where i =20'. Here too RI is 'i' alone. I
> > feel 'j' should not be part of the key but let me know if I have
> > misunderstood. IMO, 'j' being part of remote_tuple should be good
> > enough.
>
> Yeah we should display the replica identity only, I assumed in
> ReportApplyConflict() the searchslot should only have RI tuple but it
> is sending a remote tuple in the searchslot, so might need to extract
> the RI from this slot, I will work on this.
yeah, we have extracted it already in
errdetail_apply_conflict()->build_tuple_value_details(). See it dumps
it in log:
LOG: conflict detected on relation "public.tab1":
conflict=update_origin_differs
DETAIL: Updating the row that was modified locally in transaction 768
at 2025-11-18 12:09:19.658502+05:30.
Existing local row (20, 100); remote row (20, 200); replica
identity (i)=(20).
We somehow need to reuse it.
>
> > 3)
> > Do we need to have a timestamp column as well to say when conflict was
> > recorded? Or local_commit_ts, remote_commit_ts are sufficient?
> > Thoughts
>
> You mean we can record the timestamp now while inserting, not sure if
> it will add some more meaningful information than remote_commit_ts,
> but let's see what others think.
>
On rethinking, we can skip it. The commit-ts of both sides are enough.
> > 4)
> > Also, it makes sense if we have 'conflict_type' next to 'relid'. I
> > feel relid and conflict_type are primary columns and rest are related
> > details.
>
> Sure
>
> > 5)
> > Do we need table_schema, table_name when we have relid already? If we
> > want to retain these, we can name them as schemaname and relname to be
> > consistent with all other stats tables. IMO, then the order can be:
> > relid, schemaname, relname, conflcit_type and then the rest of the
> > details.
>
> Yeah this makes the table denormalized as we can fetch this
> information by joining with pg_class, but I think it might be better
> for readability, lets see what others think, for now I will reorder as
> suggested.
>
Okay, works for me if we want to keep these. I see that most of the
other statistics tables (pg_stat_all_indexes, pg_statio_all_tables,
pg_statio_all_sequences etc) that maintain a relid also retain the
names.
thanks
Shveta
pgsql-hackers by date: