Re: Proposal: Conflict log history table for Logical Replication - Mailing list pgsql-hackers

From shveta malik
Subject Re: Proposal: Conflict log history table for Logical Replication
Date
Msg-id CAJpy0uCMDqcWGepcTwFPH+hTDjD8b72KnbL-S+d-qd7ChomOyQ@mail.gmail.com
Whole thread Raw
In response to Re: Proposal: Conflict log history table for Logical Replication  (Dilip Kumar <dilipbalaut@gmail.com>)
List pgsql-hackers
On Mon, Nov 17, 2025 at 11:54 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Thu, Nov 13, 2025 at 9:17 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Thu, Nov 13, 2025 at 2:39 PM shveta malik <shveta.malik@gmail.com> wrote:
> > >
> > > > Few observations related to publication.
> > > > ------------------------------
> >
> > Thanks Shveta, for testing and sharing your thoughts.  IMHO for
> > conflict log tables it should be good enough if we restrict it when
> > ALL TABLE options are used, I don't think we need to put extra effort
> > to completely restrict it even if users want to explicitly list it
> > into the publication.
> >
> > > >
> > > > (In the below comments, clt/CLT implies Conflict Log Table)
> > > >
> > > > 1)
> > > > 'select pg_relation_is_publishable(clt)' returns true for conflict-log table.
>
> After putting more thought I have changed this to return false for
> clt, as this is just an exposed function not called by pgoutput layer.
>
> > > > 2)
> > > > '\d+ clt'   shows all-tables publication name. I feel we should not
> > > > show that for clt.
> >
> Fixed
>
> >
> > > > 3)
> > > > I am able to create a publication for clt table, should it be allowed?
> >
> > I believe we should not do any specific handling to restrict this but
> > I am open for the opinions.
>
> Restricting this as well, lets see what others think.
>
>
> >
> > > > 5) Also, I feel we can add some documentation now to help others to
> > > > understand/review the patch better without going through the long
> > > > thread.
> >
> > Make sense, I will do that in the next version.
> Done that but not compiled the docs as I don't currently have the
> setup so added as WIP patch.
>
>
> > > > 2)
> > > > Conflicts where row on sub is missing, local_ts incorrectly inserted.
> > > > It is '2000-01-01 05:30:00+05:30'. Should it be Null or something
> > > > indicating that it is not applicable for this conflict-type?
> > > >
> > > > Example: delete_missing, update_missing
> > > > pub:
> > > >  insert into tab1 values(10,10);
> > > >  insert into tab1 values(20,10);
> > > >  sub:  delete from tab1 where i=10;
> > > >  pub:  delete from tab1 where i=10;
> >
> > Sure I will test this.
>
> I have fixed this.

Thanks for the patch.  Some feedback about the clt:

1)
local_origin is always NULL in my tests for all conflict types I tried.

2)
Do we need 'key_tuple' as such or replica_identity is enough/better?
I see 'key_tuple' inserted as {"i":10,"j":null} for delete_missing
case where query was 'delete from tab1 where i=10'; here 'i' is PK;
which seems okay.
But it is '{"i":20,"j":200}' for update_origin_differ case where query
was 'update tab1 set j=200 where i =20'. Here too RI is 'i' alone. I
feel 'j' should not be part of the key but let me know if I have
misunderstood. IMO, 'j' being part of remote_tuple should be good
enough.

3)
Do we need to have a timestamp column as well to say when conflict was
recorded? Or local_commit_ts, remote_commit_ts are sufficient?
Thoughts

4)
Also, it makes sense if we have 'conflict_type' next to 'relid'. I
feel relid and conflict_type are primary columns and rest are related
details.

5)
Do we need table_schema, table_name when we have relid already? If we
want to retain these, we can name them as schemaname and relname to be
consistent with all other stats tables. IMO, then the order can be:
relid, schemaname, relname, conflcit_type and then the rest of the
details.

thanks
Shveta



pgsql-hackers by date:

Previous
From: "Hayato Kuroda (Fujitsu)"
Date:
Subject: RE: Parallel Apply
Next
From: Vik Fearing
Date:
Subject: Re: Row pattern recognition