Re: Proposal: Conflict log history table for Logical Replication - Mailing list pgsql-hackers

From Dilip Kumar
Subject Re: Proposal: Conflict log history table for Logical Replication
Date
Msg-id CAFiTN-vFKE8E_N6h+peX9DP92mxCeFdm5A9Esn4DkLmNcZ-dOA@mail.gmail.com
In response to Re: Proposal: Conflict log history table for Logical Replication  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: Proposal: Conflict log history table for Logical Replication
List pgsql-hackers
On Sat, Sep 20, 2025 at 5:29 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Sep 18, 2025 at 11:46 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Thu, Sep 18, 2025 at 1:33 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > If we compare conflict_history_table with the slot that gets created
> > > with subscription, one can say the same thing about slots. Users can
> > > drop the slots and whole replication will stop. I think this table
> > > will be created with the same privileges as the owner of a
> > > subscription which can be either a superuser or a user with the
> > > privileges of the pg_create_subscription role, so we can rely on such
> > > users.
> >
> > We might want to consider which role inserts the conflict info into
> > the history table. For example, if any table created by a user can be
> > used as the history table for a subscription and the conflict info
> > insertion is performed by the subscription owner, we would end up
> > having the same security issue that was addressed by the run_as_owner
> > subscription option.
> >
>
> Yeah, I don't think we want to open that door. For user created
> tables, we should perform actions with table_owner's privilege. In
> such a case, if one wants to create a subscription with run_as_owner
> option, she should give DML operation permissions to the subscription
> owner. OTOH, if we create this table internally (via subscription
> owner) then irrespective of run_as_owner, we will always insert as
> subscription_owner.
>
> AFAIR, one open point for internally created tables is whether we
> should skip changes to conflict_history table while replicating
> changes? The table will be considered under for ALL TABLES
> publications, if defined? Ideally, these should behave as catalog
> tables, so one option is to mark them as 'user_catalog_table', or the
> other option is we have some hard-code checks during replication. The
> first option has the advantage that it won't write additional WAL for
> these tables which is otherwise required under wal_level=logical. What
> other options do we have?

I did more analysis and testing of 'user_catalog_table'. What I found
is that when a table is marked as 'user_catalog_table', extra
information is WAL-logged for it, i.e. the CID[1], so that such tables
can also be scanned during decoding using a historical snapshot, like
catalog tables. I have also checked the code and tested that changes
to a 'user_catalog_table' table do get streamed by a FOR ALL TABLES
publication. Am I missing something, or are we thinking of changing
the behavior of user_catalog_table so that such tables do not get
decoded? I think that would change the existing behaviour, so it might
not be a good option. Yet another idea is to invent a new option for
this purpose, say 'conflict_history_purpose', but IMHO this use case
alone may not justify a new option.

[1]
    /*
     * For logical decode we need combo CIDs to properly decode the
     * catalog
     */
    if (RelationIsAccessibleInLogicalDecoding(relation))
        log_heap_new_cid(relation, &tp);
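
To illustrate why a user catalog table both gets the extra CID record
and still gets decoded, here is a minimal, self-contained sketch of the
two checks as I understand them from rel.h
(RelationIsAccessibleInLogicalDecoding and RelationIsLogicallyLogged).
The ToyRel struct, its field names, and the toy_* functions are
hypothetical stand-ins for illustration, not PostgreSQL code, and the
conditions are paraphrased, so please verify against the tree.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the relation properties the real macros inspect. */
typedef struct ToyRel
{
    bool is_system_catalog;   /* relation lives in the system catalogs */
    bool user_catalog_table;  /* reloption user_catalog_table = true */
    bool needs_wal;           /* permanent (WAL-logged) relation */
    bool is_foreign_table;    /* relkind is a foreign table */
} ToyRel;

/* Assumption: models wal_level >= logical (XLogLogicalInfoActive()). */
static bool wal_level_logical = true;

/*
 * Paraphrase of RelationIsAccessibleInLogicalDecoding(): true for system
 * catalogs and for user catalog tables. This is the check that triggers
 * the extra new-CID WAL record.
 */
static bool
toy_accessible_in_decoding(const ToyRel *rel)
{
    return wal_level_logical && rel->needs_wal &&
        (rel->is_system_catalog || rel->user_catalog_table);
}

/*
 * Paraphrase of RelationIsLogicallyLogged(): true for any WAL-logged,
 * non-foreign, non-system-catalog relation. Note that it does not look
 * at user_catalog_table at all, so such tables are still decoded.
 */
static bool
toy_logically_logged(const ToyRel *rel)
{
    return wal_level_logical && rel->needs_wal &&
        !rel->is_foreign_table && !rel->is_system_catalog;
}
```

With this model, a table that has user_catalog_table set satisfies both
checks, which matches what I saw in testing: it gets the extra CID
logging and its changes are still streamed.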


--
Regards,
Dilip Kumar
Google


