RE: Support logical replication of DDLs - Mailing list pgsql-hackers

From houzj.fnst@fujitsu.com
Subject RE: Support logical replication of DDLs
Date
Msg-id OS0PR01MB571636F0F1307AB1A868B06A948E9@OS0PR01MB5716.jpnprd01.prod.outlook.com
Whole thread Raw
In response to RE: Support logical replication of DDLs  ("houzj.fnst@fujitsu.com" <houzj.fnst@fujitsu.com>)
Responses RE: Support logical replication of DDLs  ("houzj.fnst@fujitsu.com" <houzj.fnst@fujitsu.com>)
List pgsql-hackers
On Tuesday, March 28, 2023 12:13 PM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote:
> 
> On Monday, March 27, 2023 8:08 PM Amit Kapila <amit.kapila16@gmail.com>
> wrote:
> > On Mon, Mar 27, 2023 at 12:07 PM Amit Kapila <amit.kapila16@gmail.com>
> > wrote:
> > >
> > > On Mon, Mar 27, 2023 at 2:52 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > > >
> > >
> > > > I suggest taking a couple of steps back from the minutiae of the
> > > > patch, and spending some hard effort thinking about how the thing
> > > > would be controlled in a useful fashion (that is, a real design
> > > > for the filtering that was mentioned at the very outset), and
> > > > about the security issues, and about how we could get to a committable
> patch.
> > > >
> > >
> > > Agreed. I'll try to summarize the discussion we have till now on
> > > this and share my thoughts on the same in a separate email.
> > >
> >
> > The idea to control what could be replicated is to introduce a new
> > publication option 'ddl' along with current options 'publish' and
> > 'publish_via_partition_root'. The values of this new option could be
> > 'table', 'function', 'all', etc. Here 'all' enables the replication of
> > all supported DDL commands. Example usage for this would be:
> > Example:
> > Create a new publication with all ddl replication enabled:
> >   CREATE PUBLICATION pub1 FOR ALL TABLES with (ddl = 'all');
> >
> > Enable table ddl replication for an existing Publication:
> >   ALTER PUBLICATION pub2 SET (ddl = 'table');
> >
> > This is what seems to have been discussed but I think we can even
> > extend it to support based on operations/commands, say one would like
> > to publish only 'create' and 'drop' of tables. Then we can extend the
> > existing publish option to have values like 'create', 'alter', and 'drop'.
> >
> > Another thing we are considering related to this is at what level
> > these additional options should be specified. We have three variants
> > FOR TABLE, FOR ALL TABLES, and FOR TABLES IN SCHEMA that enables
> > replication. Now, for the sake of simplicity, this new option is
> > discussed to be provided only with FOR ALL TABLES variant but I think
> > we can provide it with other variants with some additional
> > restrictions like with FOR TABLE, we can only specify 'alter' and
> > 'drop' for publish option. Now, though possible, it brings additional
> > complexity to support it with variants other than FOR ALL TABLES
> > because then we need to ensure additional filtering and possible
> > modification of the content we have to send to downstream. So, we can even
> decide to first support it only FOR ALL TABLES variant.
> >
> > The other point to consider for publish option 'ddl = table' is
> > whether we need to allow replicating dependent objects like say some
> > user-defined type is used in the table. I guess the difficulty here
> > would be to identify which dependents we want to allow.
> >
> > I think in the first version we should allow to replicate only some of
> > the objects instead of everything. For example, can we consider only
> > allowing tables and indexes in the first version? Then extend it in a phased
> manner?
> 
> I think supporting table related stuff in the first version makes sense and the
> patch size could be reduced to a suitable size.

Based on the discussion, I split the patch into four parts: Table DDL
replication(0001,0002), Index DDL replication(0003), ownership stuff for table
and index(0004), other DDL's replication(0005).

In this version, I mainly tried to split the patch set, and there are few
OPEN items we need to address later:

1) The current publication "ddl" option only have two values: table, all. We
   also need to add index and maybe other objects in the list.

2) Need to improve the syntax stuff. Currently, we store the option value of
   the "with (ddl = xx)" via different columns in the catalog, the
   catalog(pg_publication) will have more and more columns as we add support
   for logical replication of other objects in the future. We could store it as
   an text array instead.

   OTOH, since we have proposed some other more flexible syntax to -hackers, the current
   syntax might be changed which might also solve this problem.

3) The test_ddl_deparse_regress test module is not included in the set, because
   I think we also need to split it into table stuff, index stuff and others,
   we can share it after finishing that.

4) The patch set could be spitted further to make it easier for reviewer like:
   infrastructure for deparser, deparser, logical-decoding, built-in logical
   replication, We can do it after some analysis.

[1]
https://www.postgresql.org/message-id/OS0PR01MB571646874A3E165D93999A9D94889%40OS0PR01MB5716.jpnprd01.prod.outlook.com

Best Regards,
Hou zj

Attachment

pgsql-hackers by date:

Previous
From: "Karl O. Pinc"
Date:
Subject: Re: Doc: Rework contrib appendix -- informative titles, tweaked sentences
Next
From: Peter Eisentraut
Date:
Subject: Re: ICU locale validation / canonicalization