Re: Support logical replication of DDLs - Mailing list pgsql-hackers

From vignesh C
Subject Re: Support logical replication of DDLs
Date
Msg-id CALDaNm2WZ1Dif2gqoso92e5rfSRf1p99i=R280ODLoDJvPTo_w@mail.gmail.com
Whole thread Raw
In response to Re: Support logical replication of DDLs  (Masahiko Sawada <sawada.mshk@gmail.com>)
List pgsql-hackers
On Wed, 26 Apr 2023 at 12:02, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Tue, Mar 28, 2023 at 3:22 PM houzj.fnst@fujitsu.com
> <houzj.fnst@fujitsu.com> wrote:
> >
> > On Tuesday, March 28, 2023 1:41 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Mon, Mar 27, 2023 at 5:37 PM Amit Kapila <amit.kapila16@gmail.com>
> > > wrote:
> > > >
> > > > On Mon, Mar 27, 2023 at 12:07 PM Amit Kapila <amit.kapila16@gmail.com>
> > > wrote:
> > > > >
> > > > > On Mon, Mar 27, 2023 at 2:52 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > > > > >
> > > > >
> > > > > > I suggest taking a couple of steps back from the minutiae of the
> > > > > > patch, and spending some hard effort thinking about how the thing
> > > > > > would be controlled in a useful fashion (that is, a real design
> > > > > > for the filtering that was mentioned at the very outset), and
> > > > > > about the security issues, and about how we could get to a committable
> > > patch.
> > > > > >
> > > > >
> > > > > Agreed. I'll try to summarize the discussion we have till now on
> > > > > this and share my thoughts on the same in a separate email.
> > > > >
> > > >
> > > > The idea to control what could be replicated is to introduce a new
> > > > publication option 'ddl' along with current options 'publish' and
> > > > 'publish_via_partition_root'. The values of this new option could be
> > > > 'table', 'function', 'all', etc. Here 'all' enables the replication of
> > > > all supported DDL commands. Example usage for this would be:
> > > > Example:
> > > > Create a new publication with all ddl replication enabled:
> > > >   CREATE PUBLICATION pub1 FOR ALL TABLES with (ddl = 'all');
> > > >
> > > > Enable table ddl replication for an existing Publication:
> > > >   ALTER PUBLICATION pub2 SET (ddl = 'table');
> > > >
> > > > This is what seems to have been discussed but I think we can even
> > > > extend it to support based on operations/commands, say one would like
> > > > to publish only 'create' and 'drop' of tables. Then we can extend the
> > > > existing publish option to have values like 'create', 'alter', and
> > > > 'drop'.
> > > >
> > >
> > > The other idea could be to that for the new option ddl, we input command tags
> > > such that the replication will happen for those commands.
> > > For example, ALTER PUBLICATION pub2 SET (ddl = 'Create Table, Alter
> > > Table, ..'); This will obviate the need to have additional values like 'create', 'alter',
> > > and 'drop' for publish option.
> > >
> > > The other thought related to filtering is that one might want to filter DDLs and
> > > or DMLs performed by specific roles in the future. So, we then need to
> > > introduce another option ddl_role, or something like that.
> > >
> > > Can we think of some other kind of filter for DDL replication?
> >
> > I am thinking another generic syntax for ddl replication like:
> >
> > --
> > CREATE PUBLICATION pubname FOR object_type object_name with (publish = 'ddl_type');
> > --
> >
> > To replicate DDLs that happened on a table, we don't need to add new syntax or
> > option, we can extend the value for the 'publish' option like:
> >
> > To support more non-table objects replication, we can follow the same style and write it like:
> > --
> > CRAETE PUBLICATION FOR FUNCTION f1 with (publish = 'alter'); -- function
> > CRAETE PUBLICATION FOR ALL OPERATORS IN SCHEMA op_schema with (publish = 'drop'); -- operators
> > CRAETE PUBLICATION FOR ALL OBJECTS with (publish = 'alter, create, drop'); -- everything
> > --
> >
> > In this approach, we extend the publication grammar and users can
> > filter the object schema, object name, object type and ddltype. We can also add
> > more options to filter role or other infos in the future.
>
> In this approach, does the subscriber need to track what objects have
> been subscribed similar to tables? For example, suppose that we
> created a publication for function func1 and created a subscription
> for the publication. What if we add function func2 to the publication?
> If we follow the current behavior, DDLs for func2 will be replicated
> to the subscriber but the subscriber won't apply it unless we refresh
> the publication of the subscription. So it seems to me that the
> subscriber needs to have a list of subscribed functions, and we will
> end up having lists of all types of objects.
>
> >
> > ~~~~
> >
> > One more alternative could be like:
> >
> > One more alternative could be like:
> > CREATE PUBLICATION xx FOR pub_create_alter_table WITH (ddl = 'table:create,alter'); -- it will publish create table
andalter table operations. 
> > CREATE PUBLICATION xx FOR pub_all_table WITH (ddl = 'table:all'); -- This means all table operations
create/alter/drop
> > CREATE PUBLICATION xx FOR pub_all_table WITH (ddl = 'table'); -- same as above
> >
> > This can be extended later to:
> > CREATE PUBLICATION xx FOR pub_all_func WITH (ddl = 'function:all');
> > CREATE PUBLICATION xx FOR pub_create_trigger (ddl = 'trigger:create');
> >
> > In this approach, we don't need to add more stuff in gram.y and
> > will give fine-grained control as well.
>
> What did you mean by pub_create_alter_table, pub_all_table,
> pg_all_func, and pub_create_trigger? Are they table names or some
> special keywords indicating groups of objects?

Those were table names, I'm slightly modifying the examples to make it
more clearer:
ex 1: CREATE PUBLICATION pub1 FOR table employee, department, sales
WITH (ddl = 'table:create,alter');
In this I meant that create/alter operation on table employee,
department and sales will be replicated to the subscriber.

ex 2: CREATE PUBLICATION pub2 FOR table employee, department, sales
WITH (ddl = 'table:all');
In this I meant that it will replicate all ddl operations(create/alter
and drop) on table employee, department and sales to the subscriber

ex 3: CREATE PUBLICATION pub3 FOR table employee, department, sales
WITH (ddl = 'table');
This is same as example 2, only advantage here is that user need not
specify ":all" along with the syntax, by default it will replicate all
the table operations create/alter and drop.

The create publication syntax will be the same existing syntax, only
change is the addition of the following in the with option:
ddl = table
ddl = table:all
ddl = table:create|alter|drop

Regards,
Vignesh



pgsql-hackers by date:

Previous
From: vignesh C
Date:
Subject: Re: Add two missing tests in 035_standby_logical_decoding.pl
Next
From: Kyotaro Horiguchi
Date:
Subject: Re: pg_stat_io for the startup process