Re: Support logical replication of DDLs - Mailing list pgsql-hackers

From Masahiko Sawada
Subject Re: Support logical replication of DDLs
Date
Msg-id CAD21AoA-TiHQcBfAujwdDiEY_8C0c_KVea+ZdBzAg_PQxFdCMA@mail.gmail.com
Whole thread Raw
In response to RE: Support logical replication of DDLs  ("houzj.fnst@fujitsu.com" <houzj.fnst@fujitsu.com>)
Responses Re: Support logical replication of DDLs  (vignesh C <vignesh21@gmail.com>)
RE: Support logical replication of DDLs  ("Zhijie Hou (Fujitsu)" <houzj.fnst@fujitsu.com>)
List pgsql-hackers
On Tue, Mar 28, 2023 at 3:22 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
>
> On Tuesday, March 28, 2023 1:41 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Mon, Mar 27, 2023 at 5:37 PM Amit Kapila <amit.kapila16@gmail.com>
> > wrote:
> > >
> > > On Mon, Mar 27, 2023 at 12:07 PM Amit Kapila <amit.kapila16@gmail.com>
> > wrote:
> > > >
> > > > On Mon, Mar 27, 2023 at 2:52 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > > > >
> > > >
> > > > > I suggest taking a couple of steps back from the minutiae of the
> > > > > patch, and spending some hard effort thinking about how the thing
> > > > > would be controlled in a useful fashion (that is, a real design
> > > > > for the filtering that was mentioned at the very outset), and
> > > > > about the security issues, and about how we could get to a committable
> > patch.
> > > > >
> > > >
> > > > Agreed. I'll try to summarize the discussion we have till now on
> > > > this and share my thoughts on the same in a separate email.
> > > >
> > >
> > > The idea to control what could be replicated is to introduce a new
> > > publication option 'ddl' along with current options 'publish' and
> > > 'publish_via_partition_root'. The values of this new option could be
> > > 'table', 'function', 'all', etc. Here 'all' enables the replication of
> > > all supported DDL commands. Example usage for this would be:
> > > Example:
> > > Create a new publication with all ddl replication enabled:
> > >   CREATE PUBLICATION pub1 FOR ALL TABLES with (ddl = 'all');
> > >
> > > Enable table ddl replication for an existing Publication:
> > >   ALTER PUBLICATION pub2 SET (ddl = 'table');
> > >
> > > This is what seems to have been discussed but I think we can even
> > > extend it to support based on operations/commands, say one would like
> > > to publish only 'create' and 'drop' of tables. Then we can extend the
> > > existing publish option to have values like 'create', 'alter', and
> > > 'drop'.
> > >
> >
> > The other idea could be to that for the new option ddl, we input command tags
> > such that the replication will happen for those commands.
> > For example, ALTER PUBLICATION pub2 SET (ddl = 'Create Table, Alter
> > Table, ..'); This will obviate the need to have additional values like 'create', 'alter',
> > and 'drop' for publish option.
> >
> > The other thought related to filtering is that one might want to filter DDLs and
> > or DMLs performed by specific roles in the future. So, we then need to
> > introduce another option ddl_role, or something like that.
> >
> > Can we think of some other kind of filter for DDL replication?
>
> I am thinking another generic syntax for ddl replication like:
>
> --
> CREATE PUBLICATION pubname FOR object_type object_name with (publish = 'ddl_type');
> --
>
> To replicate DDLs that happened on a table, we don't need to add new syntax or
> option, we can extend the value for the 'publish' option like:
>
> To support more non-table objects replication, we can follow the same style and write it like:
> --
> CRAETE PUBLICATION FOR FUNCTION f1 with (publish = 'alter'); -- function
> CRAETE PUBLICATION FOR ALL OPERATORS IN SCHEMA op_schema with (publish = 'drop'); -- operators
> CRAETE PUBLICATION FOR ALL OBJECTS with (publish = 'alter, create, drop'); -- everything
> --
>
> In this approach, we extend the publication grammar and users can
> filter the object schema, object name, object type and ddltype. We can also add
> more options to filter role or other infos in the future.

In this approach, does the subscriber need to track what objects have
been subscribed similar to tables? For example, suppose that we
created a publication for function func1 and created a subscription
for the publication. What if we add function func2 to the publication?
If we follow the current behavior, DDLs for func2 will be replicated
to the subscriber but the subscriber won't apply it unless we refresh
the publication of the subscription. So it seems to me that the
subscriber needs to have a list of subscribed functions, and we will
end up having lists of all types of objects.

>
> ~~~~
>
> One more alternative could be like:
>
> One more alternative could be like:
> CREATE PUBLICATION xx FOR pub_create_alter_table WITH (ddl = 'table:create,alter'); -- it will publish create table
andalter table operations. 
> CREATE PUBLICATION xx FOR pub_all_table WITH (ddl = 'table:all'); -- This means all table operations
create/alter/drop
> CREATE PUBLICATION xx FOR pub_all_table WITH (ddl = 'table'); -- same as above
>
> This can be extended later to:
> CREATE PUBLICATION xx FOR pub_all_func WITH (ddl = 'function:all');
> CREATE PUBLICATION xx FOR pub_create_trigger (ddl = 'trigger:create');
>
> In this approach, we don't need to add more stuff in gram.y and
> will give fine-grained control as well.

What did you mean by pub_create_alter_table, pub_all_table,
pg_all_func, and pub_create_trigger? Are they table names or some
special keywords indicating groups of objects?

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



pgsql-hackers by date:

Previous
From: Masahiko Sawada
Date:
Subject: Re: Support logical replication of DDLs
Next
From: John Naylor
Date:
Subject: Re: Overhauling "Routine Vacuuming" docs, particularly its handling of freezing