Re: Support logical replication of DDLs - Mailing list pgsql-hackers

From Peter Smith
Subject Re: Support logical replication of DDLs
Date
Msg-id CAHut+PuSwLYTvo042SXDrSsCcmj7BPqxJFP4hvUQ5KMmC2L5Cw@mail.gmail.com
Whole thread Raw
In response to Re: Support logical replication of DDLs  (Zheng Li <zhengli10@gmail.com>)
Responses Re: Support logical replication of DDLs
Re: Support logical replication of DDLs
List pgsql-hackers
Hi, authors on this thread.

The patch v32-0001 is very large, so it will take some time to review
the code in detail.

Meanwhile, here are some general comments about the patch:

======

1. It might be useful to add this thread to the commitfest, if only so
the cfbot can discover the latest patch set and alert about any rebase
problems.

~~~

2. User interface design/documentation?

Please consider adding user interface documentation, so it is
available for review sooner than later. This comment was previous
posted 2 weeks ago [1] but no replies.

I can only guess (from the test of patch 0004) that the idea is to use
another 'ddl' option for the 'publish' parameter:
CREATE PUBLICATION mypub FOR ALL TABLES with (publish = 'insert,
update, delete, ddl');

My first impression is that a blanket option like that could be
painful for some users who DO (for example) want the convenience of
the DDL replication automagically creating new tables on the fly, but
maybe they do NOT want the side-effect of replicating every other kind
of DDL as well.

Maybe such scenarios can be handled by another publication parameter
which can allow more fine-grained DDL replication like:
CREATE PUBLICATION mypub FOR ALL TABLES WITH (ddl = 'tables')

I also have lots of usability questions but probably the documentation
would give the answers to those. IMO the docs for the user interface
and runtime behaviours should not be deferred - they should form part
of this patch 0001.

~~~

3. Why all-or-nothing?

The current strategy for this thread appears to be to implement
*everything* in the underlying code, and then figure out what to do
with it. I'm not sure if the all-or-nothing approach is best - It just
feels risky to me, so I hope it will not result in a ton of work that
ends up unused.

Why not instead implement just some core set of the DDL replications
that are the most wanted ones (e.g. create/alter/drop
tables/whatever?) and try to get that subset committed first? Then the
remainder can be added incrementally. But this harks back to comment
#2: the user interface would need to allow flexibility to do it like
this.

~~~

4. Function name inconsistency.

Even if it was not obvious from the posts, it is clear there are
multiple authors. As an example, the file
src/backend/commands/ddl_deparse.c is 9200+ lines (and growing
rapidly) and the functions in this module are a mixture of many
different naming conventions and they seem scattered around the source
file in different ways.

I suggest this all needs to be brought under some control ASAP, by
introducing some strict naming convention and sticking to it.

For example, you might do something like:
* xxx_util_foo()
* xxx_util_bah()
* xxx_deparse_alter()
* xxx_deparse_create()
* xxx_whatever()
where xxx is the main object (table, sequence, schema, etc).

Then order everything alphabetically, so that related stuff ends up
together. IMO grouping functions like this will also make reviewing
different objects far easier.

~~~

5. File size

As mentioned above in #4, the src/backend/commands/ddl_deparse.c is
huge (9200+ lines as at v32-0001). It is already unwieldy. Is there
some way to reduce this? For example, perhaps many of those
"utility/helper" functions (even though they are static) would be
better moved out to another file simply to get things down to a more
manageable size.

------
[1] https://www.postgresql.org/message-id/CAHut%2BPtKiTmcQ7zXs6YvR-qtuMQ9wgffnfamqCAVpM_ETa2LCg%40mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia



pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: pg_recvlogical prints bogus error when interrupted
Next
From: Andres Freund
Date:
Subject: Re: Proposal to use JSON for Postgres Parser format