Re: Support logical replication of DDLs - Mailing list pgsql-hackers

From Zheng Li
Subject Re: Support logical replication of DDLs
Date
Msg-id CAAD30U+wX_acXs_xKJFDExUVE25xQrQPaXCxjYgb2OVG92iCtw@mail.gmail.com
Whole thread Raw
In response to Re: Support logical replication of DDLs  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Support logical replication of DDLs  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
On Sun, Mar 26, 2023 at 5:22 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

> I spent some time looking through this thread to try to get a sense
> of the state of things, and I came away quite depressed.  The patchset
> has ballooned to over 2MB, which is a couple orders of magnitude
> larger than anyone could hope to meaningfully review from scratch.
> Despite that, it seems that there are fundamental semantics issues
> remaining, not to mention clear-and-present security dangers, not
> to mention TODO comments all over the code.

Thanks for looking into this!

> I'm also less than sold on the technical details, specifically
> the notion of "let's translate utility parse trees into JSON and
> send that down the wire".  You can probably make that work for now,
> but I wonder if it will be any more robust against cross-version
> changes than just shipping the outfuncs.c representation.  (Perhaps
> it can be made more robust than the raw parse trees, but I see no
> evidence that anyone's thought much about how.)

I explored the idea of using the outfuncs.c representation in [1] and
found this existing format is not designed to be portable between
different major versions. So it can't be directly used for replication
without
serious modification. I think the DDL deparser is a necessary tool if
we want to be able to handle cross-version DDL syntax differences by
providing the capability to machine-edit the JSON representation.

> And TBH, I don't think that I quite believe the premise in the
> first place.  The whole point of using logical rather than physical
> replication is that the subscriber installation(s) aren't exactly like
> the publisher.  Given that, how can we expect that automated DDL
> replication is going to do the right thing often enough to be a useful
> tool rather than a disastrous foot-gun?  The more you expand the scope
> of what gets replicated, the worse that problem becomes --- for
> example, I don't buy for one second that "let's replicate roles"
> is a credible solution for the problems that come from the roles
> not being the same on publisher and subscriber.

I agree that a full fledged DDL deparser and DDL replication is too
big of a task for one patch. I think we may consider approaching this
feature in the following ways:
1. Phased development and testing as discussed in other emails.
Probably support table commands first (as they are the most common
DDLs), then the other commands in multiple phases.
2. Provide a subscription option to receive the DDL change, raise a
notice and to skip applying the change. The users can listen to the
DDL notice and implement application logic to apply the change if
needed. The idea is we can start gathering user feedback by providing
a somewhat useful feature (compared to doing nothing about DDLs), but
also avoid heading straight into the potential footgun situation
caused by automatically applying any mal-formatted DDLs.
3. As cross-version DDL syntax differences are expected to be uncommon
(in real workload), maybe we can think about other options to handle
such edge cases instead of fully automating it? For example, what
about letting the user specify how a DDL should be replicated on the
subscriber by explicitly providing two versions of DDL commands in
some way?

> I'm not sure how we get from here to a committable and useful feature,
> but I don't think we're close to that yet, and I'm not sure that minor
> iterations on a 2MB patchset will accomplish much.

About 1 MB of the patch are testing output files for the DDL deparser
(postgres/src/test/modules/test_ddl_deparse_regress/expected/).

Regards,
Zane

[1] https://www.postgresql.org/message-id/CAAD30U%2Boi6e6Vh_zAzhuXzkqUhagmLGD%2B_iyn2N9w_sNRKsoag%40mail.gmail.com



pgsql-hackers by date:

Previous
From: Dmitry Koval
Date:
Subject: Re: Add SPLIT PARTITION/MERGE PARTITIONS commands
Next
From: Tomas Vondra
Date:
Subject: Re: Add LZ4 compression in pg_dump