Re: Support logical replication of DDLs, take2 - Mailing list pgsql-hackers
| From | Masahiko Sawada |
|---|---|
| Subject | Re: Support logical replication of DDLs, take2 |
| Date | |
| Msg-id | CAD21AoACTBsVvyQUKuWuM7qs6-WV4of9SkdUvSv9KHyC_W+K1A@mail.gmail.com |
| In response to | Re: Support logical replication of DDLs, take2 (Amit Kapila <amit.kapila16@gmail.com>) |
| List | pgsql-hackers |
On Mon, Apr 27, 2026 at 11:32 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Apr 21, 2026 at 4:45 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Mon, Feb 23, 2026 at 5:21 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > >
> > > One idea I'm experimenting with is that we define an abstract data
> > > type that can represent a DDL (like CollectedCommand) and write it to
> > > a new WAL record so that logical decoding processes it. For CREATE
> > > DDLs, we can use the pg_get_xxx_def() functions with a historical
> > > snapshot to get the DDLs. We would need to implement code to
> > > generate DROP and ALTER DDLs from the data. I believe DROP DDLs would
> > > not be hard. For ALTER DDLs, we would incur an initial implementation
> > > cost, but we would not change that code often.
> > >
> >
> > DDL support for logical replication is one of the biggest missing
> > pieces in logical replication. I'd like to resume this work for PG20.
> >
> > A lot of effort went into this feature through 2022 and 2023, but
> > development is currently inactive. The last patch was submitted on Jul
> > 18, 2023. I've reviewed the previous patches and discussions, and I
> > would like to summarize how DDL replication was implemented, the main
> > reasons it stalled, and propose an alternative design to address those
> > problems.
> >
> > The overall idea of the previous patch set was to implement DDL
> > deparsing and utilize it for DDL replication. It converted a parse
> > tree into a JSON string. For instance, if a user executes "DROP TABLE
> > t1", the deparser converts its parse tree:
> >
> > {DROPSTMT :objects (("t1")) :removeType 41 :behavior 0 :missing_ok
> > false :concurrent false}
> >
> > into:
> >
> > {"fmt": "DROP TABLE %{objidentity}s", "objidentity": "public.t1"}
> >
> > This JSON string is self-documenting, meaning someone who gets it can
> > easily reconstruct the original DDL with schema-qualified object
> > names. In a dedicated event trigger for logical replication, we
> > deparsed the parse tree of a DDL, wrote it into a WAL record, and then
> > logical decoding processed it similarly to DML changes.
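[Editor's note: to illustrate how self-documenting the deparsed form is, here is a minimal Python sketch, not taken from the patch set, that expands such a %{...}s placeholder string back into DDL text. The function name and the simplified placeholder grammar are assumptions; the real deparser supports richer placeholder kinds.]

```python
import re

def expand_fmt(msg: dict) -> str:
    """Expand a deparsed-DDL JSON object such as
    {"fmt": "DROP TABLE %{objidentity}s", "objidentity": "public.t1"}
    back into DDL text. Only the simple %{name}s case is handled here."""
    def substitute(match: re.Match) -> str:
        # Look up the placeholder name in the same JSON object.
        return str(msg[match.group(1)])
    return re.sub(r"%\{(\w+)\}s", substitute, msg["fmt"])

ddl = {"fmt": "DROP TABLE %{objidentity}s", "objidentity": "public.t1"}
print(expand_fmt(ddl))  # DROP TABLE public.t1
```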
> >
> > While there are several benefits to the JSON data approach mentioned
> > in the wiki [1] -- most notably the flexibility to easily remap
> > schemas (e.g., mapping "schema A" on the publisher to "schema B" on
> > the subscriber) -- there was a major concern: the huge maintenance
> > burden.
> >
>
> Yes, there will be a maintenance cost to the JSON-based deparsing
> approach. But note that multiple senior people (Alvaro Herrera, Robert
> Haas) [1] seem to favor that approach. So, I am not sure we can
> conclude to abandon that approach without those people, or some other
> senior people, agreeing to abandon it. To be clear, I am not against
> considering a new/different approach for DDL replication; it is just
> not clear that the old/existing approach can be ruled out without
> more discussion.
Thank you for pointing it out. Just to be clear, IIUC what they liked
was using a JSON string representation of DDLs, not a JSON
representation deparsed from parse nodes, no? I think if we version
the DDL commands sent to subscribers, we can support JSON-based DDLs
in later versions.
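[Editor's note: one way to read the versioning idea is that each DDL message carries a version field and the subscriber dispatches on it, so a later version could carry the JSON form. A hypothetical Python sketch; the field names and version numbers are invented for illustration and are not from the thread.]

```python
def apply_ddl_message(msg: dict) -> str:
    """Reconstruct the DDL text from a versioned message.
    Keys ("ver", "ddl") and version numbers are hypothetical."""
    ver = msg.get("ver", 1)
    if ver == 1:
        # v1: the publisher sends fully reconstructed DDL text.
        return msg["ddl"]
    if ver == 2:
        # v2 (future): a JSON representation, e.g. {"fmt": ..., ...},
        # would be expanded on the subscriber side instead.
        raise NotImplementedError("JSON-based DDL not supported yet")
    raise ValueError(f"unknown DDL message version: {ver}")

print(apply_ddl_message({"ver": 1, "ddl": "DROP TABLE public.t1"}))
```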
>
> > We would need to maintain the JSON serialization code whenever
> > creating or modifying parse nodes, regardless of whether the changes
> > were related to DDL replication. IIUC, this was the primary reason the
> > feature didn't cross the finish line.
> >
> > Additionally, I think there is another design issue: it is not
> > output-plugin agnostic. Since the deparsed DDL was written by a
> > logical-replication-specific event trigger, third-party logical
> > decoding plugins cannot easily detect DDL events.
> >
>
> Why are an RmgrId like RM_LOGICALDDLMSG_ID and XLOG_LOGICAL_DDL_MESSAGE
> WAL info not sufficient for this? The decoder will add a change like
> REORDER_BUFFER_CHANGE_DDL, which can be used to detect a DDL message, no?
Right, but I'm not sure it is a good developer experience that other
plugins require additional steps to capture DDL events, while
INSERT/UPDATE/DELETE/TRUNCATE changes are passed through logical
decoding by default.
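[Editor's note: the asymmetry can be sketched from a consumer's point of view. In the pgoutput protocol, Insert/Update/Delete/Truncate messages ('I'/'U'/'D'/'T') arrive by default, while generic logical messages ('M') only appear when the subscriber opts in with the messages option; a DDL delivered as a message-like change would face the same extra step. The dispatch below is illustrative Python, not the real pgoutput parser, and the handler names are assumptions.]

```python
# Map of pgoutput DML message-type bytes to their meaning; these are
# delivered to consumers by default.
DML_TYPES = {b"I": "insert", b"U": "update", b"D": "delete", b"T": "truncate"}

def classify(first_byte: bytes, ddl_handler=None) -> str:
    """Return how a consumer would treat a protocol message keyed on
    its first byte. ddl_handler stands in for the extra opt-in step."""
    if first_byte in DML_TYPES:
        return DML_TYPES[first_byte]   # passed through by default
    if first_byte == b"M":             # logical message
        if ddl_handler is None:
            return "ignored"           # the extra step was not taken
        return ddl_handler()
    return "other"

print(classify(b"I"))                  # insert
print(classify(b"M"))                  # ignored
print(classify(b"M", lambda: "ddl"))   # ddl
```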
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com