Re: Support logical replication of DDLs, take2 - Mailing list pgsql-hackers

From Masahiko Sawada
Subject Re: Support logical replication of DDLs, take2
Date
Msg-id CAD21AoCF=4HSCfjyAjRbgAgzQGd9hpdqV2Lq4Xuw21+h+dkA+g@mail.gmail.com
In response to Re: Support logical replication of DDLs, take2  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
On Mon, May 4, 2026 at 5:23 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, May 1, 2026 at 2:11 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Wed, Apr 29, 2026 at 9:44 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Wed, Apr 29, 2026 at 3:19 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > >
> > > > On Mon, Apr 27, 2026 at 11:32 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > >
> > > > > Yes, there will be a maintenance cost of the JSON-based deparsing
> > > > > approach. But note that multiple senior people (Alvaro Herrera, Robert
> > > > > Haas) [1] seem to favor that approach. So, I am not sure we can
> > > > > conclude to abandon that approach without those people or some other
> > > > > senior people agreeing to abandon it. To be clear, I am not against
> > > > > considering a new/different approach for DDL replication, just that
> > > > > it is not clear the old/existing approach can be ruled out without
> > > > > more discussion on it.
> > > >
> > > > Thank you for pointing it out. Just to be clear, IIUC what they liked
> > > > was to use JSON string representation of DDLs, but not JSON string
> > > > representation of DDLs that are deparsed from parse nodes, no?
> > > >
> > >
> > > As per my understanding, we built the deparsing stuff with the goal of
> > > supporting DDL replication, and Alvaro was the original author of that
> > > work, see [1]. The benefit is that it provides flexibility in terms of
> > > filtering by the decoding plugin, if any, or changing the DDL (like
> > > schema-mapping) during apply. It is not clear to me if we can achieve
> > > a similar level of flexibility with the other approach.
> >
> > I think we can generate the same JSON-string representation of DDLs
> > from catalog information, though it would also require a lot of code.
> > It would be independent from parse nodes, and if we implement it as an
> > option for pg_get_xxx_ddl() functionality it could be reused by other
> > tools too.
> >
>
> IIRC, this was discussed previously as well but we were not sure if we
> can build all (especially some complex ones) without parsetree. See
> discussion/emails around [1][2].

Right. We would still need the parsetree to some extent. I think we can
generate CREATE TABLE and DROP TABLE statements for a particular table
without the parsetree. But generating CREATE TABLE for a table really
means generating a combination of CREATE/ALTER TABLE/INDEX/SEQUENCE
statements, like pg_dump does. For instance, if a user executes
"CREATE TABLE foo (id serial primary key)", we create a table, a
sequence, and an index, but searching the system catalogs doesn't tell
us that these objects were created in one statement. So we would
generate multiple DDLs as follows:

CREATE TABLE public.foo (id integer NOT NULL);
CREATE SEQUENCE public.foo_id_seq AS integer ...;
ALTER SEQUENCE public.foo_id_seq OWNED BY public.foo.id;
ALTER TABLE public.foo ADD CONSTRAINT foo_pkey PRIMARY KEY (id);

While these queries create the same table as the one on the publisher,
we need to consider whether it's okay to replicate these queries
instead of the one statement originally executed on the publisher. If
we can use something like pg_get_table_ddl() in DDL replication, that
function could also be used by the initial schema synchronization.
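
To illustrate the shape of the interface (pg_get_table_ddl() is a
hypothetical name here, not an existing function), both DDL replication
and initial schema synchronization could call something like:

SELECT pg_get_table_ddl('public.foo'::regclass);

which would return the catalog-derived statements above as text.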

As for ALTER TABLE, we would need the parsetree of the ALTER TABLE subcommands.
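
For instance (a made-up case), after a statement like:

ALTER TABLE public.foo ALTER COLUMN id TYPE bigint USING id::bigint, SET (fillfactor = 70);

the catalogs only reflect the final state: we cannot tell from them
that the subcommands were part of one statement, and the USING
expression is not stored anywhere after execution, so only the
parsetree can reproduce it.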

>
> > >
> > > >
> > > > >
> > > > >  We would need to maintain the JSON serialization code whenever
> > > > > > creating or modifying parse nodes, regardless of whether the changes
> > > > > > were related to DDL replication. IIUC, this was the primary reason the
> > > > > > feature didn't cross the finish line.
> > > > > >
> > > > > > Additionally, I think there is another design issue: it is not
> > > > > > output-plugin agnostic. Since the deparsed DDL was written by a
> > > > > > logical-replication-specific event trigger, third-party logical
> > > > > > decoding plugins cannot easily detect DDL events.
> > > > > >
> > > > >
> > > > > Why RmgrId like RM_LOGICALDDLMSG_ID and XLOG_LOGICAL_DDL_MESSAGE wal
> > > > > info is not sufficient for this? Decoder will add a message like
> > > > > REORDER_BUFFER_CHANGE_DDL which can be used to detect DDL message, no?
> > > >
> > > > Right, but I'm not sure this is a good developer experience that
> > > > additional steps are required to capture DDL events for other plugins
> > > > while changes of  INSERT/UPDATE/DELETE/TRUNCATE are passed from the
> > > > logical decoding by default.
> > > >
> > >
> > > Yes, there could probably be additional steps for plugins but they
> > > must be doing a few things already which are defined at publication
> > > level like column lists, row filtering, something related to RI, etc.
> >
> > I think those publication-level features operate at a somewhat
> > different layer than the fundamental mechanism of capturing DDL
> > events. Plugins filter rows or columns based on configuration, but the
> > logical decoding itself guarantees that DML events are reliably
> > passed to them. Given that TRUNCATE in logical replication already
> > works that way, I guess DDL should have the same fundamental guarantee.
> >
> > It's unclear to me how plugins could reliably manage these event
> > triggers. While a plugin might create an event trigger during the
> > startup callback if it doesn't exist, it cannot drop it during the
> > shutdown callback. We also cannot establish a dependency between an
> > event trigger and a logical replication slot. We would likely need to
> > invent a new plugin callback specifically invoked at slot drop time
> > just to clean it up. Also, if different plugins want to capture DDL
> > events, they could end up registering different event triggers,
> > emitting multiple DDL WAL records for the same DDL event.
> >
>
> Why would different plugins end up registering different event
> triggers? I mean if they are already registered by the first plugin
> what is the need to re-register.
>

Since you mentioned column lists and row filtering as examples of what
individual plugins already do, in reply to my point that registering
event triggers could be an additional step for other plugins, I
thought you meant that having each plugin register event triggers
would not be hugely cumbersome. I agree that different plugins don't
need to register different event triggers: we can have common event
triggers that write logical-DDL WAL and register them when the first
logical slot is created. But as I mentioned, we need to be careful
about both concurrent slot creation/drop and the fact that slot
creation/drop operations are not transactional. Also, we cannot create
event triggers on replicas even if a logical slot is created there. If
a failover happens before the WAL that creates the event triggers is
applied, we would need to somehow make sure that event triggers are
created on the new primary if it has logical slots.
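
For reference, a common event trigger along these lines could be
sketched as below; the function name ddl_capture and the message
prefix 'ddl' are made up for illustration, and it only emits the
command tag rather than a fully deparsed DDL:

CREATE FUNCTION ddl_capture() RETURNS event_trigger
LANGUAGE plpgsql AS $$
DECLARE
  cmd record;
BEGIN
  -- report each completed DDL subcommand of the statement
  FOR cmd IN SELECT * FROM pg_event_trigger_ddl_commands() LOOP
    -- emit a transactional logical message so decoding plugins see it
    PERFORM pg_logical_emit_message(true, 'ddl', cmd.command_tag);
  END LOOP;
END;
$$;

CREATE EVENT TRIGGER capture_ddl ON ddl_command_end
  EXECUTE FUNCTION ddl_capture();

This is exactly the kind of trigger that we could not drop at slot
drop time or create on a replica, per the issues above.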

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com


