Home > mailing lists

Re: Support logical replication of DDLs, take2 - Mailing list pgsql-hackers

From	Hannu Krosing
Subject	Re: Support logical replication of DDLs, take2
Date	April 30 17:02:56
Msg-id	CAMT0RQS-Tr9kuP6ChJaL7UjzY1gbtnAqOuPHZesKVpSkwHJiVw@mail.gmail.com Whole thread
In response to	Re: Support logical replication of DDLs, take2 (Andres Freund <andres@anarazel.de>)
Responses	Re: Support logical replication of DDLs, take2
List	pgsql-hackers

Tree view

On Wed, Apr 29, 2026 at 2:10 PM Andres Freund <andres@anarazel.de> wrote:
>
> Hi,
>
> On 2026-04-29 10:07:04 +0200, Hannu Krosing wrote:
> > On Wed, Apr 29, 2026 at 5:39 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > > I am trying to understand your idea. If we are trying to deparse from
> > > an actual system table using a snapshot, why don't we just use the
> > > WAL? I mean, the WAL should contain the actual catalog modifications
> > > it has made.
> >
> > We have the full data in the catalog and we would likely need catalog
> > queries for any change, even when de-parsing the tree.
> >
> > And we should not add the extra load on the original DDL side, just as
> > we don't for DML.
>
> That can't be a relevant cost compared to everything else.

Probably not. But unless we somehow encode "everything" at that point
we will make building different DDL decoders harder down the line.

So why not just save the normally serialised parse tree at this point
and let the decoders decide to do whatever they need.

> > At most we could just serialize the statement tree into the WAL,
> > though even that may be an overkill if we can get the change from
> > existing records.
> >
> > - insert new row in pg_class --> extract the CREATE TABLE (or INDEX, or ...)
> > - update row in pg_class or insert, update or delete a row in
> > pg_attribute --> extract ALTER TABLE
> >   - except when it just updates relfilenod --> extract TRUNCATE
> > - delete row in pg_class --> DROP TABLE
> > - dml on pg_constraint --> ALTER TABLE
> >
> > ... etc
>
> That doesn't work in the general case, think of
> ALTER TABLE ... ALTER COLUMN ... TYPE foo USING (...)
>
> There's a big difference between USING(foo::int8) and USING (pg_size_bytes(foo))
> but it's nowhere visible in the WAL.

It can't be a big difference if it is not visible in the WAL.

Currently, we do treat DML exactly this way (or arguably worse).

In the following all the updates are decoded exactly the same

CREATE TABLE t(id int primary key, data text);

INSERT INTO t VALUES(1, 'one');

UPDATE t SET data='one' where id=1;
UPDATE t SET id=id;
UPDATE t SET id=10-9;
UPDATE t SET data='one';

ALL of the above get decoded as "UPDATE t SET data='one' where id=1;"

That is, we do not care how the values got there, as long as the end
result is the same.

And we do not track which fields were actually changed


The only reasons I see why we could not do the same for DDL are
1. it would be significantly more expensive to do so
or
2. we plan to fix some of that for DML as well and to start tracking
more of the intent in DML by extracting that from the statement trees.

--
Hannu

pgsql-hackers by date:

From: Dilip Kumar
Date: 30 April, 16:53:36
Subject: Re: Include schema-qualified names in publication error messages.

From: Ayush Tiwari
Date: 30 April, 17:29:38
Subject: Re: Spurious warnings in crypto-des.c when building with gcc-16 -O3

Re: Support logical replication of DDLs, take2 - Mailing list pgsql-hackers

Previous

Next