Re: Support logical replication of DDLs - Mailing list pgsql-hackers

From Euler Taveira
Subject Re: Support logical replication of DDLs
Msg-id 45d0d97c-3322-4054-b94f-3c08774bbd90@www.fastmail.com
In response to Re: Support logical replication of DDLs  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
On Mon, Apr 11, 2022, at 2:00 AM, Amit Kapila wrote:
On Thu, Apr 7, 2022 at 3:46 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Mar 23, 2022 at 10:39 AM Japin Li <japinli@hotmail.com> wrote:
>
> 2. For DDL replication, do we need to wait for a consistent point of
> snapshot? For DMLs, that point is a convenient point to initialize
> replication from, which is why we export a snapshot at that point,
> which is used to read normal data. Do we have any similar needs for
> DDL replication?
>

I have thought a bit more about this. I think we need to build the
snapshot for DML replication, since we need to read catalog tables to
decode the corresponding WAL, but it is not clear to me whether we have
a similar requirement for DDL replication. If catalog access is
required, then it makes sense to follow the current snapshot model;
otherwise, we may need to think differently for DDL replication.

One more related point: for DML replication, we ensure that we copy the
table's entire data (via initial sync), including data that existed
before the publication for that table was created. Do we want to do
something similar for DDLs? How do we sync the schema of the table
before the user has defined the publication? Say the table was created
before the publication was defined, and after that there are only ALTER
statements. Do we expect users to create the table on the subscriber so
that we can replicate the ALTER statements? And even if we do that, it
won't be clear which ALTER statements will be replicated after the
publication is defined, especially if those ALTERs happen concurrently
with defining the publication.
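The problematic sequence described above can be sketched as follows (the table, publication, and column names are made up for illustration):

```sql
-- On the publisher:
CREATE TABLE t (a int);                    -- created before any publication exists
CREATE PUBLICATION pub FOR TABLE t;
ALTER TABLE t ADD COLUMN b text;           -- after the publication; replicate this?
ALTER TABLE t ALTER COLUMN a TYPE bigint;  -- possibly concurrent with CREATE PUBLICATION
```

Without some form of initial schema sync, the subscriber must already have `t` with the correct definition before any replicated ALTER can be applied.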
The *initial* DDL replication is a different problem from subsequent DDL
replication. The former requires a snapshot to read the current catalog data and
build a CREATE command as part of the subscription process. Subsequent DDLs on
that object will be handled by a different approach, which is being discussed here.
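For comparison, the existing table-sync path already exports a snapshot when the sync slot is created, and a schema-sync step could conceivably piggyback on the same mechanism. A rough sketch (the slot name and snapshot identifier below are placeholders, not actual values):

```sql
-- On a replication connection to the publisher:
CREATE_REPLICATION_SLOT sub1_sync LOGICAL pgoutput (SNAPSHOT 'export');

-- On a regular connection, read catalog state as of that snapshot:
BEGIN ISOLATION LEVEL REPEATABLE READ;
SET TRANSACTION SNAPSHOT '00000003-00000002-1';  -- name returned by the command above
-- ... read catalogs here to build the CREATE commands ...
COMMIT;
```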

I'm planning to work on the initial DDL replication. I'll open a new thread as
soon as I write a design for it. Just as an example, the pglogical approach is
to use pg_dump behind the scenes to provide the schema [1]. It is a reasonable
approach, but an optimal solution would be an API that provides the initial DDL
commands. I mean, the main point of this feature is to have an API that returns
the commands to create an object, which logical replication can use for initial
schema synchronization. This "DDL to create an object" was already discussed in
the past [2].
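As a rough sketch of the pg_dump-based approach (the table name and connection strings are placeholders), pglogical-style schema sync amounts to something like:

```shell
# Dump only the schema of the published table from the publisher
# and restore it on the subscriber before the initial data copy.
pg_dump --schema-only --table=public.t \
    "host=publisher dbname=src" | psql "host=subscriber dbname=dst"
```

An API that returns the CREATE commands directly would avoid shelling out to pg_dump and could read the catalogs under the exported snapshot.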




--
Euler Taveira
