Re: Support logical replication of DDLs - Mailing list pgsql-hackers

From	Euler Taveira
Subject	Re: Support logical replication of DDLs
Date	April 11, 2022 15:46:06
Msg-id	45d0d97c-3322-4054-b94f-3c08774bbd90@www.fastmail.com
In response to	Re: Support logical replication of DDLs (Amit Kapila <amit.kapila16@gmail.com>)
Responses	Re: Support logical replication of DDLs Re: Support logical replication of DDLs
List	pgsql-hackers

On Mon, Apr 11, 2022, at 2:00 AM, Amit Kapila wrote:

> On Thu, Apr 7, 2022 at 3:46 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Wed, Mar 23, 2022 at 10:39 AM Japin Li <japinli@hotmail.com> wrote:
> >
> > 2. For DDL replication, do we need to wait for a consistent point of
> > snapshot? For DMLs, that point is a convenient point to initialize
> > replication from, which is why we export a snapshot at that point,
> > which is used to read normal data. Do we have any similar needs for
> > DDL replication?
> >
>
> I have thought a bit more about this and I think we need to build the
> snapshot for DML replication as we need to read catalog tables to
> decode the corresponding WAL but it is not clear to me if we have a
> similar requirement for DDL replication. If the catalog access is
> required then it makes sense to follow the current snapshot model,
> otherwise, we may need to think differently for DDL replication.
>
> One more related point is that for DML replication, we do ensure that
> we copy the entire data of the table (via initial sync) which exists
> even before the publication for that table exists, so do we want to do
> something similar for DDLs? How do we sync the schema of the table
> before the user has defined the publication? Say the table has been
> created before the publication is defined and after that, there are
> only Alter statements, so do we expect users to create the table on
> the subscriber and then we can replicate the Alter statements? And
> even if we do that it won't be clear which Alter statements will be
> replicated after publication is defined especially if those Alters
> happened concurrently with defining publications?

The *initial* DDL replication is a different problem from DDL replication. The
former requires a snapshot to read the current catalog data and build a CREATE
command as part of the subscription process. The subsequent DDLs on that object
will be handled by a different approach, which is what is being discussed here.
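
To make "read the current catalog data and build a CREATE command" concrete,
here is a minimal sketch in Python. It is only an illustration, not code from
any patch: the Column class, quote_ident() and build_create_table() are
hypothetical names, and the sketch assumes the column name, formatted type and
NOT NULL flag have already been read from the catalog under the snapshot. It
ignores defaults, constraints, indexes and everything else a real
implementation would need.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Column:
    """Hypothetical holder for column metadata already read from the catalog."""
    name: str        # column name
    type_name: str   # formatted type, e.g. "integer" or "character varying(20)"
    not_null: bool   # whether the column is declared NOT NULL


def quote_ident(ident: str) -> str:
    """Always double-quote an identifier, doubling any embedded quotes."""
    return '"' + ident.replace('"', '""') + '"'


def build_create_table(schema: str, table: str, columns: List[Column]) -> str:
    """Assemble a CREATE TABLE statement from the given column metadata."""
    col_defs = []
    for col in columns:
        definition = f"{quote_ident(col.name)} {col.type_name}"
        if col.not_null:
            definition += " NOT NULL"
        col_defs.append(definition)
    body = ",\n    ".join(col_defs)
    return (
        f"CREATE TABLE {quote_ident(schema)}.{quote_ident(table)} "
        f"(\n    {body}\n);"
    )
```

For example, build_create_table("public", "t", [Column("id", "integer", True)])
yields a CREATE TABLE statement for "public"."t" with a single NOT NULL column.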

I'm planning to work on the initial DDL replication. I'll open a new thread as
soon as I write a design for it. Just as an example, the pglogical approach is
to use pg_dump behind the scenes to provide the schema [1]. It is a reasonable
approach, but an optimal solution would be an API that provides the initial DDL
commands. I mean, the main point of this feature is to have an API that
produces the DDL to create an object, which logical replication can then use
for initial schema synchronization. This "DDL to create an object" was already
discussed in the past [2].
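
For comparison, the pg_dump-based approach can be sketched roughly as below.
This is an assumption-laden illustration and not pglogical's actual
invocation: schema_dump_command() is a hypothetical helper, and the database
name, snapshot name and table names are placeholders. It relies only on the
documented pg_dump options --schema-only, --snapshot, --table and --dbname, and
it builds the argument list without running anything.

```python
from typing import List


def schema_dump_command(dbname: str, snapshot: str, tables: List[str]) -> List[str]:
    """Build a pg_dump argument list that dumps only the schema of the given
    tables, as seen by a previously exported snapshot."""
    cmd = ["pg_dump", "--schema-only", f"--snapshot={snapshot}"]
    for table in tables:
        # One --table switch per table; pg_dump accepts the option repeatedly.
        cmd.append(f"--table={table}")
    cmd.append(f"--dbname={dbname}")
    return cmd
```

The resulting list could be handed to subprocess.run() on a host that can reach
the publisher; the snapshot name would have to come from a session that has
exported it and is still open.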

[1] https://github.com/2ndQuadrant/pglogical/blob/REL2_x_STABLE/pglogical_sync.c#L942

[2] https://www.postgresql.org/message-id/4E69156E.5060509%40dunslane.net

Euler Taveira

EDB https://www.enterprisedb.com/
