Re: Support logical replication of DDLs - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Support logical replication of DDLs
Date
Msg-id CAA4eK1+6A72+3HNrbg12wGaAy=rYPDS+AFkbePtSj=h++1ir8Q@mail.gmail.com
Whole thread Raw
In response to Re: Support logical replication of DDLs  (Alvaro Herrera <alvherre@alvh.no-ip.org>)
Responses Re: Support logical replication of DDLs  (Zheng Li <zhengli10@gmail.com>)
Re: Support logical replication of DDLs  (Ajin Cherian <itsajin@gmail.com>)
List pgsql-hackers
On Fri, Apr 8, 2022 at 5:04 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
>
> On 2022-Apr-08, Amit Kapila wrote:
>
> > On Thu, Mar 17, 2022 at 3:36 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
>
> > For runtime conditions, one of the things you have mentioned in that
> > thread is to add schema name in the statement at the required places
> > which this patch deals with in a different way by explicitly sending
> > it along with the DDL statement.
>
> Hmm, ok.  The point of the JSON-blob route is that the publisher sends a
> command representation that can be parsed/processed/transformed
> arbitrarily by the subscriber using generic rules; it should be trivial
> to use a JSON tool to change schema A to schema B in any arbitrary DDL
> command, and produce another working DDL command without having to know
> how to write that command specifically.
>

Attached is a set of two patches as an attempt to evaluate this approach.

The first patch provides functions to deparse DDL commands. Currently,
it is restricted to just a simple CREATE TABLE statement, the required
code is extracted from one of the patches posted in the thread [1].

The second patch allows replicating simple CREATE TABLE DDL
replication. To do that we used an event trigger and DDL deparsing
facilities. While creating a publication, we register a command end
trigger that deparses the DDL as a JSON blob, and WAL logs it. The
event trigger is automatically removed at the time of drop
publication. The WALSender decodes the WAL and sends it downstream
similar to other DML commands. The subscriber then converts JSON back
to the DDL command string and executes it. In the subscriber, we also
add the newly added rel to pg_subscription_rel so that the DML changes
on the new table can be replicated without having to manually run
"ALTER SUBSCRIPTION ... REFRESH PUBLICATION". Some of the code related
to WAL logging and subscriber-side work is taken from the patch posted
by Zheng in this thread but there are quite a few changes in that as
we don't need schema, role, transaction vs. non-transactional
handling.

Note that for now, we have hacked Create Publication code such that
when the user specifies the "FOR ALL TABLES" clause, we invoke this
new functionality. So, this will work only for "FOR ALL TABLES"
publications. For example, we need to below to replicate the simple
Create Table command.

Publisher:
Create Publication pub1 For All Tables;

Subscriber:
Create Subscription sub1 Connection '...' Publication pub1;

Publisher:
Create Table t1(c1 int);

Subscriber:
\d should show t1.

As we have hacked CreatePublication function for this POC, the
regression tests are not passing but we can easily change it so that
we invoke new functionality with the syntax proposed in this thread or
with some other syntax and we shall do that in the next patch unless
this approach is not worth pursuing.

This POC is prepared by Ajin Cherian, Hou-San, and me.

Thoughts?

[1] - https://www.postgresql.org/message-id/20150215044814.GL3391%40alvh.no-ip.org

-- 
With Regards,
Amit Kapila.

Attachment

pgsql-hackers by date:

Previous
From: Matheus Alcantara
Date:
Subject: Trying to add more tests to gistbuild.c
Next
From: Tom Lane
Date:
Subject: Re: Fix typo in comment