Re: Support logical replication of DDLs - Mailing list pgsql-hackers

From Zheng Li
Subject Re: Support logical replication of DDLs
Date
Msg-id CAAD30ULe-cZTELQJbcAahsOFoUO-ftMxorwBTmj7uYK=_=mwxg@mail.gmail.com
In response to RE: Support logical replication of DDLs  ("houzj.fnst@fujitsu.com" <houzj.fnst@fujitsu.com>)
Responses RE: Support logical replication of DDLs
List pgsql-hackers
> > > > > I've not looked at these patches in-depth yet but with this approach,
> > > > > how do you think we can handle the DDL syntax differences between
> > > > > major versions? DDL syntax or behavior could be changed by future
> > > > > changes and I think we need to somehow deal with the differences. For
> > > > > example, if the user uses logical replication for major version
> > > > > upgrade, the publisher is older than the subscriber. We might have to
> > > > > rewrite the DDL before applying it to the subscriber because the DDL
> > > > > executed on the publisher no longer works on a new PostgreSQL version
> > > >
> > > > I don't think we will allow this kind of situation to happen in the
> > > > first place for backward compatibility. If a DDL no longer works on a
> > > > new version of PostgreSQL, the user will have to change the application
> > > > code as well. So even if it happens for whatever reason, we could either
> > > > 1. fail the apply worker and let the user fix such DDL, because they'll
> > > > have to fix the application code anyway when this happens.
> > > > 2. add guard-rail logic in the apply worker to automatically fix such
> > > > DDL if possible, knowing the versions of the source and target. Similar
> > > > logic must have been implemented for pg_dump/restore/upgrade.
> > > >
> > > > > or we might have to add some options to the DDL before applying it
> > > > > in order to keep the same behavior. This seems to require a different
> > > > > solution from what the patch does for the problem you mentioned such
> > > > > as "DDL involving multiple tables where only some tables are
> > > > > replicated".
> > > >
> > > > First of all, this case can only happen when the customer chooses to
> > > > replicate only a subset of the tables in a database, in which case
> > > > table-level DDL replication is chosen instead of database-level DDL
> > > > replication (where all tables and DDLs are replicated). I think the
> > > > solution would be:
> > > > 1. make a best effort to detect such DDLs on the publisher and avoid
> > > > logging them in table-level DDL replication.
> > > > 2. the apply worker will fail to replay such a command due to missing
> > > > objects if it didn't get filtered on the publisher for some reason.
> > > > This should be rare, and I think it's OK even if it happens; we'll
> > > > find out why and fix it.
> > > >
> > >
> > > FWIW, both these cases could be handled with the deparsing approach,
> > > and the handling related to the drop of multiple tables where only a
> > > few are published is already done in the last POC patch shared by Ajin
> > > [1].
> > >
> >
> > Right. So I'm inclined to think that the deparsing approach is better on
> > this point, as well as on the point mentioned by Álvaro before [1].
>
> I agree. One more point about the deparsing approach is that it can also
> help replicate CREATE TABLE AS/SELECT INTO in a better way.
>
> The main idea of replicating CREATE TABLE AS is that we deparse the CREATE
> TABLE AS into a simple CREATE TABLE (without the subquery) command and
> WAL-log it after creating the table and before writing data into the table,
> and then replicate the incoming writes as normal INSERTs. In this approach,
> we don't execute the subquery on the subscriber, so we don't need to make
> sure that all the objects referenced in the subquery also exist on the
> subscriber. And this approach works for all kinds of commands
> (e.g. CREATE TABLE AS [SELECT][EXECUTE][VALUES]).
>
> One problem with this approach is that we cannot use the current event
> triggers to deparse or WAL-log the CREATE TABLE, because no event trigger is
> fired after creating the table and before inserting the data. To solve this,
> one idea is that we could directly add some code at the end of
> create_ctas_internal() to deparse and WAL-log it. Alternatively, we could
> introduce a new type of event trigger (table_create) which would be fired at
> the expected time so that we can use the trigger function to deparse and
> WAL-log. I am not sure which way is better. I temporarily used the second
> idea, which introduces a new type of event trigger, in the 0003 POC patch.

Hi, I agree that an intermediate structured format (with all objects
schema-qualified) makes it easier to handle syntax differences between
the publisher and the subscriber. Such a structured format is also
likely to be easier for other logical replication consumers to use.
However, to make things more maintainable, would it be better to use
the existing serialization/deserialization functions in
out/readfuncs.c to generate the parsetree representation of the DDL
command?

It turns out that support for DDL commands is mostly missing in
out/readfuncs.c at the moment; see the following comment from
outfuncs.c:

* Output support for raw parsetrees
* is somewhat incomplete, too; in particular, utility statements are
* almost entirely unsupported. We try to support everything that can
* appear in a raw SELECT, though.

So what about adding support for utility statements in out/readfuncs.c
so that nodeToString()/stringToNode() works on raw parsetrees of
utility statements? I think there'll be some benefits:
1. It's less code compared to introducing brand new
serialization/deserialization code for DDL commands.
2. It's more maintainable since hackers are already familiar with
out/readfuncs.c.
3. Support for utility statements in out/readfuncs.c may turn out to
be useful for future features that need serialization/deserialization
of DDL raw parsetrees.
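
To illustrate what this would look like, here is a rough sketch (not the
patch's actual code) of an outfuncs.c entry for DropStmt using the existing
WRITE_* field macros. The real change also needs the matching _readDropStmt()
in readfuncs.c and dispatch cases in outNode()/parseNodeString():

    /* sketch only; would live in src/backend/nodes/outfuncs.c */
    static void
    _outDropStmt(StringInfo str, const DropStmt *node)
    {
        WRITE_NODE_TYPE("DROPSTMT");

        WRITE_NODE_FIELD(objects);
        WRITE_ENUM_FIELD(removeType, ObjectType);
        WRITE_ENUM_FIELD(behavior, DropBehavior);
        WRITE_BOOL_FIELD(missing_ok);
        WRITE_BOOL_FIELD(concurrent);
    }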

In patch 0012 I explored the above idea by changing the logging format of
the logical ddlmessage to the serialization of the DDL command's raw
parsetree using nodeToString(); the apply worker deserializes the message
back into the raw parsetree by calling stringToNode().
Here are the detailed changes:
    - Add serialization/deserialization functions in outfuncs.c/readfuncs.c
      for CreateTableStmt, AlterTableStmt, DropStmt, CreateFunctionStmt and
      AlterFunctionStmt.
    - Modify the serialization process to always schema-qualify object names;
      this is done by outQualifiedName() and a change in _outRangeVar().
    - Change the input of LogLogicalDDLMessage() to use nodeToString(parsetree).
    - Change the apply worker to call stringToNode(ddlmessage) to get the raw
      parsetree back and then execute the parsetree directly.
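
Conceptually, the apply side of this round trip looks like the sketch below.
This is only an illustration of the idea, not the patch's literal code; in
particular, wrapping the rebuilt tree in a PlannedStmt and running it through
ProcessUtility() is an assumption about one way the tree could be executed:

    /* ddlmessage: the nodeToString() text taken from the logical DDL message */
    Node        *parsetree = stringToNode(ddlmessage);
    PlannedStmt *wrapper = makeNode(PlannedStmt);

    wrapper->commandType = CMD_UTILITY;
    wrapper->canSetTag = false;
    wrapper->utilityStmt = parsetree;
    wrapper->stmt_location = -1;
    wrapper->stmt_len = 0;

    /* execute the rebuilt utility statement; no query text is available */
    ProcessUtility(wrapper, "(replicated ddl)", false,
                   PROCESS_UTILITY_QUERY, NULL, NULL,
                   None_Receiver, NULL);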

This patch doesn't introduce any deparsing functionality yet, but it
provides the flexibility to edit the command being replayed by directly
modifying the raw parsetree on the apply worker (e.g. setting the
missing_ok flag for DropStmt is equivalent to adding "IF EXISTS" to the
statement).
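
For example (again just a sketch, not code from the patch), forcing the
IF EXISTS behavior on the apply side would amount to flipping one field of
the rebuilt parse tree before executing it:

    if (IsA(parsetree, DropStmt))
    {
        DropStmt *drop = (DropStmt *) parsetree;

        /* behave as if IF EXISTS had been written on the publisher */
        drop->missing_ok = true;
    }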

Thoughts?

Regards,
Zheng Li

