Re: Support logical replication of DDLs - Mailing list pgsql-hackers
From | Zheng Li |
---|---|
Subject | Re: Support logical replication of DDLs |
Date | |
Msg-id | CAAD30UJ8v7RAxfLLx-nVO9COKNeBiKVzcwu1LdCe5dZJU2bUDA@mail.gmail.com Whole thread Raw |
In response to | Re: Support logical replication of DDLs (Dilip Kumar <dilipbalaut@gmail.com>) |
Responses |
Re: Support logical replication of DDLs
|
List | pgsql-hackers |
Hi Dilip, Thanks for the feedback. > > > > The table creation WAL and table insert WAL are available. The tricky > > > > part is how do we break down this command into two parts (a normal > > > > CREATE TABLE followed by insertions) either from the parsetree or the > > > > WALs. I’ll have to dig more on this. > > I had put some more thought about this, basically, during CTAS we are > > generating the CreateStmt inside "create_ctas_internal" and executing > > it first before inserting the tuple, so can't we generate the > > independent sql just for creating the tuple maybe using deparsing or > > something? Yes, deparsing might help for edge cases like this. However I found a simple solution for this specific case: The idea is to force skipping any direct data population (which can potentially cause data inconsistency on the subscriber) in CREATE AS and SELECT INTO command on the subscriber by forcing the skipData flag in the intoClause of the parsetree after the logical replication worker parses the command. The data sync will be taken care of by the DML replication after the DDL replication finishes. This is implemented in the latest commit: https://github.com/zli236/postgres/commit/116c33451da8d44577b8d6fdb05c4b6998cd0167 > > Apart from that I have one more question, basically if you are > > directly logging the sql query then how you are identifying under > > which schema you need to create that table, are you changing the sql > > and generating schema-qualified name? > > I was going through the patch and it seems you are logging the search > path as well along with the query so I think this will probably work. Yes, currently we log the search path as well as the user name. And we enforce the same search path and user name when applying the DDL command on the subscriber. > I have got one more query while looking into the code. In the below > code snippet you are logging DDL command only if it is a top level > query but there are no comments explaining what sort of queries we > don't want to log. Suppose I am executing a DDL statement inside a PL > then that will not be a top level statement so is your intention to > block that as well or that is an unintentional side effect? > > + /* > + * Consider logging the DDL command if logical logging is > enabled and this is > + * a top level query. > + */ > + if (XLogLogicalInfoActive() && isTopLevel) > + LogLogicalDDLCommand(parsetree, queryString); Good catch. The reason for having isTopLevel in the condition is because I haven't decided if a DDL statement inside a PL should be replicated from the user point of view. For example, if I execute a plpgsql function or a stored procedure which creates a table under the hood, does it always make sense to replicate the DDL without running the same function or stored procedure on the subscriber? It probably depends on the specific use case. Maybe we can consider making this behavior configurable by the user. Thanks, Zheng
pgsql-hackers by date: