RE: Support logical replication of DDLs - Mailing list pgsql-hackers
From | houzj.fnst@fujitsu.com |
---|---|
Subject | RE: Support logical replication of DDLs |
Date | |
Msg-id | OS0PR01MB5716DBA31AC57163D2CF465E94B09@OS0PR01MB5716.jpnprd01.prod.outlook.com Whole thread Raw |
In response to | Re: Support logical replication of DDLs (Zheng Li <zhengli10@gmail.com>) |
Responses |
RE: Support logical replication of DDLs
Re: Support logical replication of DDLs |
List | pgsql-hackers |
On Saturday, June 18, 2022 3:38 AM Zheng Li <zhengli10@gmail.com> wrote: > On Wed, Jun 15, 2022 at 12:00 AM Amit Kapila <amit.kapila16@gmail.com> > wrote: > > > > On Wed, Jun 15, 2022 at 5:44 AM Zheng Li <zhengli10@gmail.com> wrote: > > > > > > > > > While I agree that the deparser is needed to handle the potential > > > syntax differences between the pub/sub, I think it's only relevant > > > for the use cases where only a subset of tables in the database are > > > replicated. For other use cases where all tables, functions and > > > other objects need to be replicated, (for example, creating a > > > logical replica for major version upgrade) there won't be any syntax > > > difference to handle and the schemas are supposed to match exactly > > > between the pub/sub. In other words the user seeks to create an > > > identical replica of the source database and the DDLs should be > > > replicated as is in this case. > > > > > > > I think even for database-level replication we can't assume that > > source and target will always have the same data in which case "Create > > Table As ..", "Alter Table .. " kind of statements can't be replicated > > as it is because that can lead to different results. > "Create Table As .." is already handled by setting the skipData flag of the > statement parsetreee before replay: > > /* > * Force skipping data population to avoid data inconsistency. > * Data should be replicated from the publisher instead. > */ > castmt->into->skipData = true; > > "Alter Table .. " that rewrites with volatile expressions can also be handled > without any syntax change, by enabling the table rewrite replication and > converting the rewrite inserts to updates. ZJ's patch introduced this solution. > I've also adopted this approach in my latest patch > 0012-Support-replication-of-ALTER-TABLE-commands-that-rew.patch > > > The other point > > is tomorrow we can extend the database level option/syntax to exclude > > a few objects (something like [1]) as well in which case we again need > > to filter at the publisher level > > I think for such cases it's not full database replication and we could treat it as > table level DDL replication, i.e. use the the deparser format. Hi, Here are some points in my mind about the two approaches discussed here. 1) search_patch vs schema qualify Again, I still think it will bring more flexibility and security by schema qualify the objects in DDL command as mentioned before[1]. Besides, a schema qualified DDL is also more appropriate for other use cases(e.g. a table-level replication). As it's possible the schema is different between pub/sub and it's easy to cause unexpected and undetectable failure if we just log the search_path. It makes more sense to me to have the same style WAL log(schema qualified) for both database level or table level replication as it will bring more flexibility. > "Create Table As .." is already handled by setting the skipData flag of the > statement parsetreee before replay: 2) About the handling of CREATE TABLE AS: I think it's not a appropriate approach to set the skipdata flag on subscriber as it cannot handle EXECUTE command in CTAS. CREATE TABLE q5_prep_results AS EXECUTE q5(200, 'DTAAAA'); The Prepared statement is a temporary object which we don't replicate. So if you directly execute the original SQL on subscriber, even if you set skipdata it will fail. I think it difficult to make this work as you need handle the create/drop of this prepared statement. And even if we extended subscriber's code to make it work, it doesn't seems like a standard and elegant approach. > "Alter Table .. " that rewrites with volatile expressions can also be handled > without any syntax change, by enabling the table rewrite replication and > converting the rewrite inserts to updates. ZJ's patch introduced this solution. 3) About the handling of ALTER TABLE rewrite. The approach I proposed before is based on the event trigger + deparser approach. We were able to improve that approach as we don't need to replicate the rewrite in many cases. For example: we don't need to replicate rewrite dml if there is no volatile/mutable function. We should check and filter these case at publisher (e.g. via deparser) instead of checking that at subscriber. Besides, as discussed, we need to give warning or error for the cases when DDL contains volatile function which would be executed[2]. We should check this at publisher as well(via deparser). > I think for such cases it's not full database replication and we could treat it as > table level DDL replication, i.e. use the the deparser format. 4) I think the point could be that we should make the WAL log format extendable so that we can extend it to support more useful feature(table filter/schema maps/DDL filter). If we just WAL log the original SQL, it seems it's difficult to extend it in the future ? [1] https://www.postgresql.org/message-id/202204081134.6tcmf5cxl3sz%40alvherre.pgsql [2] https://www.postgresql.org/message-id/CAA4eK1JVynFsj%2BmcRWj9sewR2yNUs6LuNxJ0eN-gNJ83oKcUOQ%40mail.gmail.com Best regards, Hou zj
pgsql-hackers by date: