RE: Support logical replication of DDLs - Mailing list pgsql-hackers

From houzj.fnst@fujitsu.com
Subject RE: Support logical replication of DDLs
Date
Msg-id OS0PR01MB5716B1526C2EDA66907E733B94B39@OS0PR01MB5716.jpnprd01.prod.outlook.com
Whole thread Raw
In response to RE: Support logical replication of DDLs  ("houzj.fnst@fujitsu.com" <houzj.fnst@fujitsu.com>)
Responses Re: Support logical replication of DDLs
Re: Support logical replication of DDLs
Re: Support logical replication of DDLs
List pgsql-hackers
On Monday, June 20, 2022 11:32 AM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote:
> 
> On Saturday, June 18, 2022 3:38 AM Zheng Li <zhengli10@gmail.com> wrote:
> > On Wed, Jun 15, 2022 at 12:00 AM Amit Kapila <amit.kapila16@gmail.com>
> > wrote:
> > >
> > > On Wed, Jun 15, 2022 at 5:44 AM Zheng Li <zhengli10@gmail.com> wrote:
> > > >
> > > >
> > > > While I agree that the deparser is needed to handle the potential
> > > > syntax differences between the pub/sub, I think it's only relevant
> > > > for the use cases where only a subset of tables in the database are
> > > > replicated. For other use cases where all tables, functions and
> > > > other objects need to be replicated, (for example, creating a
> > > > logical replica for major version upgrade) there won't be any syntax
> > > > difference to handle and the schemas are supposed to match exactly
> > > > between the pub/sub. In other words the user seeks to create an
> > > > identical replica of the source database and the DDLs should be
> > > > replicated as is in this case.
> > > >
> > >
> > > I think even for database-level replication we can't assume that
> > > source and target will always have the same data in which case "Create
> > > Table As ..", "Alter Table .. " kind of statements can't be replicated
> > > as it is because that can lead to different results.
> > "Create Table As .." is already handled by setting the skipData flag of the
> > statement parsetreee before replay:
> >
> > /*
> > * Force skipping data population to avoid data inconsistency.
> > * Data should be replicated from the publisher instead.
> > */
> > castmt->into->skipData = true;
> >
> > "Alter Table .. " that rewrites with volatile expressions can also be handled
> > without any syntax change, by enabling the table rewrite replication and
> > converting the rewrite inserts to updates. ZJ's patch introduced this solution.
> > I've also adopted this approach in my latest patch
> > 0012-Support-replication-of-ALTER-TABLE-commands-that-rew.patch
> >
> > > The other point
> > > is tomorrow we can extend the database level option/syntax to exclude
> > > a few objects (something like [1]) as well in which case we again need
> > > to filter at the publisher level
> >
> > I think for such cases it's not full database replication and we could treat it as
> > table level DDL replication, i.e. use the the deparser format.
> 
> Hi,
> 
> Here are some points in my mind about the two approaches discussed here.
> 
> 1) search_patch vs schema qualify
> 
> Again, I still think it will bring more flexibility and security by schema qualify the
> objects in DDL command as mentioned before[1].
> 
> Besides, a schema qualified DDL is also more appropriate for other use
> cases(e.g. a table-level replication). As it's possible the schema is different
> between pub/sub and it's easy to cause unexpected and undetectable failure if
> we just log the search_path.
> 
> It makes more sense to me to have the same style WAL log(schema qualified)
> for
> both database level or table level replication as it will bring more
> flexibility.
> 
> 
> > "Create Table As .." is already handled by setting the skipData flag of the
> > statement parsetreee before replay:
> 
> 2) About the handling of CREATE TABLE AS:
> 
> I think it's not a appropriate approach to set the skipdata flag on subscriber
> as it cannot handle EXECUTE command in CTAS.
> 
> CREATE TABLE q5_prep_results AS EXECUTE q5(200, 'DTAAAA');
> 
> The Prepared statement is a temporary object which we don't replicate. So if
> you directly execute the original SQL on subscriber, even if you set skipdata
> it will fail.
> 
> I think it difficult to make this work as you need handle the create/drop of
> this prepared statement. And even if we extended subscriber's code to make it
> work, it doesn't seems like a standard and elegant approach.
> 
> 
> > "Alter Table .. " that rewrites with volatile expressions can also be handled
> > without any syntax change, by enabling the table rewrite replication and
> > converting the rewrite inserts to updates. ZJ's patch introduced this solution.
> 
> 3) About the handling of ALTER TABLE rewrite.
> 
> The approach I proposed before is based on the event trigger + deparser
> approach. We were able to improve that approach as we don't need to
> replicate
> the rewrite in many cases. For example: we don't need to replicate rewrite dml
> if there is no volatile/mutable function. We should check and filter these case
> at publisher (e.g. via deparser) instead of checking that at subscriber.
> 
> Besides, as discussed, we need to give warning or error for the cases when DDL
> contains volatile function which would be executed[2]. We should check this at
> publisher as well(via deparser).
> 
> 
> > I think for such cases it's not full database replication and we could treat it as
> > table level DDL replication, i.e. use the the deparser format.
> 
> 4) I think the point could be that we should make the WAL log format
> extendable
> so that we can extend it to support more useful feature(table filter/schema
> maps/DDL filter). If we just WAL log the original SQL, it seems it's difficult
> to extend it in the future ?

Attach the new version patch set which added support for CREATE/DROP/ATER
Sequence and CREATE/DROP Schema ddl commands which are provided by Ajin
Cherian off list.

The new version patch will also check function's volatility[1] in ALTER TABLE
command. If any function to be executed is volatile, we report an ERROR.
Whether WARNING is better to be used here is still under consideration.

[1] https://www.postgresql.org/message-id/CAA4eK1JVynFsj%2BmcRWj9sewR2yNUs6LuNxJ0eN-gNJ83oKcUOQ%40mail.gmail.com

Best regards,
Hou zj



Attachment

pgsql-hackers by date:

Previous
From: Bharath Rupireddy
Date:
Subject: Re: Use fadvise in wal replay
Next
From: Amit Kapila
Date:
Subject: Re: Use fadvise in wal replay