RE: Support logical replication of DDLs - Mailing list pgsql-hackers

From houzj.fnst@fujitsu.com
Subject RE: Support logical replication of DDLs
Date
Msg-id OS0PR01MB571688CA7CAAE8B01089311294AD9@OS0PR01MB5716.jpnprd01.prod.outlook.com
Whole thread Raw
In response to Re: Support logical replication of DDLs  (Zheng Li <zhengli10@gmail.com>)
Responses Re: Support logical replication of DDLs
List pgsql-hackers
On Wednesday, June 15, 2022 8:14 AM Zheng Li <zhengli10@gmail.com> wrote:
> 
> > Thanks for providing this idea.
> >
> > I looked at the string that is used for replication:
> >
> > """
> > {ALTERTABLESTMT :relation {RANGEVAR :schemaname public :relname foo
> > :inh true :relpersistence p :alias <> :location 12} :cmds ({ALTERTABLECMD
> > :subtype 0 :name <> :num 0 :newowner <> :def {COLUMNDEF :colname b
> > :typeName {TYPENAME :names ("public" "timestamptz") :typeOid 0 :setof
> > false :pct_type false :typmods <> :typemod -1 :arrayBounds <> :location
> > 29} :compression <> :inhcount 0 :is_local true :is_not_null false
> > :is_from_type false :storage <> :raw_default <> :cooked_default <>
> > :identity <> :identitySequence <> :generated <> :collClause <> :collOid 0
> > :constraints <> :fdwoptions <> :location 27} :behavior 0 :missing_ok
> > false}) :objtype 41 :missing_ok false}
> > """
> >
> > I think the converted parsetree string includes lots of internal
> > objects(e.g typeOid/pct_type/objtype/collOid/location/...). These are
> > unnecessary stuff for replication and we cannot make sure all the internal
> > stuff are consistent among pub/sub. So I am not sure whether replicating
> > this string is better.
> >
> > Besides, replicating the string from nodetostring() means we would need to
> > deal with the structure difference between the publisher and the
> > subscriber if any related structure has been changed which seems not good.
> 
> Yeah, this existing format is not designed to be portable between different
> major versions. So it can't directly be used for replication without
> serious modification.
> 
> > IMO, The advantages of the deparsing approach(as implemented in the POC
> > patch set[1]) are:
> >
> > 1) We can generate a command representation that can be
> > parsed/processed/transformed arbitrarily by the subscriber using generic
> > rules it(for example: user can easily replace the schema name in it) while
> > the results of nodetostring() seems not a standard json string, so I am
> > not sure can user reuse it without traversing the parsetree again.
> >
> > 2) With event_trigger + deparser, we can filter the unpublished objects
> > easier. For example: "DROP TABLE table_pub, table_unpub;". We can deparse
> > it into two commands "DROP TABLE table_pub" and "DROP TABLE
> table_pub" and
> > only publish the first one.
> >
> > 3) With deparser, we are able to query the catalog in the deparser to
> > build a complete command(filled with schemaname...) which user don't need
> > to do any other work for it. We don't need to force the subscriber to set
> > the same search_path as the publisher which give user more flexibility.
> >
> > 4) For CREATE TABLE AS, we can separate out the CREATE TABLE part with the
> > help of deparser and event trigger. This can avoid executing the subquery
> > on subcriber.
> >
> > 5) For ALTER TABLE command. We might want to filter out the DDL which use
> > volatile function as discussed in [2]. We can achieve this easier by
> > extending the deparser to check the functions used. We can even rebuild a
> > command without unsupported functions to replicate by using deparser.
> >
> > There may be more cases I am missing as we are still analyzing other DDLs.
> 
> How does the deparser deparses CREATE FUNCTION STATEMENT? Will it
> schema qualify
> objects inside the function definition?

The current deparser doesn't schema qualify objects inside the function
source as we won't know the schema of inner objects until the function is
executed. The deparser will only schema qualify the objects around
function declaration Like:

CREATE FUNCTION [public].test_func(i [pg_catalog].int4 ) RETURNS  [pg_catalog].int4 LANGUAGE plpgsql

Best regards,
Hou zj

pgsql-hackers by date:

Previous
From: Peter Smith
Date:
Subject: Re: Handle infinite recursion in logical replication setup
Next
From: Peter Eisentraut
Date:
Subject: Re: "buffer too small" or "path too long"?