RE: Initial Schema Sync for Logical Replication - Mailing list pgsql-hackers

From houzj.fnst@fujitsu.com
Subject RE: Initial Schema Sync for Logical Replication
Date
Msg-id OS0PR01MB57168EFC9390B7591B01B66B94859@OS0PR01MB5716.jpnprd01.prod.outlook.com
In response to Re: Initial Schema Sync for Logical Replication  ("Euler Taveira" <euler@eulerto.com>)
List pgsql-hackers
On Friday, March 24, 2023 11:01 PM Euler Taveira <euler@eulerto.com> wrote:

Hi,

> On Fri, Mar 24, 2023, at 8:57 AM, houzj.fnst@fujitsu.com wrote:
> > First, I think the current publisher doesn't know the version number of the
> > client (subscriber), so we would need to check whether that is feasible. Also,
> > having client version number checks doesn't seem to be a good idea.
> 
> walrcv_server_version().

I don't think this function helps here, as it only reports the version of the
server being connected to (i.e. the publisher/walsender), not the subscriber's.

> > Besides, I thought about the problems that will happen if we try to support
> > replicating from a newer PG to an older PG. The following examples assume that
> > we support DDL replication in the mentioned PG versions.
> > 
> > 1) Assume we want to replicate from a newer PG to an older PG where partitioned
> >    tables have not been introduced. I think even if the publisher is aware of
> >    that, it doesn't have a good way to transform the partition-related commands.
> >    Maybe one could say we can transform them to inheritance tables, but I feel
> >    that introduces too much complexity.
> > 
> > 2) Another example is generated columns: replicating from a newer PG that has
> >    this feature to an older PG without it. I am not sure whether there is a way
> >    to transform this without causing inconsistent behavior.
> > 
> > Even if we decide to simply skip sending such unsupported commands or skip
> > applying them, the subsequent DML replication is likely to cause data
> > inconsistency.
>
> As I mentioned in a previous email [1], the publisher can contain code to
> decide if it can proceed or not, in case you are doing a downgrade. I said
> downgrade but it can also happen if we decide to deprecate a syntax. For
> example, when WITH OIDS was deprecated, pg_dump treats it as an acceptable
> removal. The transformation can be (dis)allowed by the protocol version or
> another constant [2].

If most new DDL-related features cannot be transformed for an old subscriber,
I don't see the point in supporting this use case.

I think cases like the removal of WITH OIDS are rare enough that we don't need
to worry about them, and they don't affect data consistency. But new DDL
features are different.

Beyond features like partitioning or generated columns, features like
nulls_not_distinct are also tricky to transform without causing inconsistent
behavior.
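
To make the version-gating idea under discussion concrete, here is a rough
sketch of what such a publisher-side check might look like. Everything in it is
hypothetical: the feature names, the table, and the decide() helper are
illustration only and do not exist in PostgreSQL, and it assumes the publisher
somehow knows the subscriber's version, which it does not today. The version
numbers are the releases that introduced each feature (PG 10, 12 and 15).

```python
from enum import Enum

# Hypothetical table: the first major release (in server_version_num form)
# that understands each DDL feature mentioned in this thread.
FEATURE_MIN_VERSION = {
    "partitioned_table": 100000,   # declarative partitioning, PG 10
    "generated_column": 120000,    # generated columns, PG 12
    "nulls_not_distinct": 150000,  # UNIQUE NULLS NOT DISTINCT, PG 15
}


class Decision(Enum):
    SEND = "send"      # subscriber understands the command as-is
    REJECT = "reject"  # stop with an error instead of silently skipping


def decide(features, subscriber_version):
    """Return (Decision, unsupported features) for one DDL command.

    `features` is the set of feature names the command uses;
    `subscriber_version` is assumed to be known (not the case today).
    Features missing from the table are treated as universally supported.
    """
    unsupported = sorted(
        f for f in features
        if FEATURE_MIN_VERSION.get(f, 0) > subscriber_version
    )
    return (Decision.REJECT if unsupported else Decision.SEND), unsupported
```

For example, decide({"generated_column"}, 110000) rejects the command for a
PG 11 subscriber. The sketch only ever sends or rejects: it has no "skip"
outcome, because skipping the command would let later DML diverge, and it has
no "transform" outcome, because I don't know of a safe generic transformation.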

> > So, it seems we cannot completely support this use case, there would be some
> > limitations. Personally, I am not sure if it's worth introducing complexity to
> > support it partially.
> 
> Limitations are fine; they have different versions. I wouldn't like to forbid
> downgrade just because I don't want to maintain compatibility with previous
> versions. IMO it is important to be able to downgrade in case of any
> incompatibility with an application. You might argue that this isn't possible
> due to time or patch size and that there is a workaround for it but I wouldn't
> want to close the door for downgrade in the future.

The biggest problem is the data inconsistency that it would cause. I am not
aware of a generic solution for replicating newly introduced DDL to an old
subscriber that wouldn't cause data inconsistency. Apart from that, IMO the
complexity and maintainability of the feature also matter.

Best Regards,
Hou zj
