Re: Support logical replication of DDLs - Mailing list pgsql-hackers
From | Amit Kapila |
---|---|
Subject | Re: Support logical replication of DDLs |
Date | |
Msg-id | CAA4eK1LPX71LzSye6ac-OOx6=wWq7U2PL-1tBf3zvncSko8Srw@mail.gmail.com Whole thread Raw |
In response to | Re: Support logical replication of DDLs ("Euler Taveira" <euler@eulerto.com>) |
Responses |
Re: Support logical replication of DDLs
|
List | pgsql-hackers |
On Thu, Apr 14, 2022 at 7:45 PM Euler Taveira <euler@eulerto.com> wrote: > > You should forbid it. Unless you can decompose the command into multiple SQL > commands to make it a safe operation for logical replication. > > Let's say you want to add a column with a volatile default. > > ALTER TABLE foo ADD COLUMN bar double precision DEFAULT random(); > > If you replicate the DDL command as is, you will have different data > downstream. You should forbid it. However, this operation can be supported if > the DDL command is decomposed in multiple steps. > > -- add a new column without DEFAULT to avoid rewrite > ALTER TABLE foo ADD COLUMN bar double precision; > > -- future rows could use the DEFAULT expression > -- it also doesn't rewrite the table > ALTER TABLE foo ALTER COLUMN bar SET DEFAULT random(); > > -- it effectively rewrites the table > -- all rows are built from one source node > -- data will be the same on all nodes > UPDATE foo SET bar = random(); > While thinking about this, I see more to it than this. Say, we are able to decompose/split the DDL command with the help of deparser, do we want to update the additional rows on the subscriber that didn't exist on the publisher? For example, A table on the publisher side has rows: ddl_test(foo) a ---- 1 2 The same table on the subscriber side has rows: ddl_test(foo) a ---- 1 2 3 4 Now, say, the user has added a bar column with "ALTER TABLE foo ADD COLUMN bar double precision NOT NULL DEFAULT random();" If we compare with replication of DMLs like (UPDATE ddl_test SET bar = random();), the replication won't update rows with values (3 and 4) on subscriber as they don't exist on the publisher. However, if we follow the same here for DDL replication of Alter, it will fail because of NOT NULL constraint. So, it seems we should update all the existing rows on the subscriber to make replication of such constraints successful. It seems that IBM's replication solution allows replication of such DDLs and does update all existing rows on the target table [1][2]. I think it would be tricky to update the rows in subscriber that doesn't exist in the publisher as we need to distinguish/find such rows during apply but I think we can find some way to achieve this if we decide to go this way. We can also conclude that we want to restrict the replication of Alter Table for such cases but as other databases seem to support this, I think it is worth trying to support such an operation. If it turns out to be too complex or not at all feasible then we can always exclude it from the first version. Thoughts? [1] - https://www.ibm.com/docs/en/idr/10.2.1?topic=replication-adding-existing-columns-subscription-unidirectional (... Columns are added to the target table with the same data type, null characteristic, and default value as the matching columns in the source table... Rows that existed in the target table before the new column is added will have a NULL or default value for the new column. [2] - https://www.ibm.com/docs/en/idr/11.4.0?topic=replication-alter-add-column-command-multidirectional -- With Regards, Amit Kapila.
pgsql-hackers by date: