Re: Support logical replication of DDLs - Mailing list pgsql-hackers
From | Euler Taveira |
---|---|
Subject | Re: Support logical replication of DDLs |
Date | |
Msg-id | 3c646317-df34-4cb3-9365-14abeada6587@www.fastmail.com Whole thread Raw |
In response to | Re: Support logical replication of DDLs (Dilip Kumar <dilipbalaut@gmail.com>) |
Responses |
Re: Support logical replication of DDLs
Re: Support logical replication of DDLs Re: Support logical replication of DDLs |
List | pgsql-hackers |
On Thu, Apr 14, 2022, at 6:26 AM, Dilip Kumar wrote:
I agree. But here the bigger question is what is the correct behaviorin case of the Alter Table? I mean for example in the publisher thetable gets rewritten due to the Access Method change then what shouldbe the behavior of the subscriber. One expected behavior is that onsubscriber also the access method gets changed and the data remainsthe same as on the subscriber table(table can be locally rewrittenbased on new AM). Which seems quite sensible behavior to me. But ifwe want this behavior then we can not replay the logical messagesgenerated by DML WAL because of table rewrite, otherwise we will getduplicate data, unless we plan to get rid of the current data and justget all new data from the publisher. And if we do that then the datawill be as per the latest data in the table based on the publisher, soI think first we need to define the correct behavior and then we candesign it accordingly.
You should forbid it. Unless you can decompose the command into multiple SQL
commands to make it a safe operation for logical replication.
Let's say you want to add a column with a volatile default.
ALTER TABLE foo ADD COLUMN bar double precision DEFAULT random();
If you replicate the DDL command as is, you will have different data
downstream. You should forbid it. However, this operation can be supported if
the DDL command is decomposed in multiple steps.
-- add a new column without DEFAULT to avoid rewrite
ALTER TABLE foo ADD COLUMN bar double precision;
-- future rows could use the DEFAULT expression
-- it also doesn't rewrite the table
ALTER TABLE foo ALTER COLUMN bar SET DEFAULT random();
-- it effectively rewrites the table
-- all rows are built from one source node
-- data will be the same on all nodes
UPDATE foo SET bar = random();
The ALTER TABLE ... ALTER COLUMN ... TYPE has a similar issue. This DDL command
can be decomposed to avoid the rewrite. If you are changing the data type, in
general, you add a new column and updates all rows doing the proper conversion.
(If you are updating in batches, you usually add a trigger to automatically
adjust the new column value for INSERTs and UPDATEs. Another case is when you
are reducing the the typmod (for example, varchar(100) to varchar(20)). In this
case, the DDL command can de decomposed removing the typmod information (ALTER
TABLE ... ALTER COLUMN ... TYPE varchar) and replacing it with a CHECK
constraint.
I didn't review this patch in depth but we certainly need to impose some DDL
restrictions if we are replicating DDLs. There are other cases that should be
treated accordingly such as a TABLESPACE specification or a custom data type.
pgsql-hackers by date: