Thread: Logical Replication of Multiple Schema Versions

Logical Replication of Multiple Schema Versions

From
Dan shmidt
Date:
Hi All,

Thanks for having such a great mailing list. I hope I'm sending to the correct distribution. 

I will start with describing my current architecture and where do I use logical replication.
I have several nodes each running a postgres-sql database. Each of these nodes has an API server which uses the database in order to retrieve data.
I recently created an aggregated node which according to the logical replication documentation is a common use-case "Consolidating multiple databases into a single one (for example for analytical purposes)."
The aggregation node also has an API server and should serve the same use cases as a regular node serves but with aggregated information.

My question is about schema upgrades. As very well documented logical replication does not replicate schema changes, thus all schema modifications should be done on both regular-nodes and aggregated-node.
Here are my constraints and relaxation points:
  • I would prefer having the freedom of making any schema change (i.e: removing/renaming a column). 
  • I must have the ability to upgrade each node separately and not taking all of the nodes down.
  • I'm willing to allow a rather big (2 week) downtime in the replication as long as the regular-node keeps on working. (Hopefully aggregated-node can still be operative, and not be up to date with will regular-nodes)
  • I'm willing to allow downtime of regular node as long as the process of upgrading the node is taking place.
I started with a naive approach:
  1. Take down all the nodes
  2. Schema upgrade for the aggregated-node
  3. Schema upgrade for the regular-nodes
  4. Start everything up
The problem I encountered with this approach is that while taking down the regular-nodes there might be some not yet replicated WAL entries containing schema prior to the upgrade, thus after step 4 above replication will fail.
My next approach was to let the logical replication "drain" until there are no changes in the database and then upgrading all of the nodes, but this breaks constraint #2.

What is the correct way to perform such an operation? 
Is there a way to keep constraint #1 or the only option is to not allow "breaking" schema changes between versions. 

Thank you,
Dan.

Re: Logical Replication of Multiple Schema Versions

From
Peter Eisentraut
Date:
On 2019-12-10 08:55, Dan shmidt wrote:
> What is the correct way to perform such an operation?
> Is there a way to keep constraint #1 or the only option is to not allow 
> "breaking" schema changes between versions.

It all depends on the specific schema changes you want to make.  You can 
add columns on the subscriber and remove columns on the publisher 
without breaking things (unless there are not-null constraints). 
Renaming columns will break replication until you rename them 
everywhere.  Column type changes will usually just work as long as the 
data fits into both the old and the new type.

You really need to carefully plan and test each class of scenarios 
separately.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Logical Replication of Multiple Schema Versions

From
Dan shmidt
Date:
Thanks Peter for your answer.

I was hoping to keep the freedom of performing any change to my schema.
Since the entire upgrade might take more than a week, there will be a time in which I have two publisher nodes with different schema versions (which might include a column rename).
I was hoping that were was a way to somehow apply some logic on the subscriber to convert one schema to another. I also believe that limiting schema changes only to those that won't break replication should suffice.

Dan.


From: Peter Eisentraut <peter.eisentraut@2ndquadrant.com>
Sent: Wednesday, December 11, 2019 6:13 PM
To: Dan shmidt <dshmidt@hotmail.com>; pgsql-general@lists.postgresql.org <pgsql-general@lists.postgresql.org>
Subject: Re: Logical Replication of Multiple Schema Versions
 
On 2019-12-10 08:55, Dan shmidt wrote:
> What is the correct way to perform such an operation?
> Is there a way to keep constraint #1 or the only option is to not allow
> "breaking" schema changes between versions.

It all depends on the specific schema changes you want to make.  You can
add columns on the subscriber and remove columns on the publisher
without breaking things (unless there are not-null constraints).
Renaming columns will break replication until you rename them
everywhere.  Column type changes will usually just work as long as the
data fits into both the old and the new type.

You really need to carefully plan and test each class of scenarios
separately.

--
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services