Re: speed up a logical replica setup - Mailing list pgsql-hackers

From Andreas Karlsson
Subject Re: speed up a logical replica setup
Date
Msg-id 25dbf79f-bbc3-a563-7971-3301d635e922@proxel.se
Whole thread Raw
In response to speed up a logical replica setup  ("Euler Taveira" <euler@eulerto.com>)
List pgsql-hackers
On 2/21/22 13:09, Euler Taveira wrote:
> DESIGN
> 
> The conversion requires 8 steps.
> 
> 1. Check if the target data directory has the same system identifier 
> than the
> source data directory.
> 2. Stop the target server if it is running as a standby server. (Modify
> recovery parameters requires a restart.)
> 3. Create one replication slot per specified database on the source 
> server. One
> additional replication slot is created at the end to get the consistent LSN
> (This consistent LSN will be used as (a) a stopping point for the recovery
> process and (b) a starting point for the subscriptions).
> 4. Write recovery parameters into the target data directory and start the
> target server (Wait until the target server is promoted).
> 5. Create one publication (FOR ALL TABLES) per specified database on the 
> source
> server.
> 6. Create one subscription per specified database on the target server (Use
> replication slot and publication created in a previous step. Don't 
> enable the
> subscriptions yet).
> 7. Sets the replication progress to the consistent LSN that was got in a
> previous step.
> 8. Enable the subscription for each specified database on the target server.

Very interesting!

I actually just a couple of weeks ago proposed a similar design for 
upgrading a database of a customer of mine. We have not tried it yet so 
it is not decided if we should go ahead with it.

In our case the goal is a bit different so my idea is that we will use 
pg_dump/pg_restore (or pg_upgrade and then some manual cleanup if 
pg_dump/pg_restore is too slow) on the target server. The goal of this 
design is to get a nice clean logical replica at the new version of 
PostgreSQL with indexes with the correct collations, all old invalid 
constraints validated, minimal bloat, etc. And all of this without 
creating bloat or putting too much load on the old master during the 
process. We have plenty of disk space and plenty of time so those are 
not limitations in our case. I can go into more detail if there is interest.

It is nice to see that our approach is not entirely unique. :) And I 
will take a look at this patch when I find the time.

Andreas



pgsql-hackers by date:

Previous
From: "Euler Taveira"
Date:
Subject: Re: logical replication restrictions
Next
From: Peter Smith
Date:
Subject: Re: Optionally automatically disable logical replication subscriptions on error