Home > mailing lists

Re: Fast logical replication jump start with PG 10 - Mailing list pgsql-general

From	Adrian Klaver
Subject	Re: Fast logical replication jump start with PG 10
Date	May 26, 2018 21:00:33
Msg-id	a01d4eeb-21c5-d235-a18c-fe476d2db6a9@aklaver.com Whole thread Raw
In response to	Re: Fast logical replication jump start with PG 10 (Olivier Gautherot <olivier@gautherot.net>)
List	pgsql-general

Tree view

On 05/26/2018 06:23 AM, Olivier Gautherot wrote:
> Hi Adrian!
> 
> On Sat, May 26, 2018 at 12:11 AM, Adrian Klaver 
> <adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>> wrote:
> 
>     On 05/25/2018 06:35 PM, Olivier Gautherot wrote:
> 
>         Hi Adrian, thanks for your reply. Here is the clarification.
> 
>         1) It is indeed a pg_upgrade from 9.2 to 10.4. Depending on the
>         test machine, it runs in between 15 and 20 minutes for just over
>         100GB. I can negotiate this time with our customer. The vacuum
>         process took another 5 to 7 minutes. This this what I was
>         referring to with the 30 minutes (point 3 in your questions)
> 
>         2) After pg_upgrade, I published the tables on the database (in
>         the sense "CREATE DATABASE") and subscribed to this publication
>         on the second server (logical replication). The data copy
>         processed started immediately and took around 1 hour. I then
>         loaded the indexes, what took > another 2h20m. At that point the
>         active-passive cluster was ready to go.
> 
> 
>     The index creation was done on the replicated machine I presume,
>     using what command?
> 
> 
> The sequence on the replicated machine was (pseudo-code to simplify the 
> syntax):
> - pg_dump --section=pre-data -h master_machine master_database | psql -h 
> replication_machine replication_database
> # This took seconds, "pre-data" discards the indexes
> 
> - psql -h replication_machine -c "CREATE SUBSCRIPTION mysub CONNECTION 
> "..." PUBLICATION mypub;" replication_database
> # This took about 1 hour for the initial sync
> 
> - pg_dump --section=post-data -h master_machine master_database | psql 
> -h replication_machine replication_database
> # This took 2h20m to load the various indexes
> 
> This sequence follows the recommendation of section 14.4.3 in 
> https://www.postgresql.org/docs/10/static/populate.html . If I stick to 
> streaming as we do today (e.g. pg_upgrade and then rsync to the 
> replication server), I can be ready in about 1 hour (more acceptable for 
> the customer).
> 

Just realized that by setting up the streaming as above you are already 
doing basically the same thing as I suggested in my previous post. 
Streaming and logical replication can exist at the same time:

https://www.postgresql.org/docs/10/static/logical-replication.html

"Logical replication is a method of replicating data objects and their 
changes, based upon their replication identity (usually a primary key). 
We use the term logical in contrast to physical replication, which uses 
exact block addresses and byte-by-byte replication. PostgreSQL supports 
both mechanisms concurrently, see Chapter 26. Logical replication allows 
fine-grained control over both data replication and security."

So you could set up the logical replication after the streaming is done 
using the copy_data=false clause and been done in a relatively short 
period of time. At that point you could decide whether to keep the 
streaming running or not.


-- 
Adrian Klaver
adrian.klaver@aklaver.com

pgsql-general by date:

From: Adrian Klaver
Date: 26 May 2018, 20:27:51
Subject: Re: Fast logical replication jump start with PG 10

From: Martín Marqués
Date: 26 May 2018, 21:13:45
Subject: Re: Fast logical replication jump start with PG 10

Re: Fast logical replication jump start with PG 10 - Mailing list pgsql-general

Previous

Next