Re: Fast logical replication jump start with PG 10 - Mailing list pgsql-general

From Adrian Klaver
Subject Re: Fast logical replication jump start with PG 10
Date
Msg-id 67e76903-b1fb-5a43-4b83-4f721ae2e1b0@aklaver.com
Whole thread Raw
In response to Re: Fast logical replication jump start with PG 10  (Olivier Gautherot <olivier@gautherot.net>)
Responses Re: Fast logical replication jump start with PG 10  (Martín Marqués <martin.marques@2ndquadrant.com>)
Re: Fast logical replication jump start with PG 10  (Olivier Gautherot <olivier@gautherot.net>)
List pgsql-general
On 05/26/2018 06:23 AM, Olivier Gautherot wrote:
> Hi Adrian!
> 
> On Sat, May 26, 2018 at 12:11 AM, Adrian Klaver 
> <adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>> wrote:
> 
>     On 05/25/2018 06:35 PM, Olivier Gautherot wrote:
> 
>         Hi Adrian, thanks for your reply. Here is the clarification.
> 
>         1) It is indeed a pg_upgrade from 9.2 to 10.4. Depending on the
>         test machine, it runs in between 15 and 20 minutes for just over
>         100GB. I can negotiate this time with our customer. The vacuum
>         process took another 5 to 7 minutes. This this what I was
>         referring to with the 30 minutes (point 3 in your questions)
> 
>         2) After pg_upgrade, I published the tables on the database (in
>         the sense "CREATE DATABASE") and subscribed to this publication
>         on the second server (logical replication). The data copy
>         processed started immediately and took around 1 hour. I then
>         loaded the indexes, what took > another 2h20m. At that point the
>         active-passive cluster was ready to go.
> 
> 
>     The index creation was done on the replicated machine I presume,
>     using what command?
> 
> 
> The sequence on the replicated machine was (pseudo-code to simplify the 
> syntax):
> - pg_dump --section=pre-data -h master_machine master_database | psql -h 
> replication_machine replication_database
> # This took seconds, "pre-data" discards the indexes
> 
> - psql -h replication_machine -c "CREATE SUBSCRIPTION mysub CONNECTION 
> "..." PUBLICATION mypub;" replication_database
> # This took about 1 hour for the initial sync
> 
> - pg_dump --section=post-data -h master_machine master_database | psql 
> -h replication_machine replication_database
> # This took 2h20m to load the various indexes
> 
> This sequence follows the recommendation of section 14.4.3 in 
> https://www.postgresql.org/docs/10/static/populate.html . If I stick to 
> streaming as we do today (e.g. pg_upgrade and then rsync to the 
> replication server), I can be ready in about 1 hour (more acceptable for 
> the customer).

I am still learning what logical replication is capable of so take the 
following with that in mind.

1) I used 
pg_basebackup(www.postgresql.org/docs/10/static/app-pgbasebackup.html) 
to create a new $DATA directory for a replica instance.

2) I configured the master and the replica for logical replication. Also 
changed the copied over conf files to work for the new instance e.g. 
changed the port number.

3) I set up the PUBLICATION:

CREATE PUBLICATION everything FOR ALL TABLES;

4) I set up the SUBSCRIPTION:

CREATE SUBSCRIPTION test_sub CONNECTION 'dbname=production user=postgres 
port=5432' PUBLICATION everything WITH(copy_data=false);

*NOTE* the copy_data=false.

5) Then I started entering data in the master and it was replicated.

Caveats:

1) This was a small database.

2) The master and replica where on the same machine.

3) There was no activity on the master between the pg_basebackup and the 
CREATE PUBLICATION/CREATE SUBSCRIPTION commands.

> 
> The reasons for the indexes to take so long is the large number of them 
> on big tables (for instance, 7 indexes on a partitioned table, with 3 
> partitions of 15GB of data in 30M rows). I will skip the reasons that 
> got us there (please no flames, I'm aware of the issue :-) ). I don't 
> have definite execution times for the Production environment (in a 
> datacenter), which tends to be kind of a lottery in terms of execution 
> times compared to testing (on a desktop in the office).
> 
> 
>         Note that the active and the passive databases are on different
>         machines.
> 
>         4) By "database" I mean the result of "CREATE DATABASE" and we
>         have 1 per server (or "cluster" in your terminology - I tend to
>         use this word for a group of machines). We are currently using a
>         streaming replication
> 
> 
>     Yeah I understand, it is just that database and cluster have
>     specific meanings in Postgres and it helps to stick to those
>     meanings when discussing replication operations. Lowers the
>     confusion level:)
> 
>         between the 9.2 servers, so it could be a fall-back option after
>         the upgrade (I wanted to remove part of the indexes on the
>         master to lower the load, reason to use the logical
>         replication... if the execution time is not too excessive).
> 
> 
>     So the time you showed was with those indexes removed or not?
> 
> 
> I did try to synchronize the database with the indexes installed and 
> eventually dropped the replication database after a full week-end of 
> hectic activity (apparently, the initial sync job was not finished...). 
> I will try it again just to make sure but I'm fairly positive that I 
> will get to the same result.
> 
> 
> 
>         Hope it clarifies the question
>         Best regards
>         Olivier
> 
> 
>         Olivier Gautherot
>         olivier@gautherot.net <mailto:olivier@gautherot.net>
>         <mailto:olivier@gautherot.net <mailto:olivier@gautherot.net>>
>         Cel:+56 98 730 9361
>         Skype: ogautherot
>         www.gautherot.net <http://www.gautherot.net>
>         <http://www.gautherot.net>
>         http://www.linkedin.com/in/ogautherot
>         <http://www.linkedin.com/in/ogautherot>
> 
> 
>         On Fri, May 25, 2018 at 7:51 PM, Adrian Klaver
>         <adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>
>         <mailto:adrian.klaver@aklaver.com
>         <mailto:adrian.klaver@aklaver.com>>> wrote:
> 
>              On 05/25/2018 02:12 PM, Olivier Gautherot wrote:
> 
>                  Hi,
> 
>                  I just sent the question on StackOverflow but realized
>         that this
>                  audience may be more savvy. So sorry in advance for
>         cross-posting...
> 
>                  I'm in the process of upgrading a PG from 9.2 to 10.4.
>                  pg_upgrade worked fine on the master and was rather
>         fast. The
>                  problem is that the database is replicated and I'm
>         planning to
>                  switch from streaming to logical. The problem is that it is
>                  rather slow (30 minutes for the master and over 3 hours
>         for the
>                  replication, between data transfer and indexes).
> 
> 
>              I am not clear on what you did, so can you clarify the
>         following:
> 
>              1) pg_upgrade from 9.2 master instance to 10.4 master
>         instance, correct?
> 
>              2) What replication are you talking about for the 3 hour value?
> 
>              3) What is the 30 minute value referring to?
> 
>              4) When you say database are you talking about a Postgres
>         cluster or
>              a database in the cluster?
> 
>                  Is there a way to speed up the replication or should I
>         rather
>                  stick to streaming replication? As I have only 1
>         database on the
>                  server, it would not be a show-stopper.
> 
>              See 4) above, but if you are talking about a single
>         database in a
>              cluster streaming replication will not work for that.
> 
>                  Thanks in advance
>                  Olivier Gautherot
>         http://www.linkedin.com/in/ogautherot
>         <http://www.linkedin.com/in/ogautherot>
>                  <http://www.linkedin.com/in/ogautherot
>         <http://www.linkedin.com/in/ogautherot>>
> 
>              --     Adrian Klaver
>         adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>
>         <mailto:adrian.klaver@aklaver.com
>         <mailto:adrian.klaver@aklaver.com>>
> 
> 
> 
> 
>     -- 
>     Adrian Klaver
>     adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>
> 
> 
> Olivier


-- 
Adrian Klaver
adrian.klaver@aklaver.com


pgsql-general by date:

Previous
From: Adrian Klaver
Date:
Subject: Re: UPDATE from CTE syntax error
Next
From: Adrian Klaver
Date:
Subject: Re: Fast logical replication jump start with PG 10