Home > mailing lists

Re: Fast logical replication jump start with PG 10 - Mailing list pgsql-general

From	Olivier Gautherot
Subject	Re: Fast logical replication jump start with PG 10
Date	May 26, 2018 19:23:02
Msg-id	CAJ7S9TUZJfcYEqQOxiLEy7d05NHfG7N0AyZYsG-4ngvb9PwLAQ@mail.gmail.com Whole thread Raw
In response to	Re: Fast logical replication jump start with PG 10 (Adrian Klaver <adrian.klaver@aklaver.com>)
Responses	Re: Fast logical replication jump start with PG 10 (Adrian Klaver <adrian.klaver@aklaver.com>) Re: Fast logical replication jump start with PG 10 (Adrian Klaver <adrian.klaver@aklaver.com>)
List	pgsql-general

Tree view

Hi Adrian!

On Sat, May 26, 2018 at 12:11 AM, Adrian Klaver <adrian.klaver@aklaver.com> wrote:

On 05/25/2018 06:35 PM, Olivier Gautherot wrote:
Hi Adrian, thanks for your reply. Here is the clarification.

1) It is indeed a pg_upgrade from 9.2 to 10.4. Depending on the test machine, it runs in between 15 and 20 minutes for just over 100GB. I can negotiate this time with our customer. The vacuum process took another 5 to 7 minutes. This this what I was referring to with the 30 minutes (point 3 in your questions)

2) After pg_upgrade, I published the tables on the database (in the sense "CREATE DATABASE") and subscribed to this publication on the second server (logical replication). The data copy processed started immediately and took around 1 hour. I then loaded the indexes, what took > another 2h20m. At that point the active-passive cluster was ready to go.

The index creation was done on the replicated machine I presume, using what command?

The sequence on the replicated machine was (pseudo-code to simplify the syntax):

- pg_dump --section=pre-data -h master_machine master_database | psql -h replication_machine replication_database

# This took seconds, "pre-data" discards the indexes

- psql -h replication_machine -c "CREATE SUBSCRIPTION mysub CONNECTION "..." PUBLICATION mypub;" replication_database

# This took about 1 hour for the initial sync

- pg_dump --section=post-data -h master_machine master_database | psql -h replication_machine replication_database

# This took 2h20m to load the various indexes

This sequence follows the recommendation of section 14.4.3 in https://www.postgresql.org/docs/10/static/populate.html . If I stick to streaming as we do today (e.g. pg_upgrade and then rsync to the replication server), I can be ready in about 1 hour (more acceptable for the customer).

The reasons for the indexes to take so long is the large number of them on big tables (for instance, 7 indexes on a partitioned table, with 3 partitions of 15GB of data in 30M rows). I will skip the reasons that got us there (please no flames, I'm aware of the issue :-) ). I don't have definite execution times for the Production environment (in a datacenter), which tends to be kind of a lottery in terms of execution times compared to testing (on a desktop in the office).

Note that the active and the passive databases are on different machines.

4) By "database" I mean the result of "CREATE DATABASE" and we have 1 per server (or "cluster" in your terminology - I tend to use this word for a group of machines). We are currently using a streaming replication

Yeah I understand, it is just that database and cluster have specific meanings in Postgres and it helps to stick to those meanings when discussing replication operations. Lowers the confusion level:)

between the 9.2 servers, so it could be a fall-back option after the upgrade (I wanted to remove part of the indexes on the master to lower the load, reason to use the logical replication... if the execution time is not too excessive).

So the time you showed was with those indexes removed or not?

I did try to synchronize the database with the indexes installed and eventually dropped the replication database after a full week-end of hectic activity (apparently, the initial sync job was not finished...). I will try it again just to make sure but I'm fairly positive that I will get to the same result.

Hope it clarifies the question
Best regards
Olivier

Olivier Gautherot
olivier@gautherot.net <mailto:olivier@gautherot.net>
Cel:+56 98 730 9361
Skype: ogautherot
www.gautherot.net <http://www.gautherot.net>
http://www.linkedin.com/in/ogautherot

On Fri, May 25, 2018 at 7:51 PM, Adrian Klaver <adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>> wrote:

On 05/25/2018 02:12 PM, Olivier Gautherot wrote:

Hi,

I just sent the question on StackOverflow but realized that this
audience may be more savvy. So sorry in advance for cross-posting...

I'm in the process of upgrading a PG from 9.2 to 10.4.
pg_upgrade worked fine on the master and was rather fast. The
problem is that the database is replicated and I'm planning to
switch from streaming to logical. The problem is that it is
rather slow (30 minutes for the master and over 3 hours for the
replication, between data transfer and indexes).

I am not clear on what you did, so can you clarify the following:

1) pg_upgrade from 9.2 master instance to 10.4 master instance, correct?

2) What replication are you talking about for the 3 hour value?

3) What is the 30 minute value referring to?

4) When you say database are you talking about a Postgres cluster or
a database in the cluster?

Is there a way to speed up the replication or should I rather
stick to streaming replication? As I have only 1 database on the
server, it would not be a show-stopper.

See 4) above, but if you are talking about a single database in a
cluster streaming replication will not work for that.

Thanks in advance
Olivier Gautherot
http://www.linkedin.com/in/ogautherot
<http://www.linkedin.com/in/ogautherot>

-- Adrian Klaver
adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>

--
Adrian Klaver
adrian.klaver@aklaver.com

Olivier

pgsql-general by date:

From: Jayadevan M
Date: 26 May 2018, 15:30:07
Subject: Re: PostgreSQL backup issue

From: Alexander Farber
Date: 26 May 2018, 22:21:12
Subject: UPDATE from CTE syntax error

Re: Fast logical replication jump start with PG 10 - Mailing list pgsql-general

Previous

Next