Thread: Fast logical replication jump start with PG 10

Fast logical replication jump start with PG 10

From: Olivier Gautherot
Hi,

I just sent the question on StackOverflow but realized that this audience may be more savvy. So sorry in advance for cross-posting...

I'm in the process of upgrading a PG from 9.2 to 10.4. pg_upgrade worked fine on the master and was rather fast. The problem is that the database is replicated and I'm planning to switch from streaming to logical. The problem is that it is rather slow (30 minutes for the master and over 3 hours for the replication, between data transfer and indexes).

Is there a way to speed up the replication or should I rather stick to streaming replication? As I have only 1 database on the server, it would not be a show-stopper.


Thanks in advance

Re: Fast logical replication jump start with PG 10

From: Adrian Klaver
On 05/25/2018 02:12 PM, Olivier Gautherot wrote:
> Hi,
> 
> I just sent the question on StackOverflow but realized that this 
> audience may be more savvy. So sorry in advance for cross-posting...
> 
> I'm in the process of upgrading a PG from 9.2 to 10.4. pg_upgrade worked 
> fine on the master and was rather fast. The problem is that the database 
> is replicated and I'm planning to switch from streaming to logical. The 
> problem is that it is rather slow (30 minutes for the master and over 3 
> hours for the replication, between data transfer and indexes).

I am not clear on what you did, so can you clarify the following:

1) pg_upgrade from 9.2 master instance to 10.4 master instance, correct?

2) What replication are you talking about for the 3 hour value?

3) What is the 30 minute value referring to?

4) When you say database are you talking about a Postgres cluster or a 
database in the cluster?

> 
> Is there a way to speed up the replication or should I rather stick to 
> streaming replication? As I have only 1 database on the server, it would 
> not be a show-stopper.

See 4) above, but if you are talking about a single database in a 
cluster, streaming replication will not work for that.

> 
> 
> Thanks in advance
> Olivier Gautherot
> http://www.linkedin.com/in/ogautherot


-- 
Adrian Klaver
adrian.klaver@aklaver.com


Re: Fast logical replication jump start with PG 10

From: Olivier Gautherot
Hi Adrian, thanks for your reply. Here is the clarification.

1) It is indeed a pg_upgrade from 9.2 to 10.4. Depending on the test machine, it runs in 15 to 20 minutes for just over 100GB. I can negotiate this time with our customer. The vacuum process took another 5 to 7 minutes. This is what I was referring to with the 30 minutes (point 3 in your questions).
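
For reference, the upgrade step is along these lines (paths are illustrative; --link is optional, it avoids copying the data files but the old cluster can no longer be used once the new one has been started):

pg_upgrade -b /usr/lib/postgresql/9.2/bin -B /usr/lib/postgresql/10/bin \
           -d /var/lib/postgresql/9.2/main -D /var/lib/postgresql/10/main \
           --link
# rebuild planner statistics afterwards (pg_upgrade also generates an
# equivalent analyze_new_cluster.sh script)
vacuumdb --all --analyze-in-stages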

2) After pg_upgrade, I published the tables on the database (in the sense "CREATE DATABASE") and subscribed to this publication on the second server (logical replication). The data copy process started immediately and took around 1 hour. I then loaded the indexes, which took another 2h20m. At that point the active-passive cluster was ready to go. Note that the active and the passive databases are on different machines.

4) By "database" I mean the result of "CREATE DATABASE" and we have 1 per server (or "cluster" in your terminology - I tend to use this word for a group of machines). We are currently using streaming replication between the 9.2 servers, so it could be a fall-back option after the upgrade (I wanted to remove part of the indexes on the master to lower the load, which is the reason for using logical replication... if the execution time is not too excessive).

Hope it clarifies the question
Best regards
Olivier


Olivier Gautherot
olivier@gautherot.net
Cel:+56 98 730 9361
Skype: ogautherot
www.gautherot.net
http://www.linkedin.com/in/ogautherot

On Fri, May 25, 2018 at 7:51 PM, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
On 05/25/2018 02:12 PM, Olivier Gautherot wrote:
Hi,

I just sent the question on StackOverflow but realized that this audience may be more savvy. So sorry in advance for cross-posting...

I'm in the process of upgrading a PG from 9.2 to 10.4. pg_upgrade worked fine on the master and was rather fast. The problem is that the database is replicated and I'm planning to switch from streaming to logical. The problem is that it is rather slow (30 minutes for the master and over 3 hours for the replication, between data transfer and indexes).

I am not clear on what you did, so can you clarify the following:

1) pg_upgrade from 9.2 master instance to 10.4 master instance, correct?

2) What replication are you talking about for the 3 hour value?

3) What is the 30 minute value referring to?

4) When you say database are you talking about a Postgres cluster or a database in the cluster?


Is there a way to speed up the replication or should I rather stick to streaming replication? As I have only 1 database on the server, it would not be a show-stopper.

See 4) above, but if you are talking about a single database in a cluster streaming replication will not work for that.




Thanks in advance
Olivier Gautherot
http://www.linkedin.com/in/ogautherot


--
Adrian Klaver
adrian.klaver@aklaver.com

Re: Fast logical replication jump start with PG 10

From: Adrian Klaver
On 05/25/2018 06:35 PM, Olivier Gautherot wrote:
> Hi Adrian, thanks for your reply. Here is the clarification.
> 
> 1) It is indeed a pg_upgrade from 9.2 to 10.4. Depending on the test 
> machine, it runs in between 15 and 20 minutes for just over 100GB. I can 
> negotiate this time with our customer. The vacuum process took another 5 
> to 7 minutes. This is what I was referring to with the 30 minutes 
> (point 3 in your questions)
> 
> 2) After pg_upgrade, I published the tables on the database (in the 
> sense "CREATE DATABASE") and subscribed to this publication on the 
> second server (logical replication). The data copy process started 
> immediately and took around 1 hour. I then loaded the indexes, which 
> took another 2h20m. At that point the active-passive cluster was ready 
> to go.

The index creation was done on the replicated machine I presume, using 
what command?

> Note that the active and the passive databases are on different machines.
> 
> 4) By "database" I mean the result of "CREATE DATABASE" and we have 1 
> per server (or "cluster" in your terminology - I tend to use this word 
> for a group of machines). We are currently using a streaming replication 

Yeah I understand, it is just that database and cluster have specific 
meanings in Postgres and it helps to stick to those meanings when 
discussing replication operations. Lowers the confusion level:)

> between the 9.2 servers, so it could be a fall-back option after the 
> upgrade (I wanted to remove part of the indexes on the master to lower 
> the load, reason to use the logical replication... if the execution time 
> is not too excessive).

So the time you showed was with those indexes removed or not?

> 
> Hope it clarifies the question
> Best regards
> Olivier
> 
> 
> Olivier Gautherot
> olivier@gautherot.net <mailto:olivier@gautherot.net>
> Cel:+56 98 730 9361
> Skype: ogautherot
> www.gautherot.net <http://www.gautherot.net>
> http://www.linkedin.com/in/ogautherot
> 
> On Fri, May 25, 2018 at 7:51 PM, Adrian Klaver 
> <adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>> wrote:
> 
>     On 05/25/2018 02:12 PM, Olivier Gautherot wrote:
> 
>         Hi,
> 
>         I just sent the question on StackOverflow but realized that this
>         audience may be more savvy. So sorry in advance for cross-posting...
> 
>         I'm in the process of upgrading a PG from 9.2 to 10.4.
>         pg_upgrade worked fine on the master and was rather fast. The
>         problem is that the database is replicated and I'm planning to
>         switch from streaming to logical. The problem is that it is
>         rather slow (30 minutes for the master and over 3 hours for the
>         replication, between data transfer and indexes).
> 
> 
>     I am not clear on what you did, so can you clarify the following:
> 
>     1) pg_upgrade from 9.2 master instance to 10.4 master instance, correct?
> 
>     2) What replication are you talking about for the 3 hour value?
> 
>     3) What is the 30 minute value referring to?
> 
>     4) When you say database are you talking about a Postgres cluster or
>     a database in the cluster?
> 
> 
>         Is there a way to speed up the replication or should I rather
>         stick to streaming replication? As I have only 1 database on the
>         server, it would not be a show-stopper.
> 
> 
>     See 4) above, but if you are talking about a single database in a
>     cluster streaming replication will not work for that.
> 
> 
> 
> 
>         Thanks in advance
>         Olivier Gautherot
>         http://www.linkedin.com/in/ogautherot
>         <http://www.linkedin.com/in/ogautherot>
> 
> 
> 
>     -- 
>     Adrian Klaver
>     adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>
> 
> 


-- 
Adrian Klaver
adrian.klaver@aklaver.com


Re: Fast logical replication jump start with PG 10

From: Olivier Gautherot
Hi Adrian!

On Sat, May 26, 2018 at 12:11 AM, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
On 05/25/2018 06:35 PM, Olivier Gautherot wrote:
Hi Adrian, thanks for your reply. Here is the clarification.

1) It is indeed a pg_upgrade from 9.2 to 10.4. Depending on the test machine, it runs in 15 to 20 minutes for just over 100GB. I can negotiate this time with our customer. The vacuum process took another 5 to 7 minutes. This is what I was referring to with the 30 minutes (point 3 in your questions).

2) After pg_upgrade, I published the tables on the database (in the sense "CREATE DATABASE") and subscribed to this publication on the second server (logical replication). The data copy process started immediately and took around 1 hour. I then loaded the indexes, which took another 2h20m. At that point the active-passive cluster was ready to go.

The index creation was done on the replicated machine I presume, using what command?

The sequence on the replicated machine was (pseudo-code to simplify the syntax):
- pg_dump --section=pre-data -h master_machine master_database | psql -h replication_machine replication_database
# This took seconds; "pre-data" omits the indexes

- psql -h replication_machine -c "CREATE SUBSCRIPTION mysub CONNECTION '...' PUBLICATION mypub;" replication_database
# This took about 1 hour for the initial sync

- pg_dump --section=post-data -h master_machine master_database | psql -h replication_machine replication_database
# This took 2h20m to load the various indexes

This sequence follows the recommendation of section 14.4.3 in https://www.postgresql.org/docs/10/static/populate.html . If I stick to streaming as we do today (i.e. pg_upgrade and then rsync to the replication server), I can be ready in about 1 hour, which is more acceptable for the customer.
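
For reference, the rsync step is along the lines of the standby-upgrade recipe in the pg_upgrade documentation (directories and host below are illustrative, and it assumes pg_upgrade was run in --link mode with both clusters stopped):

rsync --archive --delete --hard-links --size-only --no-inc-recursive \
      /var/lib/postgresql/9.2 /var/lib/postgresql/10 \
      replication_machine:/var/lib/postgresql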

The reason the indexes take so long is the large number of them on big tables (for instance, 7 indexes on a partitioned table, with 3 partitions of 15GB of data in 30M rows). I will skip the reasons that got us there (please no flames, I'm aware of the issue :-) ). I don't have definite execution times for the Production environment (in a datacenter), which tends to be a bit of a lottery compared to testing (on a desktop in the office).
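
One knob that may help the post-data step (the same populate.html page also suggests it) is raising maintenance_work_mem on the replica while the indexes are built; the 2GB value below is purely illustrative:

psql -h replication_machine -c "ALTER SYSTEM SET maintenance_work_mem = '2GB';" replication_database
psql -h replication_machine -c "SELECT pg_reload_conf();" replication_database
# ... load the post-data section, then put the setting back:
psql -h replication_machine -c "ALTER SYSTEM RESET maintenance_work_mem;" replication_database
psql -h replication_machine -c "SELECT pg_reload_conf();" replication_database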
 

Note that the active and the passive databases are on different machines.

4) By "database" I mean the result of "CREATE DATABASE" and we have 1 per server (or "cluster" in your terminology - I tend to use this word for a group of machines). We are currently using a streaming replication

Yeah I understand, it is just that database and cluster have specific meanings in Postgres and it helps to stick to those meanings when discussing replication operations. Lowers the confusion level:)

between the 9.2 servers, so it could be a fall-back option after the upgrade (I wanted to remove part of the indexes on the master to lower the load, reason to use the logical replication... if the execution time is not too excessive).

So the time you showed was with those indexes removed or not?

I did try to synchronize the database with the indexes installed and eventually dropped the replication database after a full weekend of hectic activity (apparently, the initial sync job was not finished...). I will try it again just to make sure, but I'm fairly positive that I will get to the same result.
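
Next time I will try to keep an eye on the initial table sync from the subscriber side, probably with something like this (view names as in PG 10):

-- per-table state: i = initializing, d = copying data, s = synchronized, r = ready
SELECT srrelid::regclass AS table_name, srsubstate FROM pg_subscription_rel;

-- one row per apply / table-sync worker of the subscription
SELECT * FROM pg_stat_subscription;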



Hope it clarifies the question
Best regards
Olivier


Olivier Gautherot
olivier@gautherot.net <mailto:olivier@gautherot.net>
Cel:+56 98 730 9361
Skype: ogautherot
www.gautherot.net <http://www.gautherot.net>
http://www.linkedin.com/in/ogautherot


On Fri, May 25, 2018 at 7:51 PM, Adrian Klaver <adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>> wrote:

    On 05/25/2018 02:12 PM, Olivier Gautherot wrote:

        Hi,

        I just sent the question on StackOverflow but realized that this
        audience may be more savvy. So sorry in advance for cross-posting...

        I'm in the process of upgrading a PG from 9.2 to 10.4.
        pg_upgrade worked fine on the master and was rather fast. The
        problem is that the database is replicated and I'm planning to
        switch from streaming to logical. The problem is that it is
        rather slow (30 minutes for the master and over 3 hours for the
        replication, between data transfer and indexes).


    I am not clear on what you did, so can you clarify the following:

    1) pg_upgrade from 9.2 master instance to 10.4 master instance, correct?

    2) What replication are you talking about for the 3 hour value?

    3) What is the 30 minute value referring to?

    4) When you say database are you talking about a Postgres cluster or
    a database in the cluster?

        Is there a way to speed up the replication or should I rather
        stick to streaming replication? As I have only 1 database on the
        server, it would not be a show-stopper.

    See 4) above, but if you are talking about a single database in a
    cluster streaming replication will not work for that.

        Thanks in advance
        Olivier Gautherot
        http://www.linkedin.com/in/ogautherot
        <http://www.linkedin.com/in/ogautherot>

    -- 
    Adrian Klaver
    adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>




--
Adrian Klaver
adrian.klaver@aklaver.com

Olivier

Re: Fast logical replication jump start with PG 10

From: Adrian Klaver
On 05/26/2018 06:23 AM, Olivier Gautherot wrote:
> Hi Adrian!
> 
> On Sat, May 26, 2018 at 12:11 AM, Adrian Klaver 
> <adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>> wrote:
> 
>     On 05/25/2018 06:35 PM, Olivier Gautherot wrote:
> 
>         Hi Adrian, thanks for your reply. Here is the clarification.
> 
>         1) It is indeed a pg_upgrade from 9.2 to 10.4. Depending on the
>         test machine, it runs in between 15 and 20 minutes for just over
>         100GB. I can negotiate this time with our customer. The vacuum
>         process took another 5 to 7 minutes. This is what I was
>         referring to with the 30 minutes (point 3 in your questions)
> 
>         2) After pg_upgrade, I published the tables on the database (in
>         the sense "CREATE DATABASE") and subscribed to this publication
>         on the second server (logical replication). The data copy
>         process started immediately and took around 1 hour. I then
>         loaded the indexes, which took another 2h20m. At that point the
>         active-passive cluster was ready to go.
> 
> 
>     The index creation was done on the replicated machine I presume,
>     using what command?
> 
> 
> The sequence on the replicated machine was (pseudo-code to simplify the 
> syntax):
> - pg_dump --section=pre-data -h master_machine master_database | psql -h 
> replication_machine replication_database
> # This took seconds, "pre-data" discards the indexes
> 
> - psql -h replication_machine -c "CREATE SUBSCRIPTION mysub CONNECTION 
> '...' PUBLICATION mypub;" replication_database
> # This took about 1 hour for the initial sync
> 
> - pg_dump --section=post-data -h master_machine master_database | psql 
> -h replication_machine replication_database
> # This took 2h20m to load the various indexes
> 
> This sequence follows the recommendation of section 14.4.3 in 
> https://www.postgresql.org/docs/10/static/populate.html . If I stick to 
> streaming as we do today (e.g. pg_upgrade and then rsync to the 
> replication server), I can be ready in about 1 hour (more acceptable for 
> the customer).

I am still learning what logical replication is capable of so take the 
following with that in mind.

1) I used pg_basebackup 
(www.postgresql.org/docs/10/static/app-pgbasebackup.html) to create a new 
$DATA directory for a replica instance.

2) I configured the master and the replica for logical replication. Also 
changed the copied over conf files to work for the new instance e.g. 
changed the port number.

3) I set up the PUBLICATION:

CREATE PUBLICATION everything FOR ALL TABLES;

4) I set up the SUBSCRIPTION:

CREATE SUBSCRIPTION test_sub CONNECTION 'dbname=production user=postgres 
port=5432' PUBLICATION everything WITH(copy_data=false);

*NOTE* the copy_data=false.

5) Then I started entering data in the master and it was replicated.

Caveats:

1) This was a small database.

2) The master and replica were on the same machine.

3) There was no activity on the master between the pg_basebackup and the 
CREATE PUBLICATION/CREATE SUBSCRIPTION commands.
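
For steps 1) and 2) above, a rough sketch (host, user, paths and port are 
illustrative):

pg_basebackup -h master_machine -U replication_user -X stream \
              -D /var/lib/postgresql/10/replica

# postgresql.conf on the master (publisher) needs at least:
#   wal_level = logical
# postgresql.conf in the replica's copied data directory:
#   port = 5433    # only because master and replica shared one machine here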

> 
> The reasons for the indexes to take so long is the large number of them 
> on big tables (for instance, 7 indexes on a partitioned table, with 3 
> partitions of 15GB of data in 30M rows). I will skip the reasons that 
> got us there (please no flames, I'm aware of the issue :-) ). I don't 
> have definite execution times for the Production environment (in a 
> datacenter), which tends to be kind of a lottery in terms of execution 
> times compared to testing (on a desktop in the office).
> 
> 
>         Note that the active and the passive databases are on different
>         machines.
> 
>         4) By "database" I mean the result of "CREATE DATABASE" and we
>         have 1 per server (or "cluster" in your terminology - I tend to
>         use this word for a group of machines). We are currently using a
>         streaming replication
> 
> 
>     Yeah I understand, it is just that database and cluster have
>     specific meanings in Postgres and it helps to stick to those
>     meanings when discussing replication operations. Lowers the
>     confusion level:)
> 
>         between the 9.2 servers, so it could be a fall-back option after
>         the upgrade (I wanted to remove part of the indexes on the
>         master to lower the load, reason to use the logical
>         replication... if the execution time is not too excessive).
> 
> 
>     So the time you showed was with those indexes removed or not?
> 
> 
> I did try to synchronize the database with the indexes installed and 
> eventually dropped the replication database after a full week-end of 
> hectic activity (apparently, the initial sync job was not finished...). 
> I will try it again just to make sure but I'm fairly positive that I 
> will get to the same result.
> 
> 
> 
>         Hope it clarifies the question
>         Best regards
>         Olivier
> 
> 
>         Olivier Gautherot
>         olivier@gautherot.net <mailto:olivier@gautherot.net>
>         <mailto:olivier@gautherot.net <mailto:olivier@gautherot.net>>
>         Cel:+56 98 730 9361
>         Skype: ogautherot
>         www.gautherot.net <http://www.gautherot.net>
>         <http://www.gautherot.net>
>         http://www.linkedin.com/in/ogautherot
>         <http://www.linkedin.com/in/ogautherot>
> 
> 
>         On Fri, May 25, 2018 at 7:51 PM, Adrian Klaver
>         <adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>
>         <mailto:adrian.klaver@aklaver.com
>         <mailto:adrian.klaver@aklaver.com>>> wrote:
> 
>              On 05/25/2018 02:12 PM, Olivier Gautherot wrote:
> 
>                  Hi,
> 
>                  I just sent the question on StackOverflow but realized
>         that this
>                  audience may be more savvy. So sorry in advance for
>         cross-posting...
> 
>                  I'm in the process of upgrading a PG from 9.2 to 10.4.
>                  pg_upgrade worked fine on the master and was rather
>         fast. The
>                  problem is that the database is replicated and I'm
>         planning to
>                  switch from streaming to logical. The problem is that it is
>                  rather slow (30 minutes for the master and over 3 hours
>         for the
>                  replication, between data transfer and indexes).
> 
> 
>              I am not clear on what you did, so can you clarify the
>         following:
> 
>              1) pg_upgrade from 9.2 master instance to 10.4 master
>         instance, correct?
> 
>              2) What replication are you talking about for the 3 hour value?
> 
>              3) What is the 30 minute value referring to?
> 
>              4) When you say database are you talking about a Postgres
>         cluster or
>              a database in the cluster?
> 
>                  Is there a way to speed up the replication or should I
>         rather
>                  stick to streaming replication? As I have only 1
>         database on the
>                  server, it would not be a show-stopper.
> 
>              See 4) above, but if you are talking about a single
>         database in a
>              cluster streaming replication will not work for that.
> 
>                  Thanks in advance
>                  Olivier Gautherot
>         http://www.linkedin.com/in/ogautherot
>         <http://www.linkedin.com/in/ogautherot>
>                  <http://www.linkedin.com/in/ogautherot
>         <http://www.linkedin.com/in/ogautherot>>
> 
>              --     Adrian Klaver
>         adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>
>         <mailto:adrian.klaver@aklaver.com
>         <mailto:adrian.klaver@aklaver.com>>
> 
> 
> 
> 
>     -- 
>     Adrian Klaver
>     adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>
> 
> 
> Olivier


-- 
Adrian Klaver
adrian.klaver@aklaver.com


Re: Fast logical replication jump start with PG 10

From: Adrian Klaver
On 05/26/2018 06:23 AM, Olivier Gautherot wrote:
> Hi Adrian!
> 
> On Sat, May 26, 2018 at 12:11 AM, Adrian Klaver 
> <adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>> wrote:
> 
>     On 05/25/2018 06:35 PM, Olivier Gautherot wrote:
> 
>         Hi Adrian, thanks for your reply. Here is the clarification.
> 
>         1) It is indeed a pg_upgrade from 9.2 to 10.4. Depending on the
>         test machine, it runs in between 15 and 20 minutes for just over
>         100GB. I can negotiate this time with our customer. The vacuum
>         process took another 5 to 7 minutes. This is what I was
>         referring to with the 30 minutes (point 3 in your questions)
> 
>         2) After pg_upgrade, I published the tables on the database (in
>         the sense "CREATE DATABASE") and subscribed to this publication
>         on the second server (logical replication). The data copy
>         process started immediately and took around 1 hour. I then
>         loaded the indexes, which took another 2h20m. At that point the
>         active-passive cluster was ready to go.
> 
> 
>     The index creation was done on the replicated machine I presume,
>     using what command?
> 
> 
> The sequence on the replicated machine was (pseudo-code to simplify the 
> syntax):
> - pg_dump --section=pre-data -h master_machine master_database | psql -h 
> replication_machine replication_database
> # This took seconds, "pre-data" discards the indexes
> 
> - psql -h replication_machine -c "CREATE SUBSCRIPTION mysub CONNECTION 
> '...' PUBLICATION mypub;" replication_database
> # This took about 1 hour for the initial sync
> 
> - pg_dump --section=post-data -h master_machine master_database | psql 
> -h replication_machine replication_database
> # This took 2h20m to load the various indexes
> 
> This sequence follows the recommendation of section 14.4.3 in 
> https://www.postgresql.org/docs/10/static/populate.html . If I stick to 
> streaming as we do today (e.g. pg_upgrade and then rsync to the 
> replication server), I can be ready in about 1 hour (more acceptable for 
> the customer).
> 

Just realized that by setting up the streaming as above you are already 
doing basically the same thing as I suggested in my previous post. 
Streaming and logical replication can exist at the same time:

https://www.postgresql.org/docs/10/static/logical-replication.html

"Logical replication is a method of replicating data objects and their 
changes, based upon their replication identity (usually a primary key). 
We use the term logical in contrast to physical replication, which uses 
exact block addresses and byte-by-byte replication. PostgreSQL supports 
both mechanisms concurrently, see Chapter 26. Logical replication allows 
fine-grained control over both data replication and security."

So you could set up the logical replication after the streaming is done 
using the copy_data=false clause and be done in a relatively short 
period of time. At that point you could decide whether to keep the 
streaming running or not.
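
If you do run both at the same time you can see them side by side on the 
master, something like:

-- physical (streaming) and logical walsenders both show up here
SELECT application_name, state, sync_state FROM pg_stat_replication;

-- the subscription's replication slot appears with slot_type = 'logical'
SELECT slot_name, slot_type, active FROM pg_replication_slots;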


-- 
Adrian Klaver
adrian.klaver@aklaver.com


Re: Fast logical replication jump start with PG 10

From: Martín Marqués
Hi,

El 26/05/18 a las 14:27, Adrian Klaver escribió:
> 
> I am still learning what logical replication is capable of so take the
> following with that in mind.
> 
> 1) I used
> pg_basebackup(www.postgresql.org/docs/10/static/app-pgbasebackup.html)
> to create a new $DATA directory for a replica instance.
> 
> 2) I configured the master and the replica for logical replication. Also
> changed the copied over conf files to work for the new instance e.g.
> changed the port number.
> 
> 3) I set up the PUBLICATION:
> 
> CREATE PUBLICATION everything FOR ALL TABLES;
> 
> 4) I set up the SUBSCRIPTION:
> 
> CREATE SUBSCRIPTION test_sub CONNECTION 'dbname=production user=postgres
> port=5432' PUBLICATION everything WITH(copy_data=false);
> 
> *NOTE* the copy_data=false.
> 
> 5) Then I started entering data in the master and it was replicated.
> 
> Caveats:
> 
> 1) This was a small database.
> 
> 2) The master and replica were on the same machine.
> 
> 3) There was no activity on the master between the pg_basebackup and the
> CREATE PUBLICATION/CREATE SUBSCRIPTION commands.

This last caveat is a stopper. If the active node is *active* (receiving
write statements), you'll lose all those changes.

I would instead suggest using pglogical and the
pglogical_create_subscriber tool to create the subscriber from a basebackup.
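
The invocation is roughly along these lines (option names from memory, so 
double-check them against the pglogical documentation; paths and DSNs are 
illustrative):

pglogical_create_subscriber -D /var/lib/postgresql/10/replica \
  --subscriber-name=replica1 \
  --subscriber-dsn="host=replication_machine dbname=mydb" \
  --provider-dsn="host=master_machine dbname=mydb"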

Kind Regards,

-- 
Martín Marqués                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


Re: Fast logical replication jump start with PG 10

From: Jeff Janes
On Fri, May 25, 2018 at 5:12 PM, Olivier Gautherot <olivier@gautherot.net> wrote:

Is there a way to speed up the replication or should I rather stick to streaming replication? As I have only 1 database on the server, it would not be a show-stopper.

You have a method that works, and a client that is already twitchy about downtime and only upgrades their database once every 5 years.

I would not even consider the idea of combining a major-version upgrade with a complete change-over in replication technology in a single step in this situation.

If you will get some kind of benefit from switching to logical replication, you could first upgrade production and get a new physical replica going. Then, once that is back in production, you can create a new logical replica and get it all synced over at your leisure, get it all tested, and then cut the clients over from the physical replica to the logical replica.

 Cheers,

Jeff

Re: Fast logical replication jump start with PG 10

From: Olivier Gautherot

On Sat, May 26, 2018 at 1:27 PM, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
On 05/26/2018 06:23 AM, Olivier Gautherot wrote:
On Sat, May 26, 2018 at 12:11 AM, Adrian Klaver <adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>> wrote:
    On 05/25/2018 06:35 PM, Olivier Gautherot wrote:
[snip]

The sequence on the replicated machine was (pseudo-code to simplify the syntax):
- pg_dump --section=pre-data -h master_machine master_database | psql -h replication_machine replication_database
# This took seconds, "pre-data" discards the indexes

- psql -h replication_machine -c "CREATE SUBSCRIPTION mysub CONNECTION '...' PUBLICATION mypub;" replication_database
# This took about 1 hour for the initial sync

- pg_dump --section=post-data -h master_machine master_database | psql -h replication_machine replication_database
# This took 2h20m to load the various indexes

This sequence follows the recommendation of section 14.4.3 in https://www.postgresql.org/docs/10/static/populate.html . If I stick to streaming as we do today (e.g. pg_upgrade and then rsync to the replication server), I can be ready in about 1 hour (more acceptable for the customer).

I am still learning what logical replication is capable of so take the following with that in mind.

1) I used pg_basebackup(www.postgresql.org/docs/10/static/app-pgbasebackup.html) to create a new $DATA directory for a replica instance.

Good tip, I'll give it a try.

2) I configured the master and the replica for logical replication. Also changed the copied over conf files to work for the new instance e.g. changed the port number.

3) I set up the PUBLICATION:

CREATE PUBLICATION everything FOR ALL TABLES;

This was what I was planning to do, so great.

4) I set up the SUBSCRIPTION:

CREATE SUBSCRIPTION test_sub CONNECTION 'dbname=production user=postgres port=5432' PUBLICATION everything WITH(copy_data=false);

*NOTE* the copy_data=false.

This was the bit I missed! Excellent point!

5) Then I started entering data in the master and it was replicated.

Caveats:

1) This was a small database.

I don't think the size is relevant in this specific case.

2) The master and replica were on the same machine.

Same comment: different ports mean basically different instances.

3) There was no activity on the master between the pg_basebackup and the CREATE PUBLICATION/CREATE SUBSCRIPTION commands.

This is also my plan for Production, so it's fine.

Thanks!!!

[snip]

--
Adrian Klaver
adrian.klaver@aklaver.com