Thread: Advice Needed: Simultaneous Upgrade of Two-Node PostgreSQL 11 Cluster

Dear PostgreSQL Support Team,
I am planning to upgrade both nodes of our PostgreSQL 11 cluster simultaneously and would like your review of our proposed approach:
Current Setup:

Two PostgreSQL 11 nodes managed by Patroni
Planning for full outage during upgrade but I need to upgrade two node not replicate data from primary after upgrade because I have two site and if I use replication for 1.8 TB that will take 1 month

Proposed Upgrade Approach:

Pre-upgrade Steps:

Take full backup using pg_dumpall
Stop all application connections
Stop Patroni on both nodes


Simultaneous Upgrade Process:

Install new PostgreSQL version packages on both nodes
Run pg_upgrade with --check flag on both nodes
Execute pg_upgrade with --link option on both nodes simultaneously
Update Patroni configuration for new version on both nodes


Post-upgrade Steps:

Start Patroni on intended primary node first
Start Patroni on secondary node
Verify cluster health



Please review this approach and advise:

Are there any critical risks in upgrading both nodes simultaneously?
What additional precautions should we take?
What specific problems might we encounter with Patroni after such an upgrade?

Your guidance would be greatly appreciated.
Best regards,
On Sun, 2025-01-05 at 19:57 +0200, kasem adel wrote:
> Proposed Upgrade Approach:
>
> Simultaneous Upgrade Process:
>
> Install new PostgreSQL version packages on both nodes
> Run pg_upgrade with --check flag on both nodes
> Execute pg_upgrade with --link option on both nodes simultaneously
> Update Patroni configuration for new version on both nodes

No, that won't work.

You have to upgrade the primary, then rebuild the standby with
"patronictl reinit".  There is no safe way to upgrade the standby.

Yours,
Laurenz Albe



Dear Laurenz Albe

Our application for reporting  is  dependent on the replica node for handling read-only operations. If we proceed with the proposed changes, it would result in a one-month application downtime, which would significantly impact our operations and is not a feasible option for us.

We would greatly appreciate your insights and support in identifying alternative solutions to minimize down time (one month)or another solution alternative for rebuild replica from scratch. Your expertise and advice would be invaluable in helping us navigate this challenge effectively.

Thank you for your time and assistance.


في الاثنين، ٦ يناير ٢٠٢٥ ٧:١٣ م Laurenz Albe <laurenz.albe@cybertec.at> كتب:
On Sun, 2025-01-05 at 19:57 +0200, kasem adel wrote:
> Proposed Upgrade Approach:
>
> Simultaneous Upgrade Process:
>
> Install new PostgreSQL version packages on both nodes
> Run pg_upgrade with --check flag on both nodes
> Execute pg_upgrade with --link option on both nodes simultaneously
> Update Patroni configuration for new version on both nodes

No, that won't work.

You have to upgrade the primary, then rebuild the standby with
"patronictl reinit".  There is no safe way to upgrade the standby.

Yours,
Laurenz Albe
On Tue, 2025-01-07 at 01:59 +0200, kasem adel wrote:
> Dear Laurenz Albe
> في الاثنين، ٦ يناير ٢٠٢٥ ٧:١٣ م Laurenz Albe <laurenz.albe@cybertec.at> كتب:
> > On Sun, 2025-01-05 at 19:57 +0200, kasem adel wrote:
> > > Proposed Upgrade Approach:
> > >
> > > Simultaneous Upgrade Process:
> > >
> > > Install new PostgreSQL version packages on both nodes
> > > Run pg_upgrade with --check flag on both nodes
> > > Execute pg_upgrade with --link option on both nodes simultaneously
> > > Update Patroni configuration for new version on both nodes
> >
> > No, that won't work.
> >
> > You have to upgrade the primary, then rebuild the standby with
> > "patronictl reinit".  There is no safe way to upgrade the standby.
>
> Our application for reporting  is  dependent on the replica node for handling
> read-only operations. If we proceed with the proposed changes, it would result
> in a one-month application downtime, which would significantly impact our
> operations and is not a feasible option for us.
>
> We would greatly appreciate your insights and support in identifying
> alternative solutions [...]

One month to rebuild a standby?
Before I discuss options, let me ask how you plan to survive the next time you
have to restore your database from a backup.

There are two paths you can consider:
1. Logical replication into a new Patroni cluster.
   That is not simple.

2. Following the steps in
   https://www.postgresql.org/docs/current/pgupgrade.html#PGUPGRADE-STEP-REPLICAS
   to upgrade the standby server.
   That is not for the faint of heart either.

In both cases, you'd have to test the procedure well, and it might be a good idea
to get professional assistance.

Yours,
Laurenz Albe



> On Jan 6, 2025, at 11:57 PM, Laurenz Albe <laurenz.albe@cybertec.at> wrote:
>
> One month to rebuild a standby?

In addition to other alternatives presented:

- better hardware, one month to restore seems insane
- separate OLTP from data warehouse for reporting
- if the issue is network bandwidth, then local backup, bzip2 for compression

If you do go to logical replication, be sure to look at new features in 17 for managing slots and syncing up.

How large is this database?


Dear Scott Ribe

Kindly be informed that database size is 1.8 TB and we have an low network bandwidth between two node 10 mbps and to transfer the data from scratch that will take from 21 day to 30 day  for this we need solution in upgrade to prevent load data from scratch .

Thanks


في الثلاثاء، ٧ يناير ٢٠٢٥ ٣:٣٨ م Scott Ribe <scott_ribe@elevated-dev.com> كتب:
> On Jan 6, 2025, at 11:57 PM, Laurenz Albe <laurenz.albe@cybertec.at> wrote:
>
> One month to rebuild a standby?

In addition to other alternatives presented:

- better hardware, one month to restore seems insane
- separate OLTP from data warehouse for reporting
- if the issue is network bandwidth, then local backup, bzip2 for compression

If you do go to logical replication, be sure to look at new features in 17 for managing slots and syncing up.

How large is this database?
> On Jan 7, 2025, at 6:50 AM, kasem adel <kasemadel8@gmail.com> wrote:
>
> Kindly be informed that database size is 1.8 TB and we have an low network bandwidth between two node 10 mbps and to
transferthe data from scratch that will take from 21 day to 30 day  for this we need solution in upgrade to prevent
loaddata from scratch . 

So, best case, bzip before transfer might get you down to 3 days, maybe. That's one alternative. Others:

- Obviously, if you could get higher network bandwidth temporarily, that would help tremendously.
- Express ship DVD/NVMe/SSD with the backup.
- I assume that if 10Mbps keeps up with changes, most of this data is inactive, are there inactive tables/partitions?
  - Move inactive data into a separate PG cluster, use foreign data wrappers to access it from active so that your
usersdon't see the split. 
  - Upgrade cluster with inactive data locally.
  - Whatever the process is to move data from active to inactive, you'll have to suspend it for the duration.
  -  Whether it's a month or 3 days, transfer the base backup of inactive over the slow link (keeping WAL, all the
thingsyou have to do to make sure that a replica started against this backup will be in sync and stream when brought
up)
  - Proceed with upgrade of cluster with active data (presuming that it is *much* smaller)

Regardless, managing TB+ databases over a 10Mbps link is a challenge.


> Are you mean do the following step:
>
> 1- upgrade primary node
> 2- take base backup from new  version in primary node and keep wal file for 3 days
> 3- move backup by external hard disk to the replica node
> 4- restore base backup to replica
> 5- start replication to replicate delta from primary node.
>
> Please confirm if this the best approach and it will work without risk.

Yes, that's what I meant. It will work, nothing is completely without risk ;-) Main thing is to make absolutely sure
youdon't lose WAL during that time period. If you could set up WAL archiving to push to the remote site, that would be
greatas then you could configure the remote to pull accumulated WAL locally instead of across the slow network link. 

But, just thought of this:

- pg_upgrade both sides
- with neither side running, rsync the data directories (bonus points for being paranoid and using -c)
- fix up the standby flag on the standby
- fix up the postgresql.conf -- for instance, standby config has been moved out of a separate file into the main one,
sothat you can have common config both sides now, with the only necessary difference being the standby flag 
- bring them up

You could even try to figure out where the catalog tables are stored and only rsync those, since pg_)upgrade doesn't
changethe format of your data files. But personally, I wouldn't. I wouldn't want to introduce the possibility of error
onmy part, and the rsync checksum is a nice check that nothing has gotten corrupted over time from network or disk
glitch.(Excluding disk glitch on the primary...) 

You may be wondering why not just pg_upgrade both sides? Well, pg_upgrade ourput should be deterministic, right? So if
youmake sure that clients are disconnected and standby is completely in sync before starting, why not? Maybe you could.
Butbecause it's not designed nor documented for that use, so although you likely could make it work, that's a dangerous
path.The last thing you want to do is take your server down for this scheduled operation, and wind up at the end with
anunusable standby. 




AW: Advice Needed: Simultaneous Upgrade of Two-Node PostgreSQL 11 Cluster

From
"Dischner, Anton"
Date:
Hi,

if you are using the same slow internet connection for WAL and data-transfer you might considering to use rsync which
hasa full set of throttling, resuming/ibcremential and as mentioned data-compression options so that you do not
saturateyou connection, 

BTW nice challenge,

Best,

Anton

-----Ursprüngliche Nachricht-----
Von: Scott Ribe <scott_ribe@elevated-dev.com>
Gesendet: Dienstag, 7. Januar 2025 15:42
An: kasem adel <kasemadel8@gmail.com>
Cc: pgsql-admin <pgsql-admin@postgresql.org>
Betreff: Re: Advice Needed: Simultaneous Upgrade of Two-Node PostgreSQL 11 Cluster

> Are you mean do the following step:
>
> 1- upgrade primary node
> 2- take base backup from new  version in primary node and keep wal file for 3 days
> 3- move backup by external hard disk to the replica node
> 4- restore base backup to replica
> 5- start replication to replicate delta from primary node.
>
> Please confirm if this the best approach and it will work without risk.

Yes, that's what I meant. It will work, nothing is completely without risk ;-) Main thing is to make absolutely sure
youdon't lose WAL during that time period. If you could set up WAL archiving to push to the remote site, that would be
greatas then you could configure the remote to pull accumulated WAL locally instead of across the slow network link. 

But, just thought of this:

- pg_upgrade both sides
- with neither side running, rsync the data directories (bonus points for being paranoid and using -c)
- fix up the standby flag on the standby
- fix up the postgresql.conf -- for instance, standby config has been moved out of a separate file into the main one,
sothat you can have common config both sides now, with the only necessary difference being the standby flag 
- bring them up

You could even try to figure out where the catalog tables are stored and only rsync those, since pg_)upgrade doesn't
changethe format of your data files. But personally, I wouldn't. I wouldn't want to introduce the possibility of error
onmy part, and the rsync checksum is a nice check that nothing has gotten corrupted over time from network or disk
glitch.(Excluding disk glitch on the primary...) 

You may be wondering why not just pg_upgrade both sides? Well, pg_upgrade ourput should be deterministic, right? So if
youmake sure that clients are disconnected and standby is completely in sync before starting, why not? Maybe you could.
Butbecause it's not designed nor documented for that use, so although you likely could make it work, that's a dangerous
path.The last thing you want to do is take your server down for this scheduled operation, and wind up at the end with
anunusable standby. 






Dear Anton

I need the type of backup that I can take from primary to replica regular backup or base backup or copy data director. 

Thanks

في الثلاثاء، ٧ يناير ٢٠٢٥ ٥:٤١ م Dischner, Anton <Anton.Dischner@med.uni-muenchen.de> كتب:
Hi,

if you are using the same slow internet connection for WAL and data-transfer you might considering to use rsync which has a full set of throttling, resuming/ibcremential and as mentioned data-compression options so that you do not saturate you connection,

BTW nice challenge,

Best,

Anton

-----Ursprüngliche Nachricht-----
Von: Scott Ribe <scott_ribe@elevated-dev.com>
Gesendet: Dienstag, 7. Januar 2025 15:42
An: kasem adel <kasemadel8@gmail.com>
Cc: pgsql-admin <pgsql-admin@postgresql.org>
Betreff: Re: Advice Needed: Simultaneous Upgrade of Two-Node PostgreSQL 11 Cluster

> Are you mean do the following step:
>
> 1- upgrade primary node
> 2- take base backup from new  version in primary node and keep wal file for 3 days
> 3- move backup by external hard disk to the replica node
> 4- restore base backup to replica
> 5- start replication to replicate delta from primary node.
>
> Please confirm if this the best approach and it will work without risk.

Yes, that's what I meant. It will work, nothing is completely without risk ;-) Main thing is to make absolutely sure you don't lose WAL during that time period. If you could set up WAL archiving to push to the remote site, that would be great as then you could configure the remote to pull accumulated WAL locally instead of across the slow network link.

But, just thought of this:

- pg_upgrade both sides
- with neither side running, rsync the data directories (bonus points for being paranoid and using -c)
- fix up the standby flag on the standby
- fix up the postgresql.conf -- for instance, standby config has been moved out of a separate file into the main one, so that you can have common config both sides now, with the only necessary difference being the standby flag
- bring them up

You could even try to figure out where the catalog tables are stored and only rsync those, since pg_)upgrade doesn't change the format of your data files. But personally, I wouldn't. I wouldn't want to introduce the possibility of error on my part, and the rsync checksum is a nice check that nothing has gotten corrupted over time from network or disk glitch. (Excluding disk glitch on the primary...)

You may be wondering why not just pg_upgrade both sides? Well, pg_upgrade ourput should be deterministic, right? So if you make sure that clients are disconnected and standby is completely in sync before starting, why not? Maybe you could. But because it's not designed nor documented for that use, so although you likely could make it work, that's a dangerous path. The last thing you want to do is take your server down for this scheduled operation, and wind up at the end with an unusable standby.



Dear Anton 

For regular backup I mean take dump from specific database that we are use.

Thanks 

في الثلاثاء، ٧ يناير ٢٠٢٥ ٦:٢٦ م kasem adel <kasemadel8@gmail.com> كتب:
Dear Anton

I need the type of backup that I can take from primary to replica regular backup or base backup or copy data director. 

Thanks

في الثلاثاء، ٧ يناير ٢٠٢٥ ٥:٤١ م Dischner, Anton <Anton.Dischner@med.uni-muenchen.de> كتب:
Hi,

if you are using the same slow internet connection for WAL and data-transfer you might considering to use rsync which has a full set of throttling, resuming/ibcremential and as mentioned data-compression options so that you do not saturate you connection,

BTW nice challenge,

Best,

Anton

-----Ursprüngliche Nachricht-----
Von: Scott Ribe <scott_ribe@elevated-dev.com>
Gesendet: Dienstag, 7. Januar 2025 15:42
An: kasem adel <kasemadel8@gmail.com>
Cc: pgsql-admin <pgsql-admin@postgresql.org>
Betreff: Re: Advice Needed: Simultaneous Upgrade of Two-Node PostgreSQL 11 Cluster

> Are you mean do the following step:
>
> 1- upgrade primary node
> 2- take base backup from new  version in primary node and keep wal file for 3 days
> 3- move backup by external hard disk to the replica node
> 4- restore base backup to replica
> 5- start replication to replicate delta from primary node.
>
> Please confirm if this the best approach and it will work without risk.

Yes, that's what I meant. It will work, nothing is completely without risk ;-) Main thing is to make absolutely sure you don't lose WAL during that time period. If you could set up WAL archiving to push to the remote site, that would be great as then you could configure the remote to pull accumulated WAL locally instead of across the slow network link.

But, just thought of this:

- pg_upgrade both sides
- with neither side running, rsync the data directories (bonus points for being paranoid and using -c)
- fix up the standby flag on the standby
- fix up the postgresql.conf -- for instance, standby config has been moved out of a separate file into the main one, so that you can have common config both sides now, with the only necessary difference being the standby flag
- bring them up

You could even try to figure out where the catalog tables are stored and only rsync those, since pg_)upgrade doesn't change the format of your data files. But personally, I wouldn't. I wouldn't want to introduce the possibility of error on my part, and the rsync checksum is a nice check that nothing has gotten corrupted over time from network or disk glitch. (Excluding disk glitch on the primary...)

You may be wondering why not just pg_upgrade both sides? Well, pg_upgrade ourput should be deterministic, right? So if you make sure that clients are disconnected and standby is completely in sync before starting, why not? Maybe you could. But because it's not designed nor documented for that use, so although you likely could make it work, that's a dangerous path. The last thing you want to do is take your server down for this scheduled operation, and wind up at the end with an unusable standby.





Hello,

Could you first clarify the reason for using 'pg_dump' in the upgrade method?

In a PostgreSQL database environment managed by a Patroni cluster, if there aren't overly complex extensions in use, you can replicate the database to the new environment and then perform a failover during the upgrade process. By using the 'pg_upgrade' method, you can first run it with the '--check' flag to ensure compatibility, and then proceed with the upgrade.

Since the PostgreSQL cluster will be initialized with Patroni, there should be no issues, and this approach will likely reduce your workload.

If your database is small or involves a highly complex structure, 'pg_dump' is indeed an option. However, is it truly necessary? Testing this approach would provide more clarity. 

Dischner, Anton <Anton.Dischner@med.uni-muenchen.de>, 7 Oca 2025 Sal, 18:41 tarihinde şunu yazdı:
Hi,

if you are using the same slow internet connection for WAL and data-transfer you might considering to use rsync which has a full set of throttling, resuming/ibcremential and as mentioned data-compression options so that you do not saturate you connection,

BTW nice challenge,

Best,

Anton

-----Ursprüngliche Nachricht-----
Von: Scott Ribe <scott_ribe@elevated-dev.com>
Gesendet: Dienstag, 7. Januar 2025 15:42
An: kasem adel <kasemadel8@gmail.com>
Cc: pgsql-admin <pgsql-admin@postgresql.org>
Betreff: Re: Advice Needed: Simultaneous Upgrade of Two-Node PostgreSQL 11 Cluster

> Are you mean do the following step:
>
> 1- upgrade primary node
> 2- take base backup from new  version in primary node and keep wal file for 3 days
> 3- move backup by external hard disk to the replica node
> 4- restore base backup to replica
> 5- start replication to replicate delta from primary node.
>
> Please confirm if this the best approach and it will work without risk.

Yes, that's what I meant. It will work, nothing is completely without risk ;-) Main thing is to make absolutely sure you don't lose WAL during that time period. If you could set up WAL archiving to push to the remote site, that would be great as then you could configure the remote to pull accumulated WAL locally instead of across the slow network link.

But, just thought of this:

- pg_upgrade both sides
- with neither side running, rsync the data directories (bonus points for being paranoid and using -c)
- fix up the standby flag on the standby
- fix up the postgresql.conf -- for instance, standby config has been moved out of a separate file into the main one, so that you can have common config both sides now, with the only necessary difference being the standby flag
- bring them up

You could even try to figure out where the catalog tables are stored and only rsync those, since pg_)upgrade doesn't change the format of your data files. But personally, I wouldn't. I wouldn't want to introduce the possibility of error on my part, and the rsync checksum is a nice check that nothing has gotten corrupted over time from network or disk glitch. (Excluding disk glitch on the primary...)

You may be wondering why not just pg_upgrade both sides? Well, pg_upgrade ourput should be deterministic, right? So if you make sure that clients are disconnected and standby is completely in sync before starting, why not? Maybe you could. But because it's not designed nor documented for that use, so although you likely could make it work, that's a dangerous path. The last thing you want to do is take your server down for this scheduled operation, and wind up at the end with an unusable standby.





Dear sozcn

No I mean when take backup to restore it to replica after upgrade what type of backup I  need to take to enable replication after restore backup in my last email to upgrade.

Thanks


في الثلاثاء، ٧ يناير ٢٠٢٥ ١١:٤٣ م SOzcn <selahattinozcnma@gmail.com> كتب:


Hello,

Could you first clarify the reason for using 'pg_dump' in the upgrade method?

In a PostgreSQL database environment managed by a Patroni cluster, if there aren't overly complex extensions in use, you can replicate the database to the new environment and then perform a failover during the upgrade process. By using the 'pg_upgrade' method, you can first run it with the '--check' flag to ensure compatibility, and then proceed with the upgrade.

Since the PostgreSQL cluster will be initialized with Patroni, there should be no issues, and this approach will likely reduce your workload.

If your database is small or involves a highly complex structure, 'pg_dump' is indeed an option. However, is it truly necessary? Testing this approach would provide more clarity. 

Dischner, Anton <Anton.Dischner@med.uni-muenchen.de>, 7 Oca 2025 Sal, 18:41 tarihinde şunu yazdı:
Hi,

if you are using the same slow internet connection for WAL and data-transfer you might considering to use rsync which has a full set of throttling, resuming/ibcremential and as mentioned data-compression options so that you do not saturate you connection,

BTW nice challenge,

Best,

Anton

-----Ursprüngliche Nachricht-----
Von: Scott Ribe <scott_ribe@elevated-dev.com>
Gesendet: Dienstag, 7. Januar 2025 15:42
An: kasem adel <kasemadel8@gmail.com>
Cc: pgsql-admin <pgsql-admin@postgresql.org>
Betreff: Re: Advice Needed: Simultaneous Upgrade of Two-Node PostgreSQL 11 Cluster

> Are you mean do the following step:
>
> 1- upgrade primary node
> 2- take base backup from new  version in primary node and keep wal file for 3 days
> 3- move backup by external hard disk to the replica node
> 4- restore base backup to replica
> 5- start replication to replicate delta from primary node.
>
> Please confirm if this the best approach and it will work without risk.

Yes, that's what I meant. It will work, nothing is completely without risk ;-) Main thing is to make absolutely sure you don't lose WAL during that time period. If you could set up WAL archiving to push to the remote site, that would be great as then you could configure the remote to pull accumulated WAL locally instead of across the slow network link.

But, just thought of this:

- pg_upgrade both sides
- with neither side running, rsync the data directories (bonus points for being paranoid and using -c)
- fix up the standby flag on the standby
- fix up the postgresql.conf -- for instance, standby config has been moved out of a separate file into the main one, so that you can have common config both sides now, with the only necessary difference being the standby flag
- bring them up

You could even try to figure out where the catalog tables are stored and only rsync those, since pg_)upgrade doesn't change the format of your data files. But personally, I wouldn't. I wouldn't want to introduce the possibility of error on my part, and the rsync checksum is a nice check that nothing has gotten corrupted over time from network or disk glitch. (Excluding disk glitch on the primary...)

You may be wondering why not just pg_upgrade both sides? Well, pg_upgrade ourput should be deterministic, right? So if you make sure that clients are disconnected and standby is completely in sync before starting, why not? Maybe you could. But because it's not designed nor documented for that use, so although you likely could make it work, that's a dangerous path. The last thing you want to do is take your server down for this scheduled operation, and wind up at the end with an unusable standby.





In this case, I don’t think there’s a need to complicate the steps. If I were in your position, I would likely follow the steps outlined below:

- Set up a Patroni Cluster, then disable the Patroni Cluster service.
- Install the same PostgreSQL version in the new environment and create a replica using pg_basebackup.
- Set up a cluster for the new PostgreSQL version you intend to use in the new environment, whether it’s 15, 16, or 17, based on your preference.
- Update the Patroni Cluster configuration to point to the data path of the new version.
- When it’s time to upgrade, failover to the old version to make it the - --
- Primary, then perform the upgrade. Once upgraded, transition to the new version, upgrade it, and enable the Patroni Cluster.
- Afterward, to maintain the health of the environment, initiate backup processes. If you are using pgbackrest, start the backup workflows. If not, configure regular backups using pg_basebackup or pg_dump and automate them with crontab.

This is a general approach I’ve followed in similar cases.

kasem adel <kasemadel8@gmail.com>, 8 Oca 2025 Çar, 02:40 tarihinde şunu yazdı:
Dear sozcn

No I mean when take backup to restore it to replica after upgrade what type of backup I  need to take to enable replication after restore backup in my last email to upgrade.

Thanks


في الثلاثاء، ٧ يناير ٢٠٢٥ ١١:٤٣ م SOzcn <selahattinozcnma@gmail.com> كتب:


Hello,

Could you first clarify the reason for using 'pg_dump' in the upgrade method?

In a PostgreSQL database environment managed by a Patroni cluster, if there aren't overly complex extensions in use, you can replicate the database to the new environment and then perform a failover during the upgrade process. By using the 'pg_upgrade' method, you can first run it with the '--check' flag to ensure compatibility, and then proceed with the upgrade.

Since the PostgreSQL cluster will be initialized with Patroni, there should be no issues, and this approach will likely reduce your workload.

If your database is small or involves a highly complex structure, 'pg_dump' is indeed an option. However, is it truly necessary? Testing this approach would provide more clarity. 

Dischner, Anton <Anton.Dischner@med.uni-muenchen.de>, 7 Oca 2025 Sal, 18:41 tarihinde şunu yazdı:
Hi,

if you are using the same slow internet connection for WAL and data-transfer you might considering to use rsync which has a full set of throttling, resuming/ibcremential and as mentioned data-compression options so that you do not saturate you connection,

BTW nice challenge,

Best,

Anton

-----Ursprüngliche Nachricht-----
Von: Scott Ribe <scott_ribe@elevated-dev.com>
Gesendet: Dienstag, 7. Januar 2025 15:42
An: kasem adel <kasemadel8@gmail.com>
Cc: pgsql-admin <pgsql-admin@postgresql.org>
Betreff: Re: Advice Needed: Simultaneous Upgrade of Two-Node PostgreSQL 11 Cluster

> Are you mean do the following step:
>
> 1- upgrade primary node
> 2- take base backup from new  version in primary node and keep wal file for 3 days
> 3- move backup by external hard disk to the replica node
> 4- restore base backup to replica
> 5- start replication to replicate delta from primary node.
>
> Please confirm if this the best approach and it will work without risk.

Yes, that's what I meant. It will work, nothing is completely without risk ;-) Main thing is to make absolutely sure you don't lose WAL during that time period. If you could set up WAL archiving to push to the remote site, that would be great as then you could configure the remote to pull accumulated WAL locally instead of across the slow network link.

But, just thought of this:

- pg_upgrade both sides
- with neither side running, rsync the data directories (bonus points for being paranoid and using -c)
- fix up the standby flag on the standby
- fix up the postgresql.conf -- for instance, standby config has been moved out of a separate file into the main one, so that you can have common config both sides now, with the only necessary difference being the standby flag
- bring them up

You could even try to figure out where the catalog tables are stored and only rsync those, since pg_)upgrade doesn't change the format of your data files. But personally, I wouldn't. I wouldn't want to introduce the possibility of error on my part, and the rsync checksum is a nice check that nothing has gotten corrupted over time from network or disk glitch. (Excluding disk glitch on the primary...)

You may be wondering why not just pg_upgrade both sides? Well, pg_upgrade ourput should be deterministic, right? So if you make sure that clients are disconnected and standby is completely in sync before starting, why not? Maybe you could. But because it's not designed nor documented for that use, so although you likely could make it work, that's a dangerous path. The last thing you want to do is take your server down for this scheduled operation, and wind up at the end with an unusable standby.





> On Jan 7, 2025, at 6:50 AM, kasem adel <kasemadel8@gmail.com> wrote:
>
> Kindly be informed that database size is 1.8 TB and we have an low network bandwidth between two node 10 mbps and to
transferthe data from scratch that will take from 21 day to 30 day  for this we need solution in upgrade to prevent
loaddata from scratch . 

Yes, and I gave you alternatives taking that into account. I think you have all the answers you're going to get, and
nowhave to study them and choose an approach.