Greetings,
* Martín Fernández (fmartin91@gmail.com) wrote:
> I wrote a couple of questions about pg_upgrade and updating standbys using rsync last week. We were able to
> successfully upgrade half of our cluster (the other half was kept for failover) from pg92 with postgis 1.5.8 to pg10
> with postgis 2.4. It was a really interesting challenge because of postgis binary incompatibility for geometry data
> types.
>
> The rsync call that we used looked exactly like this (taken from pg_upgrade man page basically):
>
> `rsync --verbose --verbose --progress --archive --delete --hard-links --size-only --no-inc-recursive /var/lib/postgres/9.2 /var/lib/postgres/10 $REPLICA_IP:/var/lib/postgres`
>
> We are now in the process of upgrading the other half of the cluster, since we have concluded that the upgrade was
> successful.
>
> We are planning on using the same rsync call to upgrade the rest of the standbys (in combination with the
> pg_start_backup/pg_stop_backup low-level API). My only concern is that I'm not 100% sure whether the `--size-only` flag will
> be enough to guarantee that the files are the same. On the initial set of standbys that we upgraded this shouldn't have caused
> an issue, since the standbys were at the same last checkpoint as the master and we did the rsync call before starting
> the primary (after running pg_upgrade).
No, you can't use --size-only to rebuild those replicas while the
primary is online, even if you're using pg_start/stop_backup. You should
really enable rsync's checksum-based comparison (--checksum) to make sure
that you're copying all of the files that you need from the primary to
the replica during the pg_start/stop_backup. You then need to make sure
that an appropriate backup_label is installed on all of the replicas, so
that they replay from the pg_start_backup checkpoint through to the end
of the pg_stop_backup. You would have a recovery.conf file already, but
you might need to make sure it has a restore_command which can pull back
WAL that might have already been archived by the primary.
Note that this method of rebuilding the replicas will likely be
time-consuming but unfortunately it's necessary. There are alternatives
to using rsync to perform this if you need to get it done faster.
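As a rough sketch only, the checksum-based call for one replica might look something like the following, run between pg_start_backup and pg_stop_backup. The function name, the NEW_DIR variable, the postmaster.pid exclude and the RSYNC override (there only so the command can be printed instead of run) are my own assumptions for illustration, not something taken from the pg_upgrade documentation, so adjust them to your layout:

```shell
#!/bin/sh
# Sketch: checksum-based rsync of the new cluster directory to one replica.
# NEW_DIR is an assumed location; it defaults to the path from the original call.
NEW_DIR=${NEW_DIR:-/var/lib/postgres/10}

sync_replica() {
    replica=$1
    # --checksum compares file contents rather than trusting size/mtime.
    # postmaster.pid belongs to the running primary, so it is excluded
    # (an assumption of this sketch; other files may need excluding too).
    # RSYNC is a hypothetical hook: set RSYNC=echo to print the command only.
    ${RSYNC:-rsync} --verbose --archive --delete --hard-links --checksum \
        --exclude=postmaster.pid \
        "$NEW_DIR" "$replica:/var/lib/postgres"
}

# Example (prints the command instead of running it):
#   RSYNC=echo; sync_replica "$REPLICA_IP"
```

This only covers the file copy; the backup_label and restore_command handling described above still has to be done separately.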
> Are there any potential issues that could show up if we do it with --size-only? Should we use the default rsync
> mechanism that would check for size and timestamps?
I wouldn't trust just size/timestamp in this case; you really should use
checksums.
> Hoping someone has some better experience than me on upgrading standbys using rsync.
The rsync-based pg_upgrade mechanism for replicas *only* works when it's
done after all of the systems have been shut down and you've verified
that all the nodes reached the same shutdown checkpoint. It is *not*
appropriate for online rebuilding of replicas.
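
For the offline case, one way to do that verification is to compare the "Latest checkpoint location" line that pg_controldata prints on each node after shutdown. A minimal sketch, where the two helper names are hypothetical and the data directory path in the usage comment is assumed from the earlier rsync call:

```shell
#!/bin/sh
# Read pg_controldata output on stdin and print the value of the
# "Latest checkpoint location" line (not the REDO location line).
latest_checkpoint() {
    awk -F': *' '/^Latest checkpoint location/ { print $2 }'
}

# Succeed only if both values are non-empty and identical.
same_checkpoint() {
    [ -n "$1" ] && [ "$1" = "$2" ]
}

# Example usage, with every node already shut down (paths are assumptions):
#   primary=$(pg_controldata /var/lib/postgres/9.2 | latest_checkpoint)
#   replica=$(ssh "$REPLICA_IP" pg_controldata /var/lib/postgres/9.2 | latest_checkpoint)
#   same_checkpoint "$primary" "$replica" || echo "checkpoint mismatch, do not rsync"
```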
Thanks!
Stephen