Re: pg_upgrade + replica servers + rsync --size-only is unsafe - Mailing list pgsql-admin

From Rob Emery
Subject Re: pg_upgrade + replica servers + rsync --size-only is unsafe
Date
Msg-id CAPCETptm3cV_sr4uXgA_55Tc20RQ-1EN=fOBaH8+WhpRU_pZvw@mail.gmail.com
Whole thread Raw
In response to Re: pg_upgrade + replica servers + rsync --size-only is unsafe  (Stephen Frost <sfrost@snowman.net>)
Responses Re: pg_upgrade + replica servers + rsync --size-only is unsafe  (Stephen Frost <sfrost@snowman.net>)
List pgsql-admin
Hello Stephen,

>Then you run the rsync, at the /srv level, and all those 'new files' in
>/srv/new_cluster on the primary should get copied over to
>/srv/new_cluster on the replica, while any files which are hard-linked
>between old_cluster and new_cluster should end up as hard links on the
>replica.
>
>Thanks,

Thanks very much for your reply.  

We've re-ran our test upgrade this morning just to be certain and we reproduced the same result once again, it appears that the issue is not that the pg_controlfile is hardlinked, but that it exists at all.

The output of pg_controldata shows a different ID on the master and the secondary as you suspected:

Master:

$ ./pg_controldata /var/lib/postgresql/9.6/main
pg_control version number:            960
Catalog version number:               201608131
Database system identifier:           6901965895850147745
Database cluster state:               in production
pg_control last modified:             Thu 03 Dec 2020 10:11:21 GMT

Secondary:

$ ./pg_controldata /var/lib/postgresql/9.6/main
pg_control version number:            960
Catalog version number:               201608131
Database system identifier:           6901966223447496344
Database cluster state:               shut down
pg_control last modified:             Thu 03 Dec 2020 09:54:00 GMT

The pg_control file does exist on the secondary 9.6 prior to the rsync because it is created by the initdb command (which is step 4. of the instructions - https://www.postgresql.org/docs/9.6/pgupgrade.html; we verified that by running `/usr/lib/postgresql/9.6/bin/initdb /var/lib/postgresql/999/main` and it creates a pg_control file).  

The pg_control files are both exactly 8192 bytes; removing the --size-only option resolves this and causes rsync to copy the control file across, which can be confirmed by the --dry-run option:

$ rsync --verbose --dry-run --archive --delete --hard-links --no-inc-recursive /media/postgresql/data/main /media/postgresql/data/9.6 PGRETESTA02:/media/postgresql/data
building file list ... done

... <snip>

9.6/main/global/pg_control

... <snip>

sent 252,838 bytes  received 666 bytes  507,008.00 bytes/sec
total size is 12,061,021,649  speedup is 47,577.24 (DRY RUN)

So it seems like setting --size-only guarantees disaster in this flow? However we are definitely confused as to why this isn't hit more often, as it seems that this would always happen?

I've attached the following that might clarify:
- raw output of the rsync that we ran during step 12.a of our instructions (master-rsync.txt)
- pg_controldata raw output from both master and secondary once the error occurs (master-pgcontroldata.txt and secondary-pgcontroldata.txt)
- A copy of our internal instructions with the commands that we're using at each step to build our script. We've based this on the instructions at https://www.postgresql.org/docs/9.6/pgupgrade.html with some tweaks for our configuration.

Some things to note about our setup which I'm adding for completeness as it may have a bearing on the result we're seeing:
- the 9.5 database is currently in /media/postgresql/data/main and we are moving to a /media/postgresql/data/main/$version structure to make it easier for upgrades as part of this move (hence why the rsync command isn't 'symmetrical' in the arguments)
- additionally we symlink the /var/lib/postgresql/$version/main directory off to that location.
- We're using Debian 9 and the postgresql apt packages
- As we are using Debian, the configuration for the postgresql DB's lives in /etc/postgresql/$version/main

Many Thanks
Rob

On Wed, 2 Dec 2020 at 17:28, Stephen Frost <sfrost@snowman.net> wrote:
Greetings,

* Rob Emery (re-pgsql@codeweavers.net) wrote:
> We're pretty sure that we've just hit a scenario where the instructions for
> pg_upgrade with standby servers aren't quite safe.

I've pretty much always felt those instructions should come with a
"expert use only" note.

> We were testing our upgrade process with a copy of our live cluster in a
> lab environment and we found that using
>
> rsync --verbose --archive --delete --hard-links --size-only
> --no-inc-recursive /media/postgresql/data/main /media/postgresql/data/9.6
> PGRETESTA02:/media/postgresql/data
>
> Resulted in:
>
> 2020-12-02 14:49:11.513 GMT [20884-1] LOG:  database system was shut
> down in recovery at 2020-12-02 13:57:56 GMT
> 2020-12-02 14:49:11.513 GMT [20884-2] LOG:  entering standby mode
> 2020-12-02 14:49:11.557 GMT [20884-3] LOG:  consistent recovery state
> reached at 19E/25000098
> 2020-12-02 14:49:11.557 GMT [20884-4] LOG:  invalid record length at
> 19E/25000098: wanted 24, got 0
> 2020-12-02 14:49:11.559 GMT [20883-1] LOG:  database system is ready
> to accept read only connections
> 2020-12-02 14:49:11.593 GMT [20888-1] FATAL:  database system
> identifier differs between the primary and standby
> 2020-12-02 14:49:11.593 GMT [20888-2] DETAIL:  The primary's
> identifier is 6901669428825624285, the standby's identifier is
> 690161835164
> 1138930.
>
> however if we don't use --size-only, then the process worked fine and we a
> smooth upgrade.

This doesn't look like a --size-only issue though- how did you end up
with this when the pg_controldata file (where the system identifier is
pulled from) shouldn't even exist and won't be a hardlink to the
existing one on the old system since it's a new cluster, and therefore
should definitely be copied?

When you go to run the rsync to get the replica up to date you should
have a system that looks like this:

PRIMARY:

/srv/old_cluster
   ... lots of files
/srv/new_cluster
   ... lots of *new* files, including pg_controldata
   ... hardlinks to PG table/index files that aren't part of the catalog

REPLICA:

/srv/old_cluster
   ... lots of files
/srv/new_cluster
   ... entirely empty

Then you run the rsync, at the /srv level, and all those 'new files' in
/srv/new_cluster on the primary should get copied over to
/srv/new_cluster on the replica, while any files which are hard-linked
between old_cluster and new_cluster should end up as hard links on the
replica.

Thanks,

Stephen


--
Robert Emery
Infrastructure Director

01785 711633

<> Codeweavers

Phone:  0800 021 0888
Website:  codeweavers.net

Barn 4, Dunston Business Village, ST18 9AB. Registered in England and
Wales No. 04092394 VAT registration no. 974 9705 63


Attachment

pgsql-admin by date:

Previous
From: John Scalia
Date:
Subject: Re: Seeing privileges on a schema
Next
From: Stephen Frost
Date:
Subject: Re: pg_upgrade + replica servers + rsync --size-only is unsafe