Re: pg_upgrade and rsync - Mailing list pgsql-hackers

From Bruce Momjian
Subject Re: pg_upgrade and rsync
Date
Msg-id 20150123205246.GB25726@momjian.us
Whole thread Raw
In response to Re: pg_upgrade and rsync  (Stephen Frost <sfrost@snowman.net>)
List pgsql-hackers
On Fri, Jan 23, 2015 at 02:34:36PM -0500, Stephen Frost wrote:
> > Why? Just rsync the new data directory onto the old directory on the
> > standbys. That's fine and simple.
> 
> That still doesn't address the need to use --size-only, it would just
> mean that you don't need to use -H.  If anything the -H part is the
> aspect which worries me the least about this approach.

It took me a while to understand what Stephen was saying, so let me
explain the details so everyone can get on the same page.

First, let's look at the downsides of using non-hardlink rsync against a
slave cluster, whether we run pg_upgrade on the slave or not:
o  must preserve db directory and relfilenodes (4 new things for   pg_upgrade to preserve)o  must checksum files
becausethere is no way to distinguish   user tables/indexes (which don't need to be copied) from system
tables/indexes(which must be copied so it is in sync with   the master)o  must use log_wal_hints when the slave is
installedso the   checksums match
 

So, even if if all the checksums work, it will be slow/expensive.

Stephen's idea is quite interesting.  You run pg_upgrade on the master,
then, before you start the new server, you run rsync with special flags
and sync the old _and_ new clusters on the master with just the old
cluster on the standby (yeah, odd).  Yes, this is very odd, and where I
got lost too.

First, this only works when pg_upgrade is run in --link mode.  What
rsync --hard-links --size-only is going to do is record which files have
hard links, remember their inode numbers, and cross-reference the
hard-linked files.  When doing the rsync remote comparisons, the
master's old relfilenodes will match the standby's old relfilenodes, and
because we are using --size-only, they will be considered identical and
not copied, or even checksumed.  When it goes to do the standby's new
cluster, none of the directories will exist, so they will all be copied
along with the system objects (they are small), but the user
tables/indexes will be identified as already existing in the slave's old
cluster so it will hard-link to those standby's old cluster files. Once
rsync is complete, you can delete the old cluster on master and standby.
This is effectively duplicating the way pg_upgrade works.

What is interesting is that this will work on any version of pg_upgrade,
with no modifications, as long as link mode is used.  You _cannot_ run
initdb on the standby, as this will create system files that would
prevent the master's system files from being copied.  This is also going
to remove your recovery.conf on the standby, and replace your
postgresql.conf with the master's, so any modifications you made to the
standby will have to be saved and restored in to the new cluster before
starting.

I plan to run some tests soon to verify this method works, and if so, I
can document it in the pg_upgrade manual page.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + Everyone has their own god. +



pgsql-hackers by date:

Previous
From: Stephen Frost
Date:
Subject: Re: pgaudit - an auditing extension for PostgreSQL
Next
From: Alvaro Herrera
Date:
Subject: Re: REVIEW: Track TRUNCATE via pgstat