Re: pg_upgrade and rsync - Mailing list pgsql-hackers

From Jim Nasby
Subject Re: pg_upgrade and rsync
Date
Msg-id 54C29451.7020103@BlueTreble.com
In response to Re: pg_upgrade and rsync  (Stephen Frost <sfrost@snowman.net>)
Responses Re: pg_upgrade and rsync
List pgsql-hackers
On 1/22/15 7:54 PM, Stephen Frost wrote:
> * Bruce Momjian (bruce@momjian.us) wrote:
>> On Fri, Jan 23, 2015 at 01:19:33AM +0100, Andres Freund wrote:
>>> Or do you - as the text edited in your patch, but not the quote above -
>>> mean to run pg_upgrade just on the primary and then rsync?
>>
>> No, I was going to run it on both, then rsync.
> I'm pretty sure this is all a lot easier than you believe it to be.  If
> you want to recreate what pg_upgrade does to a cluster then the simplest
> thing to do is rsync before removing any of the hard links.  rsync will
> simply recreate the same hard link tree that pg_upgrade created when it
> ran, and update files which were actually changed (the catalog tables).
>
> The problem, as mentioned elsewhere, is that you have to checksum all
> the files because the timestamps will differ.  You can actually get
> around that with rsync if you really want though- tell it to only look
> at file sizes instead of size+time by passing in --size-only.

What if, instead of trying to handle that on the rsync side, we changed pg_upgrade so that it created hard links that
had the same timestamp as the original file?
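
For reference on the hard-link idea, here is a minimal Python sketch, assuming a POSIX file system where a hard link is a second directory entry for the same inode. It only shows what a link reports for inode and mtime relative to the file it was created from; whether that helps when comparing a primary's files against a standby's is a separate question. The file names and the helper link_shares_mtime are made up for illustration and are not anything pg_upgrade provides.

```python
import os
import tempfile


def link_shares_mtime(directory):
    """Create a file plus a hard link to it and compare their metadata.

    Returns (same_inode, same_mtime). The file names are hypothetical
    stand-ins for a relation file in the old and new cluster.
    """
    original = os.path.join(directory, "16384")
    linked = os.path.join(directory, "16384.link")

    with open(original, "wb") as f:
        f.write(b"\0" * 8192)  # one 8 kB block of zeros

    os.link(original, linked)  # hard link: second name, same inode

    a, b = os.stat(original), os.stat(linked)
    return a.st_ino == b.st_ino, a.st_mtime_ns == b.st_mtime_ns


with tempfile.TemporaryDirectory() as d:
    same_inode, same_mtime = link_shares_mtime(d)
    print("same inode:", same_inode, "same mtime:", same_mtime)
```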
 

That said, the whole timestamp race condition in rsync gives me the heebie-jeebies. For normal workloads maybe it's not
that big a deal, but when dealing with fixed-size data (i.e., Postgres blocks)? Eww.
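
To make the worry concrete, here is a rough Python sketch of the failure mode. It assumes a simplified model of rsync's default quick check (same size and same whole-second mtime means "unchanged"); real rsync has more options and details than this. quick_check_same and demo are names invented for the sketch, and the "same timestamp tick" is simulated with os.utime rather than by actually racing a copy.

```python
import hashlib
import os
import tempfile

BLOCK_SIZE = 8192  # default Postgres block size


def quick_check_same(a, b):
    """Simplified stand-in for a size+mtime comparison (an assumption,
    not rsync's actual code)."""
    sa, sb = os.stat(a), os.stat(b)
    return sa.st_size == sb.st_size and int(sa.st_mtime) == int(sb.st_mtime)


def digest(path):
    """Checksum of the full file contents."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()


def demo(directory):
    src = os.path.join(directory, "src")
    dst = os.path.join(directory, "dst")
    stamp = (1_000_000_000, 1_000_000_000)  # arbitrary fixed atime/mtime

    # Start with two identical one-block files carrying the same mtime.
    for path in (src, dst):
        with open(path, "wb") as f:
            f.write(b"A" * BLOCK_SIZE)
        os.utime(path, stamp)

    # Rewrite the block in place on the source; the size does not change.
    with open(src, "r+b") as f:
        f.write(b"B" * BLOCK_SIZE)

    # Simulate the write landing in the same timestamp tick as the copy.
    os.utime(src, stamp)

    return quick_check_same(src, dst), digest(src) == digest(dst)


with tempfile.TemporaryDirectory() as d:
    looks_same, is_same = demo(d)
    print("quick check says unchanged:", looks_same)
    print("contents actually equal:", is_same)
```

With fixed-size blocks the size never changes, so in this model the mtime is the only thing standing between a modified file and being skipped.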
 

How horribly difficult would it be to allow pg_upgrade to operate on multiple servers? Could we have it create a shell
script instead of directly modifying things itself? Or perhaps some custom "command file" that could then be replayed by
pg_upgrade on another server? Of course, that's assuming that replicas are compatible enough with masters for that to
work...
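
A very rough Python sketch of what I mean by a "command file", purely hypothetical: pg_upgrade has no such feature, and the JSON-lines format, the "link" operation, and the names record_link, write_command_file and replay are all invented here. It assumes both servers use the same relative layout under their data roots, which is exactly the compatibility assumption mentioned above.

```python
import json
import os
import tempfile


def record_link(log, src, dst):
    """Record a hard-link operation using paths relative to the data root."""
    log.append({"op": "link", "src": src, "dst": dst})


def write_command_file(log, path):
    """Write one JSON object per line (hypothetical format)."""
    with open(path, "w") as f:
        for cmd in log:
            f.write(json.dumps(cmd) + "\n")


def replay(command_file, root):
    """Re-apply recorded operations under another server's data root."""
    with open(command_file) as f:
        for line in f:
            cmd = json.loads(line)
            if cmd["op"] != "link":
                raise ValueError("unknown operation: %r" % cmd["op"])
            src = os.path.join(root, cmd["src"])
            dst = os.path.join(root, cmd["dst"])
            os.makedirs(os.path.dirname(dst), exist_ok=True)
            os.link(src, dst)


def demo(root):
    """Build a tiny fake 'old' tree, record one link, and replay it."""
    old_file = os.path.join(root, "old", "base", "1", "16384")
    os.makedirs(os.path.dirname(old_file))
    with open(old_file, "wb") as f:
        f.write(b"\0" * 8192)

    log = []
    record_link(log, "old/base/1/16384", "new/base/1/16384")
    command_file = os.path.join(root, "upgrade.cmds")
    write_command_file(log, command_file)

    replay(command_file, root)
    return os.path.samefile(old_file, os.path.join(root, "new", "base", "1", "16384"))


with tempfile.TemporaryDirectory() as d:
    print("replayed link points at the same file:", demo(d))
```

This only covers the linking step; the catalog changes pg_upgrade makes are not something a file like this could capture.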
-- 
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


