Thread: Feedback about hybrid SAN snap and rsync'd approach for large systemcloning
Feedback about hybrid SAN snap and rsync'd approach for large systemcloning
From
Jerry Sievers
Date:
Suppose we have a DB cluster with an additional tablespace and we are able to make an atomic SAN snapshot of *only* the main cluster volume... The additional tablespace contains only UNLOGGED relations. We cannot snap that volume so we use rsync as follows... 1. pg_start_backup('foo'); make SAN snapshot rsync the add'l tablespace pg_stop_backup() Now provision a new cluster around the snapshot and rsync'd volume, rejigger the pg_tblspc link if necessary... and start it up maybe or not having it remain as a streaming replica. It's been my experience that possibly bulky data in the additional tablespace does *not* need be rsync'd if we capture only the *_init files. Id' be curious to here feedback re the sanity of this approach. And would also like to know if perhaps *only* the directories under the rsync'd tablespace actually must be present for a successful recovery. The above approach has worked mumerous times even with origin systems having large, churny contents in the dedicated unlogged tablespace (which is on a much faster local NVME volume than the main SAN volume.) Thanks!
Re: Feedback about hybrid SAN snap and rsync'd approach for large systemcloning
From
Laurenz Albe
Date:
On Tue, 2022-04-12 at 18:07 -0500, Jerry Sievers wrote: > Suppose we have a DB cluster with an additional tablespace and we are > able to make an atomic SAN snapshot of *only* the main cluster > volume... > > The additional tablespace contains only UNLOGGED relations. > > We cannot snap that volume so we use rsync as follows... > > 1. pg_start_backup('foo'); > make SAN snapshot > rsync the add'l tablespace > pg_stop_backup() > > Now provision a new cluster around the snapshot and rsync'd volume, > rejigger the pg_tblspc link if necessary... and start it up maybe or > not having it remain as a streaming replica. > > It's been my experience that possibly bulky data in the additional > tablespace does *not* need be rsync'd if we capture only the *_init > files. > > Id' be curious to here feedback re the sanity of this approach. I thought that an unlogged table is rendered empty by recovery, but perhaps I misunderstood what you are doing. Yours, Laurenz Albe
Re: Feedback about hybrid SAN snap and rsync'd approach for large systemcloning
From
Stephen Frost
Date:
Greetings, * Jerry Sievers (gsievers19@comcast.net) wrote: > Suppose we have a DB cluster with an additional tablespace and we are > able to make an atomic SAN snapshot of *only* the main cluster > volume... > > The additional tablespace contains only UNLOGGED relations. > > We cannot snap that volume so we use rsync as follows... > > 1. pg_start_backup('foo'); > make SAN snapshot > rsync the add'l tablespace > pg_stop_backup() Surely you're also doing WAL archiving? You shouldn't ever be using start/stop backup without also doing WAL archiving as you must capture all of the WAL between the two. > Now provision a new cluster around the snapshot and rsync'd volume, > rejigger the pg_tblspc link if necessary... and start it up maybe or > not having it remain as a streaming replica. Dedicated backup tools know how to do tablespace remapping. > It's been my experience that possibly bulky data in the additional > tablespace does *not* need be rsync'd if we capture only the *_init > files. That isn't a trivial thing to do with rsync tho and "try it and see if it works" is a really bad way to go about this- it's important to understand what the files are in the data directory, how PG uses them, how PG recovery works, etc. That's why there are folks who write dedicated backup tools for PG (myself included). > Id' be curious to here feedback re the sanity of this approach. I wouldn't recommend it is the short answer. Note that you must be doing WAL archiving and you really should (and will have to, as of 15) use non-exclusive backup when you're doing this and be sure to copy the backup_label file from the pg_stop_backup/pg_backup_stop (in v15) results and put that into the snapshot. If you use exclusive mode then if the system crashes for any reason during the rsync then you'll have a good chance of ending up with a system that won't come back up until you go muck around in the data directory. It's not good and is why it's now been removed. Also, you'd have to craft a pretty ugly rsync to make it not copy data from the unlogged tables, and when you end up with a logged table in that tablespace (and you will, unless you're actively monitoring for it and remove any that pop up...) you could end up with data corruption. Don't think that you'll be able to use delta rsyncs with this either, as you may end up with files being changed without rsync realizing it and again end up with corruption. > And would also like to know if perhaps *only* the directories under > the rsync'd tablespace actually must be present for a successful > recovery. > > The above approach has worked mumerous times even with origin systems > having large, churny contents in the dedicated unlogged tablespace > (which is on a much faster local NVME volume than the main SAN > volume.) Dedicated backup tools already know how to recognize unlogged tables and skip them, along with being able to do incremental backup and delta restores, I'd strongly suggest you consider using such tools instead of trying to hack your own. Thanks, Stephen
Attachment
Re: Feedback about hybrid SAN snap and rsync'd approach for large systemcloning
From
Jerry Sievers
Date:
Hi Stephen, and thanks! Please see below... > Greetings, > * Jerry Sievers (gsievers19@comcast.net) wrote: > Suppose we have a DB cluster with an additional tablespace and we are > able to make an atomic SAN snapshot of *only* the main cluster > volume... > The additional tablespace contains only UNLOGGED relations. > We cannot snap that volume so we use rsync as follows... > 1. pg_start_backup('foo'); > make SAN snapshot > rsync the add'l tablespace > pg_stop_backup() > Surely you're also doing WAL archiving? You shouldn't ever be using > start/stop backup without also doing WAL archiving as you must capture > all of the WAL between the two. Oh, we're doing archiving all right, via pg_receivewal. And please stop calling me Shirley :-) > Now provision a new cluster around the snapshot and rsync'd volume, > rejigger the pg_tblspc link if necessary... and start it up maybe or > not having it remain as a streaming replica. > Dedicated backup tools know how to do tablespace remapping. Yep, I'm aware. The fixing of symlinks isn't much of a concern here, generally. > It's been my experience that possibly bulky data in the additional > tablespace does *not* need be rsync'd if we capture only the *_init > files. > That isn't a trivial thing to do with rsync tho and "try it and see if > it works" is a really bad way to go about this- it's important to > understand what the files are in the data directory, how PG uses them, > how PG recovery works, etc. That's why there are folks who write > dedicated backup tools for PG (myself included). Understood. It's a legacy configuration that we're working with that's used both for making full backups as well as frequently refreshed non-prod systems. Some of them are large enough to make compelling avoidance of any unnecessary data materialization. > Id' be curious to here feedback re the sanity of this approach. > I wouldn't recommend it is the short answer. Note that you must be > doing WAL archiving and you really should (and will have to, as of 15) Hmmm, we're using non-exclusive just as of my most recent updates... but I wasn't aware that exclusive mode was going away entirely w/V15. > use non-exclusive backup when you're doing this and be sure to copy the > backup_label file from the pg_stop_backup/pg_backup_stop (in v15) > results and put that into the snapshot. If you use exclusive mode > then if the system crashes for any reason during the rsync then you'll > have a good chance of ending up with a system that won't come back up > until you go muck around in the data directory. It's not good and is > why it's now been removed. Agree and I've certainly had a few cases of that in the past. Agree that it'll foil the less seasoned DBA. Yep and we do stash the labelfile output from pg_stop_backup in the snap after it's mounted on whatever box. > Also, you'd have to craft a pretty ugly rsync to make it not copy data > from the unlogged tables, and when you end up with a logged table in > that tablespace (and you will, unless you're actively monitoring for it > and remove any that pop up...) you could end up with data corruption. We are... but indeed the rsync include/exclude rules were fussy to develop and IMO non-trivial to really verify, certainly not ideal. For that matter, the event trigger that disallows creating logged tables I'm not certain is foolproof. > Don't think that you'll be able to use delta rsyncs with this either, as > you may end up with files being changed without rsync realizing it and > again end up with corruption. We don't do that anyhow. The snaps I'm talking about here are instantiated only once, then written to tape or brought up as non-prod systems... then torn all the way down. > And would also like to know if perhaps *only* the directories under > the rsync'd tablespace actually must be present for a successful > recovery. Forget I ever said the above which I'm pretty sure isn't workable and wouldn't buy much anyhow. > The above approach has worked numerous times even with origin > systems having large, churny contents in the dedicated unlogged > tablespace (which is on a much faster local NVME volume than the > main SAN volume.) > Dedicated backup tools already know how to recognize unlogged tables and > skip them, along with being able to do incremental backup and delta > restores, I'd strongly suggest you consider using such tools instead of > trying to hack your own. Point very well taken and it's about time my site had another look to see if such a solution is a good fit and /or to reverify that our backups are consistently recoverable. Thx again for weighing in. > Thanks, > Stephen