Re: block-level incremental backup - Mailing list pgsql-hackers
From | Stephen Frost
Subject | Re: block-level incremental backup
Date |
Msg-id | 20190917160908.GH6962@tamriel.snowman.net
In response to | Re: block-level incremental backup (Robert Haas <robertmhaas@gmail.com>)
Responses | Re: block-level incremental backup
List | pgsql-hackers

Greetings,

* Robert Haas (robertmhaas@gmail.com) wrote:
> On Mon, Sep 16, 2019 at 3:38 PM Stephen Frost <sfrost@snowman.net> wrote:
> > As discussed nearby, not everything that needs to be included in the backup is actually going to be in the WAL though, right? How would that ever be able to handle the case where someone starts the server under wal_level = logical, takes a full backup, then restarts with wal_level = minimal, writes out a bunch of new data, and then restarts back to wal_level = logical and takes an incremental?

> Fair point. I think the WAL-scanning approach can only work if wal_level > minimal. But, I also think that few people run with wal_level = minimal in this era where the default has been changed to replica; and I think we can detect the WAL level in use while scanning WAL. It can only change at a checkpoint.

We need to be sure that we can detect if the WAL level has ever been set to minimal between a full and an incremental and, if so, either refuse to run the incremental, or promote it to a full, or make it a checksum-based incremental instead of trusting the WAL stream (a rough sketch of what I mean is further down). I'm also glad that we ended up changing the default, though, and I do hope that there are relatively few people running with minimal and even fewer who play around with flipping it back and forth.

> > On larger systems, so many of the files are 1GB in size that checking the file size is quite close to meaningless. Yes, having to checksum all of the files definitely adds to the cost of taking the backup, but to avoid it we need strong assurances that a given file hasn't been changed since our last full backup. WAL, today at least, isn't quite that, and timestamps can possibly be fooled with, so if you'd like to be particularly careful, there doesn't seem to be a lot of alternatives.

> I see your points, but it feels like you're trying to talk down the WAL-based approach over what seem to me to be fairly manageable corner cases.

Just to be clear, I see your points and I like the general idea of finding solutions, but it seems like the issues are likely to be pretty complex and I'm not sure that's being appreciated very well.

> > I'm not asking you to be an expert on those systems, just to help me understand the statements you're making. How is backing up to a pgbackrest repo different than running a pg_basebackup in the context of using some other Enterprise backup system? In both cases, you'll have a full copy of the backup (presumably compressed) somewhere out on a disk or filesystem which is then backed up by the Enterprise tool.

> Well, I think that what people really want is to be able to backup straight into the enterprise tool, without an intermediate step.

Ok.. I can understand that, but I don't get how these changes to pg_basebackup will help facilitate that. If they don't and what you're talking about here is independent, then great, that clarifies things, but if you're saying that these changes to pg_basebackup are to help with backing up directly into those Enterprise systems, then I'm just asking for some help in understanding how- what's the use-case here that we're adding to pg_basebackup that makes it work with these Enterprise systems? I'm not trying to be difficult here, I'm just trying to understand.

> My basic point here is: As with practically all PostgreSQL development, I think we should try to expose capabilities and avoid making policy on behalf of users.
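(Coming back to the wal_level question above for a moment- the kind of decision I have in mind is sketched below. To be clear, this is just an illustration and none of these names come from any actual patch; the real thing would presumably be driven by spotting XLOG_PARAMETER_CHANGE records while scanning the WAL written between the start of the full backup and the start of the incremental.)

/*
 * Sketch only- not from any patch on this thread.  Given the wal_level
 * values observed (at checkpoints / parameter-change records) between the
 * prior full backup and now, decide how the incremental should be run.
 */
#include <stdbool.h>

typedef enum
{
    WAL_LEVEL_MINIMAL,          /* mirrors the server's WalLevel values */
    WAL_LEVEL_REPLICA,
    WAL_LEVEL_LOGICAL
} WalLevel;

typedef enum
{
    INCREMENTAL_WAL_BASED,      /* the WAL stream can be trusted */
    INCREMENTAL_CHECKSUM_BASED, /* fall back to comparing file contents */
    INCREMENTAL_REFUSED         /* refuse, or promote to a full backup */
} IncrementalDecision;

static IncrementalDecision
decide_incremental_mode(const WalLevel *levels_seen, int nlevels,
                        bool allow_checksum_fallback)
{
    for (int i = 0; i < nlevels; i++)
    {
        if (levels_seen[i] == WAL_LEVEL_MINIMAL)
        {
            /*
             * Some changes made while wal_level was minimal may never have
             * been WAL-logged, so the WAL stream alone can't tell us every
             * file or block that changed since the full backup.
             */
            return allow_checksum_fallback ? INCREMENTAL_CHECKSUM_BASED
                                           : INCREMENTAL_REFUSED;
        }
    }
    return INCREMENTAL_WAL_BASED;
}

Whether the fallback should be a checksum-based incremental, a promotion to a full, or an outright refusal is exactly the sort of thing I'd rather we expose as an option than decide for the user.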
> I'm not objecting to the idea of having tools that can help users figure out how much WAL they need to retain -- but insofar as we can do it, such tools should work regardless of where that WAL is actually stored.

How would that tool work, if it's to be able to work regardless of where the WAL is actually stored..? Today, pg_archivecleanup just works against a POSIX filesystem- are you thinking that the tool would have a pluggable storage system, so that it could work with, say, a POSIX filesystem, or a CIFS mount, or an s3-like system? (A rough sketch of the kind of interface I mean is further down.)

> I dislike the idea that PostgreSQL would provide something akin to a "pgbackrest repository" in core, or at least I think it would be important that we're careful about how much functionality gets tied to the presence and use of such a thing, because, at least based on my experience working at EnterpriseDB, larger customers often don't want to do it that way.

This seems largely independent of the above discussion, but since we're discussing it, I've certainly had various experiences in this area too- some larger customers would like to use an s3-like store (which pgbackrest already supports and will be supporting others going forward, as it has a pluggable storage mechanism for the repo...), then there are customers who would like to point their Enterprise backup solution at a directory on disk to back it up (which pgbackrest also supports, as mentioned previously), and lastly there are customers who really just want to back up the PG data directory and they'd like it to "just work", thank you, and no, they don't have any thought or concern about how to handle WAL, but surely it can't be that important, can it?

The last is tongue-in-cheek and I'm half-kidding there, but this is why I was trying to understand the comments above about the use-case we're trying to solve for that answers the call for the Enterprise software crowd, and ideally what distinguishes that from pgbackrest. Even just a clear-cut "this is what this change will do to make pg_basebackup work for Enterprise customers" would be great, or a "well, pg_basebackup today works for them because it does X and it'll continue to be able to do X even after this change."

I'll take a wild shot in the dark to try to help move us through this- is it that pg_basebackup can stream out to stdout in some cases..? Though that's quite limited, since it means you can't have additional tablespaces and you can't stream the WAL, and how would that work with the manifest idea that's being discussed..? If there's a directory that's got manifest files in it for each backup, so we have the file sizes for them, those would need to be accessible when we go to do the incremental backup and couldn't be stored off somewhere else, I wouldn't think..
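(Coming back to the pg_archivecleanup / pluggable storage question from earlier- the rough shape I'm imagining is below. Purely a sketch; none of these names exist anywhere today, and a real interface would obviously need error reporting, configuration of the store, timeline handling, and so on.)

/*
 * Sketch only- a hypothetical storage interface that a WAL-retention /
 * cleanup tool could be written against, so that the same logic works
 * whether the archived WAL lives on a POSIX filesystem, a CIFS mount, or
 * an s3-like store.
 */
#include <stdbool.h>
#include <string.h>

typedef struct WalStoreOps
{
    /* invoke cb once for each archived WAL file name in the store */
    void (*list) (void *store,
                  void (*cb) (const char *walfile, void *arg), void *arg);

    /* remove a single archived WAL file from the store */
    bool (*remove) (void *store, const char *walfile);
} WalStoreOps;

typedef struct CleanupState
{
    const WalStoreOps *ops;
    void       *store;
    const char *oldest_to_keep; /* e.g. "000000010000000000000010" */
} CleanupState;

static void
maybe_remove(const char *walfile, void *arg)
{
    CleanupState *cs = (CleanupState *) arg;

    /*
     * Segment file names compare lexicographically, which is what
     * pg_archivecleanup itself relies on today.
     */
    if (strcmp(walfile, cs->oldest_to_keep) < 0)
        cs->ops->remove(cs->store, walfile);
}

static void
cleanup_old_wal(const WalStoreOps *ops, void *store, const char *oldest_to_keep)
{
    CleanupState cs = {ops, store, oldest_to_keep};

    ops->list(store, maybe_remove, &cs);
}

The cleanup logic itself then doesn't care where the WAL actually is; only the list/remove implementations do.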
> > That's not great, of course, which is why there are trade-offs to be made, one of which typically involves using timestamps, but doing so quite carefully, to perform the file exclusion. Other ideas are great but it seems like WAL isn't really a great idea unless we make some changes there and we, as in PG, haven't got a robust "we know this file changed as of this point" to work from. I worry that we're putting too much faith into a system to do something independent of what it was actually built and designed to do, and thinking that because we could trust it for X, we can trust it for Y.

> That seems like a considerable overreaction to me based on the problems reported thus far. The fact is, WAL was originally intended for crash recovery and has subsequently been generalized to be usable for point-in-time recovery, standby servers, and logical decoding. It's clearly established at this point as the canonical way that you know what in the database has changed, which is the same need that we have for incremental backup.

Provided the WAL level is at the level that you need it to be, that will be true for the things which are actually supported with PITR, replication to standby servers, et al.

I can see how it might come across as an overreaction, but this strikes me as a pretty glaring issue and I worry that if it was overlooked until now there'll be other, more subtle issues. Backups are just plain complicated to get right to begin with, something that I don't think people appreciate until they've been dealing with them for quite a while. Not that this would be the first time we've had issues in this area, and we'd likely work through them over time, but I'm sure we'd all prefer to get it as close to right as possible the first time around, and that's going to require some pretty in-depth review.

> At any rate, the same criticism can be leveled - IMHO with a lot more validity - at timestamps. Last-modification timestamps are completely outside of our control; they are owned by the OS and various operating systems can and do have varying behavior. They can go backwards when things have changed; they can go forwards when things have not changed. They were clearly not intended to meet this kind of requirement. Even so, they were intended for that purpose much less than WAL, which was actually designed for a requirement in this general ballpark, if not this thing precisely.

While I understand that timestamps may be used for a lot of things and that the time on a system could go forward or backward, the actual requirement is:

- If the file was modified after the backup was done, the timestamp (or the size) needs to be different. It doesn't actually matter if it's forwards or backwards- different is all that's needed. The timestamp also needs to be before the backup started for the file to be considered an option to skip.

Is it possible for that to be fooled? Yes, of course, but it isn't as simply fooled as the typical "just copy files newer than X" issue that other tools have, at least if you're keeping a manifest of all of the files, et al, as discussed earlier.
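(Roughly, the rule I'm describing is the following- just a sketch to pin it down; the manifest structure here is made up for illustration and isn't anything from pg_basebackup or pgbackrest.)

/*
 * Sketch only.  Decide whether a file can be skipped by an incremental
 * backup, given what the prior backup's manifest recorded about it.
 */
#include <stdbool.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <time.h>

typedef struct ManifestEntry
{
    off_t   size;   /* size recorded by the prior backup */
    time_t  mtime;  /* last-modification time recorded by the prior backup */
} ManifestEntry;

static bool
can_skip_file(const char *path, const ManifestEntry *entry,
              time_t prior_backup_start)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return false;       /* can't stat it- re-copy it to be safe */

    if (st.st_size != entry->size)
        return false;       /* size changed */

    if (st.st_mtime != entry->mtime)
        return false;       /* mtime differs, forwards or backwards */

    if (st.st_mtime >= prior_backup_start)
        return false;       /* modified at or after the prior backup's start */

    return true;            /* unchanged as far as we can tell- safe to skip */
}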
Thanks,

Stephen