Re: pg_combinebackup does not detect missing files - Mailing list pgsql-hackers
From: Robert Haas
Subject: Re: pg_combinebackup does not detect missing files
Msg-id: CA+TgmoYX+mw48tTs=7-r-ZJiuiAGeHv4JhPeeQa4tJ8apZDjcg@mail.gmail.com
In response to: Re: pg_combinebackup does not detect missing files (David Steele <david@pgmasters.net>)
Responses: Re: pg_combinebackup does not detect missing files
List: pgsql-hackers
On Wed, Apr 17, 2024 at 7:09 PM David Steele <david@pgmasters.net> wrote:
> I think here:
>
> + <application>pg_basebackup</application> only attempts to verify
>
> you mean:
>
> + <application>pg_combinebackup</application> only attempts to verify
>
> Otherwise this looks good to me.

Good catch, thanks. Committed with that change.

> Fair enough. I accept that your reasoning is not random, but I'm still
> not very satisfied that the user needs to run a separate and rather
> expensive process to do the verification when pg_combinebackup already
> has the necessary information at hand. My guess is that most users will
> elect to skip verification.

I think you're probably right that a lot of people will skip it; I'm just
less convinced than you are that that's a bad thing. It's not a *great*
thing if people skip it, but restore time is just about the worst time to
find out that you have a problem with your backups. I think users would be
better served by verifying stored backups periodically, when they *don't*
need to restore them.

Also, saying that we already have all of the information we need to do
the verification is only partially true:

- We do have to parse the manifest anyway, but we don't otherwise have to
  compute checksums, and I think that cost can be significant even for
  CRC-32C and much more significant for any of the SHA variants.

- We don't need to read all of the files in all of the backups. If there's
  a newer full backup, the corresponding file in older backups, whether
  full or incremental, need not be read.

- Incremental files other than the most recent only need to be read to the
  extent that we need their data; if some of the same blocks have been
  changed again, we can economize.

How much you save because of these effects is pretty variable. Best case,
you have a 2-backup chain with no manifest checksums, and all that
verification has to do beyond what's otherwise needed is walk each older
directory tree in toto and cross-check which files exist against the
manifest. That's probably cheap enough that nobody would be too fussed.
Worst case, you have a 10-backup (or whatever) chain with SHA512 checksums
and, say, a 50% turnover rate. In that case, I think having verification
happen automatically could be a pretty major hit, in terms of both I/O and
CPU: if your database is 1TB, it's ~5.5TB of read I/O (one 1TB full backup
plus nine 0.5TB incrementals) instead of ~1TB, plus the checksumming.

Now, obviously you can still feel that it's totally worth it, or that
someone in that situation shouldn't even be using incremental backups;
it's a value judgement, so fair enough. But my guess is that the efforts
this implementation makes to minimize the amount of I/O required for a
restore are going to be important for a lot of people.

> At least now they'll have the information they need to make an informed
> choice.

Right.

--
Robert Haas
EDB: http://www.enterprisedb.com
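As a rough illustration of the periodic-verification workflow suggested
above: a minimal sketch that runs pg_verifybackup over a directory of
stored backups, e.g. from a nightly cron job. The BACKUP_ROOT path and the
one-directory-per-backup layout are assumptions made for the example, not
anything specified in this thread.

```python
#!/usr/bin/env python3
# Sketch: verify stored backups periodically, while they are at rest,
# instead of discovering problems at restore time. Assumes each backup
# (with its backup_manifest) lives in its own subdirectory of BACKUP_ROOT.
import subprocess
import sys
from pathlib import Path

BACKUP_ROOT = Path("/var/backups/pg")  # assumed location; adjust to taste

def main() -> int:
    failures = 0
    for backup in sorted(p for p in BACKUP_ROOT.iterdir() if p.is_dir()):
        # pg_verifybackup checks the backup directory against its
        # manifest: which files exist, their sizes, and (if the manifest
        # has them) their checksums.
        result = subprocess.run(
            ["pg_verifybackup", str(backup)],
            capture_output=True, text=True,
        )
        if result.returncode != 0:
            failures += 1
            print(f"FAILED: {backup}\n{result.stderr}", file=sys.stderr)
        else:
            print(f"ok: {backup}")
    return 1 if failures else 0

if __name__ == "__main__":
    sys.exit(main())
```

Run on a schedule, this catches missing or corrupted files in any backup
in the chain long before a restore is attempted, without adding any I/O to
the restore itself.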
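To spell out the arithmetic behind the ~5.5TB worst-case figure above
(this only restates the mail's own example, with its assumptions as named
variables):

```python
# Worst-case example from the mail: 1TB database, a chain of one full
# backup plus nine incrementals, 50% block turnover between backups.
full_tb = 1.0        # size of the full backup
n_incrementals = 9   # incrementals in the 10-backup chain
turnover = 0.5       # fraction of blocks changed per incremental

verify_all = full_tb + n_incrementals * (full_tb * turnover)
print(verify_all)    # 5.5 -- TB read if every backup in the chain is verified
print(full_tb)       # ~1.0 -- TB read for the restore alone
```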