Re: pg_combinebackup does not detect missing files - Mailing list pgsql-hackers

From David Steele
Subject Re: pg_combinebackup does not detect missing files
Msg-id 908d3845-e6dd-43cb-82f3-56f11b57a98f@pgmasters.net
In response to Re: pg_combinebackup does not detect missing files  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: pg_combinebackup does not detect missing files  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On 4/22/24 23:53, Robert Haas wrote:
> On Sun, Apr 21, 2024 at 8:47 PM David Steele <david@pgmasters.net> wrote:
>>> I figured that wouldn't be particularly meaningful, and
>>> that's pretty much the only kind of validation that's even
>>> theoretically possible without a bunch of extra overhead, since we
>>> compute checksums on entire files rather than, say, individual blocks.
>>> And you could really only do it for the final backup in the chain,
>>> because you should end up accessing all of those files, but the same
>>> is not true for the predecessor backups. So it's a very weak form of
>>> verification.
>>
>> I don't think it is weak if you can verify that the output is exactly as
>> expected, i.e. all files are present and have the correct contents.
> 
> I don't understand what you mean here. I thought we were in agreement
> that verifying contents would cost a lot more. The verification that
> we can actually do without much cost can only check for missing files
> in the most recent backup, which is quite weak. pg_verifybackup is
> available if you want more comprehensive verification and you're
> willing to pay the cost of it.

I simply meant that it is *possible* to verify the output of 
pg_combinebackup without explicitly verifying all the backups. There 
would be overhead, yes, but it would be less than verifying each backup 
individually. For my 2c that efficiency would make it worth doing 
verification in pg_combinebackup, with perhaps a switch to turn it off 
if the user is confident in their sources.
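
As a rough illustration of that idea, and not pg_combinebackup's actual code, 
the check could look something like the sketch below. It assumes the 
expected output is described by a simple mapping of relative path to 
SHA-256 checksum; that mapping, the function names, and the 
check_contents switch are all hypothetical, and a real backup manifest 
carries more fields and may use a different checksum algorithm.

```python
import hashlib
from pathlib import Path


def file_sha256(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_output(output_dir, expected, check_contents=True):
    """Compare a combined-backup output directory against expectations.

    `expected` is a hypothetical {relative_path: sha256_hex} mapping.
    With check_contents=False only missing files are reported, which
    is the cheap form of the check; with True the contents are hashed
    as well, which costs one full read of the output.
    """
    root = Path(output_dir)
    problems = []
    for rel_path, checksum in sorted(expected.items()):
        target = root / rel_path
        if not target.is_file():
            problems.append(("missing", rel_path))
        elif check_contents and file_sha256(target) != checksum:
            problems.append(("checksum mismatch", rel_path))
    return problems
```

Passing check_contents=False corresponds to the "switch to turn it off" 
case: the missing-file check still runs, but the output is not read 
back and hashed.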

>> I think it is a worthwhile change and we are still a month away from
>> beta1. We'll see if anyone disagrees.
> 
> I don't plan to press forward with this in this release unless we get
> a couple of +1s from disinterested parties. We're now two weeks after
> feature freeze and this is design behavior, not a bug. Perhaps the
> design should have been otherwise, but two weeks after feature freeze
> is not the time to debate that.

It doesn't appear that anyone but me is terribly concerned about 
verification, even in this weak form, so probably best to hold this 
patch until the next release. As you say, it is late in the game.

Regards,
-David


