Re: Online verification of checksums - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: Online verification of checksums
Msg-id a9aa939f-fcf6-017d-d7ce-0a7d26a72cc2@2ndquadrant.com
In response to Re: Online verification of checksums  (Michael Paquier <michael@paquier.xyz>)
Responses Re: Online verification of checksums
List pgsql-hackers
On 3/5/19 4:12 AM, Michael Paquier wrote:
> On Mon, Mar 04, 2019 at 03:08:09PM +0100, Tomas Vondra wrote:
>> I still don't understand what issue you see in how basebackup verifies
>> checksums. Can you point me to the explanation you've sent after 11 was
>> released?
> 
> The history is mostly on this thread:
> https://www.postgresql.org/message-id/20181020044248.GD2553@paquier.xyz
> 

Thanks, will look.

Based on a quick skim of that thread, the main issue seems to be
deciding which files in the data directory are expected to have
checksums. That is a valid issue, of course, but I was expecting
something about partial reads/writes etc.

>> So you have a workload/configuration that actually results in data
>> corruption yet we fail to detect that? Or we generate false positives?
>> Or what do you mean by "100% safe" here?
> 
> What's proposed on this thread could generate false positives.  Checks
> which have deterministic properties and clean failure handling are
> reliable when it comes to reports.

My understanding is that:

(a) The checksum verification should not generate false positives (same
as for basebackup).

(b) The partial reads do emit warnings, which might be considered false
positives, I guess. That is why I'm arguing for changing it to do the
same thing basebackup does, i.e. ignore them.
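
To make the distinction in (a) and (b) concrete, here is a minimal
Python sketch of a verifier that counts short (partial) reads as
skipped instead of reporting them as checksum failures. It is an
illustration under stated assumptions, not the patch discussed here and
not basebackup's code: the page layout (a 4-byte checksum followed by
the payload), the use of CRC-32 as the checksum, and the names
make_page and verify_blocks are all hypothetical. Only the 8192-byte
block size matches PostgreSQL's default.

```python
import zlib

BLOCK_SIZE = 8192      # PostgreSQL's default page size (BLCKSZ)
CHECKSUM_BYTES = 4     # assumption: toy layout, not the real page header


def make_page(payload: bytes) -> bytes:
    """Build a toy page: 4-byte CRC-32 followed by the zero-padded payload."""
    body = payload.ljust(BLOCK_SIZE - CHECKSUM_BYTES, b"\0")
    return zlib.crc32(body).to_bytes(CHECKSUM_BYTES, "little") + body


def verify_blocks(data: bytes):
    """Return (verified, failed, skipped) block counts for a file image."""
    verified = failed = skipped = 0
    for offset in range(0, len(data), BLOCK_SIZE):
        block = data[offset:offset + BLOCK_SIZE]
        if len(block) < BLOCK_SIZE:
            # Partial read: the block may be mid-write, so skip it
            # rather than report it as a checksum failure.
            skipped += 1
            continue
        stored = int.from_bytes(block[:CHECKSUM_BYTES], "little")
        if stored == zlib.crc32(block[CHECKSUM_BYTES:]):
            verified += 1
        else:
            failed += 1
    return verified, failed, skipped


if __name__ == "__main__":
    good = make_page(b"hello")
    corrupt = bytearray(make_page(b"world"))
    corrupt[100] ^= 0xFF               # flip one payload byte
    image = good + bytes(corrupt) + good[:1000]   # last block is short
    print(verify_blocks(image))
```

In this sketch only the corrupted full-size block counts as a failure;
the truncated trailing block is counted separately, so a caller can
decide whether to stay silent about it or mention it at a lower
severity.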


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

