Re: Online verification of checksums - Mailing list pgsql-hackers

From Michael Banck
Subject Re: Online verification of checksums
Date
Msg-id dfeee5471ea96cf30f720177b12ec9053a954598.camel@credativ.de
Whole thread Raw
In response to Re: Online verification of checksums  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Online verification of checksums
List pgsql-hackers
Hi,

Am Montag, den 06.04.2020, 16:45 -0400 schrieb Tom Lane:
> I wrote:
> > Another thing that's bothering me is that the patch compares page LSN
> > against GetInsertRecPtr(); but that function says
> > ...
> > I'm not convinced that an approximation is good enough here.  It seems
> > like a page that's just now been updated could have an LSN beyond the
> > current XLOG page start, potentially leading to a false checksum
> > complaint.  Maybe we could address that by adding one xlog page to
> > the GetInsertRecPtr result?  Kind of a hack, but ...

I was about to write that it sounds like a pragmatic solution to me,
but...

> Actually, after thinking about that a bit more: why is there an LSN-based
> special condition at all?  It seems like it'd be far more useful to
> checksum everything, and on failure try to re-read and re-verify the page
> once or twice, so as to handle the corner case where we examine a page
> that's in process of being overwritten.

Andres outlined something about a year ago which on re-reading sounds
similar to what you suggest above in     
20190326170820.6sylklg7eh6uhabd@alap3.anarazel.de but never posted a
full patch. He seems to have had a few additional checks from PageIsVerified() in mind, though.

The original check against the checkpoint LSN wasn't suggested by me;
I've submitted this patch with the InsertRecPtr as an upper bound as a
*(presumably) minimal-invasive patch which could be back-patched (when
nothing came of the above thread for a while), but the issue seems to be
quite a bit nuanced.

Probably we need to take a step back; the question is whether something
like what Andres suggested should/could be coded up for v13 still
(before the feature freeze) and if so, by whom (I won't have the time),
or whether it would still qualify as a back-patchable bug-fix and/or
whether your suggestion above would.


Michael
    
-- 
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax:  +49 2166 9901-100
Email: michael.banck@credativ.de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer

Unser Umgang mit personenbezogenen Daten unterliegt
folgenden Bestimmungen: https://www.credativ.de/datenschutz




pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: SyncRepLock acquired exclusively in default configuration
Next
From: James Coleman
Date:
Subject: Re: [PATCH] Incremental sort (was: PoC: Partial sort)