Re: Online verification of checksums - Mailing list pgsql-hackers

From Michael Banck
Subject Re: Online verification of checksums
Date
Msg-id 1552940142.9697.40.camel@credativ.de
Whole thread Raw
In response to Re: Online verification of checksums  (Stephen Frost <sfrost@snowman.net>)
Responses Re: Online verification of checksums
Re: Online verification of checksums
List pgsql-hackers
Hi,

Am Montag, den 18.03.2019, 16:11 +0800 schrieb Stephen Frost:
> On Mon, Mar 18, 2019 at 15:52 Michael Banck <michael.banck@credativ.de> wrote:
> > Am Montag, den 18.03.2019, 03:34 -0400 schrieb Stephen Frost:
> > > Thanks for that.  Reading through the code though, I don't entirely
> > > understand why we're making things complicated for ourselves by trying
> > > to seek and re-read the entire block, specifically this:
> > 
> > [...]
> > 
> > > I would think that we could just do:
> > > 
> > >   insert_location = 0;
> > >   r = read(BLCKSIZE - insert_location);
> > >   if (r < 0) error();
> > >   if (r == 0) EOF detected, move to next
> > >   if (r < (BLCKSIZE - insert_location)) {
> > >     insert_location += r;
> > >     continue;
> > >   }
> > > 
> > > At this point, we should have a full block, do our checks...
> > 
> > Well, we need to read() into some buffer which you have ommitted.
> 
> Surely there’s a buffer the read in the existing code is passing in,
> you just need to offset by the current pointer, sorry for not being
> clear.
> 
> In other words the read would look more like:
> 
> read(fd,buf + insert_ptr, BUFSZ - insert_ptr)
> 
> And then you have to reset insert_ptr once you have a full block.

Ok, thanks for clearing that up.

I've tried to do that now in the attached, does that suit you?

> Yes, the question was more like this: have you ever seen a read return
> a partial result when you know you’re in the middle somewhere of an
> existing file and the length of the file hasn’t been changed by
> something else..?

I don't think I've seen that, but that wouldn't turn up in regular
testing anyway I guess but only in pathological cases?  I guess we are
probably dealing with this in the current version of the patch, but I
can't say for certain as it sounds pretty difficult to test.

I have also added a paragraph to the documentation about possilby
skipping new or recently updated pages:

+   If the cluster is online, pages that have been (re-)written since the last
+   checkpoint will not count as checksum failures if they cannot be read or
+   verified correctly.

Wording improvements welcome.


Michael


-- 
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax:  +49 2166 9901-100
Email: michael.banck@credativ.de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer

Unser Umgang mit personenbezogenen Daten unterliegt
folgenden Bestimmungen: https://www.credativ.de/datenschutz
Attachment

pgsql-hackers by date:

Previous
From: Joe Conway
Date:
Subject: Re: Row Level Security − leakproof-ness and performance implications
Next
From: Andres Freund
Date:
Subject: Re: outdated reference to tuple header OIDs