Re: Online verification of checksums - Mailing list pgsql-hackers

From Stephen Frost
Subject Re: Online verification of checksums
Msg-id 20180926153031.GB4184@tamriel.snowman.net
In response to Re: Online verification of checksums  (Fabien COELHO <coelho@cri.ensmp.fr>)
List pgsql-hackers
Greetings,

* Fabien COELHO (coelho@cri.ensmp.fr) wrote:
> >Note that a short read isn't an error and falls under the 'new' blocks
> >discussion above.
>
> I'm really unsure that a short read should really be coldly skipped:
>
> If the check is offline, then one file is in a very bad state, this is
> really a panic situation.

Why?  Are we sure that's really something which can't ever happen, even
if the database was shut down with 'immediate'?  I don't think it can, but
that's something to consider.  In any case, my comments were
specifically about the 'online' case.

> If the check is online, given that both postgres and the verify command
> interact with the same OS (?) and at the pg page level, I'm not sure in
> which situation there could be a partial block, because pg would only send
> full pages to the OS.

The OS doesn't operate at the same level that PG does: a single write in
PG could get blocked and scheduled off after having copied only half of
the 8k that PG sends.  This isn't really debatable; we've seen it happen
with everything operating perfectly correctly.  It just happens that
you were able to get a read() in at the same time a write() was
happening, when only part of the page had been updated.
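
To make the two cases concrete, here is a rough sketch in Python (not
the patch under discussion) of how an online checker might tell them
apart.  The names read_block and check_block and the is_valid callback
are made up for illustration; is_valid stands in for the real page
checksum test and is not PostgreSQL's algorithm.  Re-reading after a
failed check is an assumption about one reasonable way to cope with a
torn read, and the 8192-byte block size is just the default.

```python
import os

BLOCK_SIZE = 8192  # default PostgreSQL page size; an assumption here


def read_block(fd, blkno, block_size=BLOCK_SIZE):
    """Read one block with pread(); may return fewer than block_size bytes."""
    return os.pread(fd, block_size, blkno * block_size)


def check_block(fd, blkno, is_valid, retries=1, block_size=BLOCK_SIZE):
    """Classify one block as 'ok', 'short', or 'failed'.

    is_valid is a hypothetical callback that returns True when the page
    passes whatever checksum test the caller wants to apply.
    """
    for _ in range(retries + 1):
        page = read_block(fd, blkno, block_size)
        if len(page) < block_size:
            # Short read: treated like a 'new' block, not as an error.
            return "short"
        if is_valid(page):
            return "ok"
        # The failure may be a torn read that raced a concurrent write(),
        # so loop and read the block again before giving up.
    return "failed"
```

A caller would report only the 'failed' blocks and leave the 'short'
ones to the handling of new blocks.  A single retry is not a guarantee
either; it only makes a false alarm from a concurrent write much less
likely.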

Thanks!

Stephen
