Re: Online checksums verification in the backend - Mailing list pgsql-hackers

From Michael Paquier
Subject Re: Online checksums verification in the backend
Date
Msg-id 20201103094612.GB2298@paquier.xyz
Whole thread Raw
In response to Re: Online checksums verification in the backend  (Andres Freund <andres@anarazel.de>)
Responses Re: Online checksums verification in the backend  (Michael Paquier <michael@paquier.xyz>)
List pgsql-hackers
On Mon, Nov 02, 2020 at 11:34:57AM -0800, Andres Freund wrote:
> On 2020-11-02 12:35:30 -0500, Robert Haas wrote:
>> On Thu, Oct 29, 2020 at 2:17 PM Andres Freund <andres@anarazel.de> wrote:
>>> I think this needs to be quickly reworked or reverted.
>
> I think it's fairly clear by now that revert is appropriate for now.

Yep, that's clear.  I'll deal with that tomorrow.  That's more than a
simple rework.

>> I don't like this patch, either. In addition to what Andres mentioned,
>> CheckBuffer() is a completely-special case mechanism which can't be
>> reused by anything else. In particular, the amcheck for heap stuff
>> which I recently committed (866e24d47db1743dfcff5bd595b57e3a143f2cb1)
>> would really like a way to examine a buffer without risking an error
>> if PageIsVerified() should happen to fail, but this patch is of
>> absolutely no use in getting there, because CheckBuffer() doesn't give
>> the caller any way to access the contents of the buffer. It can only
>> do the checks that it knows how to do, and that's it. That doesn't
>> seem like a good design.
>
> Wouldn't this be better served by having a ReadBufferExtended() flag,
> preventing erroring out and zeroing the buffer? I'm not sure that
> handling both the case where the buffer contents need to be valid and
> the one where it doesn't will make for a good API.

If you grep for ReadBuffer_common() is some of the emails I sent..  I
was rather interested in something like that.

>> I don't like the fact that CheckBuffer() silently skips dirty buffers,
>> either. The comment should really say that it checks the state of a
>> buffer without loading it into shared buffers, except sometimes it
>> doesn't actually check it.
>
> Yea, I don't see a good reason for that either. There's reasons for
> dirty buffers that aren't WAL logged - so if the on-disk page is broken,
> a standby taken outside pg_basebackup would possibly still end up with a
> corrupt on-disk page. Similar with a crash restart.

Er, if you don't skip dirty buffers, wouldn't you actually report some
pages as broken if attempting to run those in a standby who may have
some torn pages from a previous base backup?  You could still run into
problems post-promotion, until the first checkpoint post-recovery
happens, no?

>> If the buffer is in shared buffers, we could take a share-lock
>> on the buffer and copy the contents of the page as it exists on disk,
>> and then still check it.
>
> Don't think we need a share lock. That still allows the buffer to be
> written out (and thus a torn read). What we need is to set
> BM_IO_IN_PROGRESS on the buffer in question - only one backend can set
> that. And then unset that again, without unsetting
> BM_DIRTY/BM_JUST_DIRTIED.

If that can work, we could make use of some of that for base backups
for a single retry of a page that initially failed.
--
Michael

Attachment

pgsql-hackers by date:

Previous
From: Magnus Hagander
Date:
Subject: Re: -O switch
Next
From: Peter Eisentraut
Date:
Subject: Re: [PATCH] remove deprecated v8.2 containment operators