Re: Online verification of checksums - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Online verification of checksums
Date
Msg-id 20190307180035.2sc4byi2mivjayy6@alap3.anarazel.de
Whole thread Raw
In response to Re: Online verification of checksums  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Responses Re: Online verification of checksums
List pgsql-hackers
Hi,

On 2019-03-07 12:53:30 +0100, Tomas Vondra wrote:
> On 3/6/19 6:42 PM, Andres Freund wrote:
> >
> > ...
> >
> > To me the right way seems to be to IO lock the page via PG after such a
> > failure, and then retry. Which should be relatively easily doable for
> > the basebackup case, but obviously harder for the pg_verify_checksums
> > case.
> > 
> 
> Actually, what do you mean by "IO lock the page"? Just waiting for the
> current IO to complete (essentially BM_IO_IN_PROGRESS)? Or essentially
> acquiring a lock and holding it for the duration of the check?

The latter. And with IO lock I meant BufferDescriptorGetIOLock(), in
contrast to a buffer's content lock. That way we wouldn't block
modifications to the in-memory page.


> The former does not really help, because there might be another I/O request
> initiated right after, interfering with the retry.
> 
> The latter might work, assuming the check is fast (which it probably is). I
> wonder if this might cause issues due to loading possibly corrupted data
> (with invalid checksums) into shared buffers.

Oh, I was basically thinking that we'd just reread from disk outside of
postgres in that case, while preventing postgres related IO by holding
the IO lock.

But:

> But then again, we could just
> hack a special version of ReadBuffer_common() which would just

> (a) check if a page is in shared buffers, and if it is then consider the
> checksum correct (because in memory it may be stale, and it was read
> successfully so it was OK at that moment)
> 
> (b) if it's not in shared buffers already, try reading it and verify the
> checksum, and then just evict it right away (not to spoil sb)

This'd also make sense and make the whole process more efficient. OTOH,
it might actually be worthwhile to check the on-disk page even if
there's in-memory state. Unless IO is in progress the on-disk page
always should be valid.

Greetings,

Andres Freund


pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: Pluggable Storage - Andres's take
Next
From: Robert Haas
Date:
Subject: Re: [Proposal] Table-level Transparent Data Encryption (TDE) and KeyManagement Service (KMS)