Re: Online verification of checksums - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: Online verification of checksums
Date
Msg-id 29a0ef4d-7d5d-fe5c-253b-3f7f53df0859@2ndquadrant.com
In response to Re: Online verification of checksums  (Andres Freund <andres@anarazel.de>)
Responses Re: Online verification of checksums
List pgsql-hackers
On 3/6/19 6:42 PM, Andres Freund wrote:
>
> ...
>
> To me the right way seems to be to IO lock the page via PG after such a
> failure, and then retry. Which should be relatively easily doable for
> the basebackup case, but obviously harder for the pg_verify_checksums
> case.
> 

Actually, what do you mean by "IO lock the page"? Just waiting for the 
current I/O to complete (essentially BM_IO_IN_PROGRESS)? Or acquiring a 
lock and holding it for the duration of the check?

The former does not really help, because there might be another I/O 
request initiated right after, interfering with the retry.
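
To illustrate the concern with the first option, here is a toy model in 
Python (not PostgreSQL code). Everything in it is an assumption made for 
illustration: ToyDisk, the two-step write, and the byte-sum checksum are 
hypothetical stand-ins, not the real page layout or checksum algorithm.

```python
PAGE_SIZE = 8192
HALF = PAGE_SIZE // 2
CSUM_LEN = 4  # toy layout: checksum stored in the first 4 bytes of the page


def toy_checksum(body: bytes) -> int:
    # Stand-in checksum (plain byte sum), NOT the PostgreSQL algorithm.
    return sum(body)


def make_page(fill: int) -> bytes:
    body = bytes([fill]) * (PAGE_SIZE - CSUM_LEN)
    return toy_checksum(body).to_bytes(CSUM_LEN, "little") + body


def checksum_ok(page: bytes) -> bool:
    stored = int.from_bytes(page[:CSUM_LEN], "little")
    return stored == toy_checksum(page[CSUM_LEN:])


class ToyDisk:
    """Hypothetical disk whose page writes land in two separate halves."""

    def __init__(self, page: bytes):
        self.data = bytearray(page)
        self.pending = None  # second half of a write still in flight

    def start_write(self, page: bytes) -> None:
        self.data[:HALF] = page[:HALF]
        self.pending = page[HALF:]

    def finish_write(self) -> None:
        if self.pending is not None:
            self.data[HALF:] = self.pending
            self.pending = None

    def read(self) -> bytes:
        return bytes(self.data)


def demo():
    disk = ToyDisk(make_page(1))

    # The verifier reads while a write is half done: torn page.
    disk.start_write(make_page(2))
    first_try = checksum_ok(disk.read())

    # Option 1: wait for that I/O to finish, then retry. Nothing stops a
    # new write from starting before the retry reads the page.
    disk.finish_write()
    disk.start_write(make_page(3))
    retry_after_wait = checksum_ok(disk.read())

    # Option 2: hold a lock for the whole check, so no write can start
    # between the end of the previous I/O and the read.
    disk.finish_write()
    retry_under_lock = checksum_ok(disk.read())

    return first_try, retry_after_wait, retry_under_lock
```

In this model demo() returns (False, False, True): waiting alone still 
yields a torn read, while reading with writers excluded does not.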

The latter might work, assuming the check is fast (which it probably 
is). I wonder if this might cause issues due to loading possibly 
corrupted data (with invalid checksums) into shared buffers. But then 
again, we could just hack a special version of ReadBuffer_common() which 
would just

(a) check if a page is in shared buffers, and if it is then consider the 
checksum correct (the on-disk copy may be stale compared to the one in 
memory, and the page was read successfully, so it was OK at that moment)

(b) if it's not in shared buffers already, try reading it and verify the 
checksum, and then just evict it right away (so as not to pollute shared 
buffers)
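
As a rough sketch of the control flow in (a) and (b), again in Python 
rather than C: verify_block, shared_buffers, read_page and checksum_ok are 
hypothetical names, the dict merely stands in for the buffer pool, and the 
locking that would have to surround these steps is omitted.

```python
from typing import Callable, Dict


def verify_block(
    block_no: int,
    shared_buffers: Dict[int, bytes],
    read_page: Callable[[int], bytes],
    checksum_ok: Callable[[bytes], bool],
) -> bool:
    # (a) Already cached: the page passed verification when it was read
    # in, so treat it as valid without looking at the on-disk copy.
    if block_no in shared_buffers:
        return True

    # (b) Not cached: read it, verify it, and evict it right away so the
    # verification pass does not displace useful pages.
    shared_buffers[block_no] = read_page(block_no)
    try:
        return checksum_ok(shared_buffers[block_no])
    finally:
        del shared_buffers[block_no]
```

The point of the try/finally is only that the page leaves the cache whether 
or not the checksum matches, which is the "evict it right away" part.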

Or did you have something else in mind?


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

