Re: Online verification of checksums - Mailing list pgsql-hackers

From David Steele
Subject Re: Online verification of checksums
Msg-id 47e26e3d-989f-b034-f2fc-926b67cc22bf@pgmasters.net
In response to Re: Online verification of checksums  (Stephen Frost <sfrost@snowman.net>)
Responses Re: Online verification of checksums  (Michael Banck <michael.banck@credativ.de>)
List pgsql-hackers
On 9/18/18 11:45 AM, Stephen Frost wrote:
> * Michael Banck (michael.banck@credativ.de) wrote:

>> I have now added a retry for this as well, without a pg_sleep().
>
>> This catches around 80% of the half-reads, but a few slip through. At
>> that point we bail out with exit(1), and the user can try again, which I
>> think is fine? 
>
> No, this is perfectly normal behavior, as is having completely blank
> pages, now that I think about it.  If we get a short read then I'd say
> we simply check that we got an EOF and, in that case, we just move on.
>
>> Alternatively, we could just skip to the next file and not count it
>> as a checksum failure.
>
> No, I wouldn't count it as a checksum failure.  We could possibly count
> it towards the skipped pages, though I'm even on the fence about that.

+1 for it not being a failure.  Personally I'd count it as a skipped
page, since we know the page exists but it can't be verified.
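
To make the accounting concrete, here is a rough sketch of what I have in
mind. It is Python rather than the actual C code, and the names (Stats,
scan_file, the verify callback) are made up for illustration; verify is a
stand-in for the real checksum check. It assumes a POSIX system for
os.pread and the default 8192-byte block size.

```python
import os
from dataclasses import dataclass

BLCKSZ = 8192  # assumed: PostgreSQL's default block size


@dataclass
class Stats:
    verified: int = 0
    failed: int = 0
    skipped: int = 0


def scan_file(path, verify, stats, retries=1):
    """Scan one file block by block and classify each read.

    `verify` is a hypothetical callable(page_bytes, blkno) -> bool that
    stands in for the real checksum verification.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        blkno = 0
        while True:
            data = os.pread(fd, BLCKSZ, blkno * BLCKSZ)
            # Retry a short (but non-empty) read a bounded number of times.
            attempts = 0
            while 0 < len(data) < BLCKSZ and attempts < retries:
                data = os.pread(fd, BLCKSZ, blkno * BLCKSZ)
                attempts += 1
            if len(data) == 0:
                break  # clean EOF, nothing left to check
            if len(data) < BLCKSZ:
                # The page exists but cannot be verified: count it as
                # skipped, not as a checksum failure, and move on.
                stats.skipped += 1
                break
            if verify(data, blkno):
                stats.verified += 1
            else:
                stats.failed += 1
            blkno += 1
    finally:
        os.close(fd)
    return stats
```

The point is only that a short read ends the scan of that file and bumps
the skipped counter, while the failure counter is reserved for full pages
whose checksum does not match.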

The other option is to wait for the page to stabilize, which doesn't
seem like it would take very long in most cases -- unless you are doing
this test from another host with shared storage.  Then I would expect to
see all kinds of interesting torn pages after the last checkpoint.
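
One way to approximate "wait for the page to stabilize" is to re-read the
block until two consecutive reads return the same full-size contents, with
a bounded number of attempts. The sketch below is Python and purely
illustrative; read_stable_block, max_attempts and delay are hypothetical
names and values, not anything in the patch. Two identical reads do not
prove the page is not torn, so the result would still need the checksum
check afterwards.

```python
import os
import time

BLCKSZ = 8192  # assumed: PostgreSQL's default block size


def read_stable_block(fd, blkno, max_attempts=5, delay=0.01):
    """Return a full block that was identical on two consecutive reads.

    Returns None if the block never reached full size or kept changing
    within `max_attempts` reads; the caller can then count it as skipped.
    """
    previous = None
    for _ in range(max_attempts):
        data = os.pread(fd, BLCKSZ, blkno * BLCKSZ)
        if len(data) == BLCKSZ and data == previous:
            return data
        previous = data
        time.sleep(delay)  # give a concurrent writer time to finish
    return None
```

A caller would treat None the same way as a short read: skip the page
rather than report a failure.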

Regards,
--
-David
david@pgmasters.net

