On 26/07/2018 13:59, Michael Banck wrote:
> I've now forward-ported this change to pg_verify_checksums, in order to
> make this application useful for online clusters, see attached patch.
Why not provide this functionality as a server function or command? Then you can access blocks with proper locks and don't have to do this rather ad hoc retry logic on concurrent access.
I think it would make sense to provide this functionality in the "checksum worker" infrastructure suggested in the online checksum enabling patch. But I think being able to run it from the outside would also be useful, particularly when it's this simple.
But why do we need a sleep in it? AFAICT this is basically the same code that we have in basebackup.c, and that one does not need the sleep. Certainly 500ms would be very long, since we're only protecting against a torn page, but I think the comment is wrong and we're actually sleeping 0.5ms?
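
To make the "re-read, no sleep" idea concrete, here is a minimal standalone sketch. It is not the code from the patch or from basebackup.c. toy_checksum() is a deliberately simple stand-in for the real page checksum, and fake_source, fake_read() and the demo_* helpers are hypothetical names that simulate a reader seeing a torn page on the first read and the complete page on the second. On the units question: pg_usleep() takes microseconds, so an argument of 500 would be 0.5ms.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE 8192

/* Toy additive checksum over bytes 2..end; NOT PostgreSQL's page checksum. */
static uint16_t toy_checksum(const unsigned char *page)
{
    uint32_t sum = 0;
    for (size_t i = 2; i < BLOCK_SIZE; i++)
        sum += page[i];
    return (uint16_t) (sum & 0xFFFF);
}

/* Fill a page with one byte value and store its checksum in bytes 0..1. */
static void toy_fill_page(unsigned char *page, unsigned char value)
{
    uint16_t c;
    memset(page, value, BLOCK_SIZE);
    c = toy_checksum(page);
    memcpy(page, &c, sizeof c);
}

static int toy_page_ok(const unsigned char *page)
{
    uint16_t stored;
    memcpy(&stored, page, sizeof stored);
    return stored == toy_checksum(page);
}

/* Hypothetical block source: returns 'first' once, then 'later' forever. */
typedef struct
{
    int calls;
    const unsigned char *first;
    const unsigned char *later;
} fake_source;

static void fake_read(fake_source *src, unsigned char *buf)
{
    memcpy(buf, src->calls == 0 ? src->first : src->later, BLOCK_SIZE);
    src->calls++;
}

/* Verify a block; on mismatch re-read once and verify again, with no sleep. */
static int verify_with_reread(fake_source *src)
{
    unsigned char buf[BLOCK_SIZE];

    fake_read(src, buf);
    if (toy_page_ok(buf))
        return 1;
    fake_read(src, buf);        /* a concurrent write may have completed */
    return toy_page_ok(buf);
}

/* First read is torn (new first half, old second half); second is complete. */
int demo_torn_then_good(void)
{
    static unsigned char oldp[BLOCK_SIZE], newp[BLOCK_SIZE], torn[BLOCK_SIZE];
    fake_source src = {0, torn, newp};

    toy_fill_page(oldp, 0xAA);
    toy_fill_page(newp, 0xBB);
    memcpy(torn, newp, BLOCK_SIZE / 2);
    memcpy(torn + BLOCK_SIZE / 2, oldp + BLOCK_SIZE / 2, BLOCK_SIZE / 2);
    return verify_with_reread(&src);
}

/* Every read returns the same damaged page, so the re-read cannot help. */
int demo_always_bad(void)
{
    static unsigned char bad[BLOCK_SIZE];
    fake_source src = {0, bad, bad};

    toy_fill_page(bad, 0xBB);
    bad[100] ^= 1;              /* flip one bit after the checksum was set */
    return verify_with_reread(&src);
}
```

In this simulation demo_torn_then_good() passes on the second read and demo_always_bad() still reports a failure, which is the distinction the retry is supposed to draw. Whether an immediate re-read is always enough against a live server is a separate question that the sketch does not answer.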