Re: new heapcheck contrib module - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: new heapcheck contrib module
Date
Msg-id CAH2-Wzn38UhrmZomiF_FroR=WUYy7hNx1grmrPRggPnTnxzVRA@mail.gmail.com
In response to Re: new heapcheck contrib module  (Mark Dilger <mark.dilger@enterprisedb.com>)
List pgsql-hackers
On Mon, Oct 5, 2020 at 5:24 PM Mark Dilger <mark.dilger@enterprisedb.com> wrote:
> > I don't see how verify_heapam will avoid raising an error during basic
> > validation from PageIsVerified(), which will violate the guarantee
> > about not throwing errors. I don't see that as a problem myself, but
> > presumably you will.
>
> My concern is not so much that verify_heapam will stop with an error, but
> rather that it might trigger a panic that stops all backends.  Stopping
> with an error merely because it hits corruption is not ideal, as I would
> rather it completed the scan and reported all corruptions found, but
> that's minor compared to the damage done if verify_heapam creates downtime
> in a production environment offering high availability guarantees.  That
> statement might seem nuts, given that the corrupt table itself would be
> causing downtime, but that analysis depends on assumptions about table
> access patterns, and there is no a priori reason to think that corrupt
> pages are necessarily ever being accessed, or accessed in a way that
> causes crashes (rather than merely wrong results) outside verify_heapam
> scanning the whole table.

That seems reasonable to me. I think that it makes sense to never take
down the server in a non-debug build with verify_heapam. That's not
what I took away from your previous remarks on the issue, but perhaps
it doesn't matter now.

--
Peter Geoghegan


