Re: new heapcheck contrib module - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: new heapcheck contrib module
Date
Msg-id CAH2-WznzMw_Tzy6oS_r_3Xu4wbAUYQOuDbZ_s=Ap8UNMP4TBcQ@mail.gmail.com
In response to Re: new heapcheck contrib module  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: new heapcheck contrib module  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Mon, Aug 3, 2020 at 8:09 AM Robert Haas <robertmhaas@gmail.com> wrote:
> I agree that there's a serious design problem with Mark's patch in
> this regard, but I disagree that the effort is pointless on its own
> terms. You're basically postulating that users don't care how corrupt
> their index is: whether there's one problem or one million problems,
> it's all the same. If the user presents an index with one million
> problems and we tell them about one of them, we've done our job and
> can go home.

It's not so much that I think that users won't care about whether any
given index is a bit corrupt or very corrupt. It's more like I don't
think that it's worth the eye-watering complexity, especially without
a real concrete goal in mind. "Counting all the errors, not just the
first" sounds like a tractable goal for the heap/table structure, but
it's just not like that with indexes. If you really wanted to do this,
you'd have to describe a practical scenario under which it made sense
to soldier on, where we'd definitely be able to count the number of
problems in a meaningful way, without much risk of either massively
overcounting or undercounting inconsistencies.

Consider how the search in verify_nbtree.c actually works at a high
level. If you thoroughly corrupted one B-Tree leaf page (let's say you
replaced it with an all-zero page image), all pages to the right of
the page would be fundamentally inaccessible to the left-to-right
level search that is coordinated within
bt_check_level_from_leftmost(). And yet, most real index scans can
still be expected to work. How do you know to skip past that one
corrupt leaf page (by going back to the parent to get the next sibling
leaf page) during index verification? That's what it would take to do
this in the general case, I guess. More fundamentally, I wonder how
many inconsistencies one should imagine that this index has, before we
even get into talking about the implementation.
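
To make the traversal problem concrete, here is a rough toy model in
Python. It is not PostgreSQL code and does not reflect how
verify_nbtree.c is written; the block numbers, the ZEROED marker, and
both function names are made up for illustration. It only shows why a
walk that relies on right-sibling links stops dead at a zeroed page,
while a walk driven by the parent's downlinks could step over it.

```python
# Toy model only: hypothetical names and layout, not PostgreSQL internals.

# Stand-in for an all-zero page image: no usable right-sibling link.
ZEROED = "zeroed"


def walk_by_right_links(leaf_level, leftmost):
    """Follow right-sibling links only; stop at the first unreadable page.

    Returns (pages visited, page where the walk got stuck or None).
    """
    visited = []
    current = leftmost
    while current is not None:
        link = leaf_level[current]
        if link == ZEROED:
            # No right link to read, so everything further right is lost.
            return visited, current
        visited.append(current)
        current = link
    return visited, None


def walk_by_parent_downlinks(leaf_level, parent_downlinks):
    """Visit children in the parent's order, skipping unreadable pages.

    Returns (pages visited, pages found unreadable).
    """
    visited = []
    corrupt = []
    for child in parent_downlinks:
        if leaf_level[child] == ZEROED:
            corrupt.append(child)
        else:
            visited.append(child)
    return visited, corrupt


# Five leaf pages, block number -> right sibling (None = rightmost).
# Page 3 has been replaced with an all-zero image.
leaf_level = {1: 2, 2: 3, 3: ZEROED, 4: 5, 5: None}
parent_downlinks = [1, 2, 3, 4, 5]

sibling_visited, stuck_at = walk_by_right_links(leaf_level, leftmost=1)
parent_visited, unreadable = walk_by_parent_downlinks(leaf_level, parent_downlinks)

print(sibling_visited, stuck_at)
print(parent_visited, unreadable)
```

Even in this toy, the counting question has no obvious answer: the
sibling walk could report one problem (page 3) or three (page 3 plus
the two pages it never reached), and the parent-driven walk only helps
if the parent itself can be trusted.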

-- 
Peter Geoghegan
