Re: new heapcheck contrib module - Mailing list pgsql-hackers

From Mark Dilger
Subject Re: new heapcheck contrib module
Date
Msg-id C235E2F2-3A6A-4141-A0BF-4B5406503087@enterprisedb.com
Whole thread Raw
In response to Re: new heapcheck contrib module  (Peter Geoghegan <pg@bowt.ie>)
List pgsql-hackers

> On Nov 19, 2020, at 11:47 AM, Peter Geoghegan <pg@bowt.ie> wrote:
>
> On Thu, Nov 19, 2020 at 9:06 AM Robert Haas <robertmhaas@gmail.com> wrote:
>> I'm also not sure if these descriptions are clear enough, but it may
>> also be hard to do a good job in a brief space. Still, comparing this
>> to the documentation of heapallindexed makes me rather nervous. This
>> is only trying to verify that the index contains all the tuples in the
>> heap, not that the values in the heap and index tuples actually match.
>
> That's a good point. As things stand, heapallindexed verification does
> not notice when there are extra index tuples in the index that are in
> some way inconsistent with the heap. Hopefully this isn't too much of
> a problem in practice because the presence of extra spurious tuples
> gets detected by the index structure verification process. But in
> general that might not happen.
>
> Ideally heapallindex verification would verify 1:1 correspondence. It
> doesn't do that right now, but it could.
>
> This could work by having two bloom filters -- one for the heap,
> another for the index. The implementation would look for the absence
> of index tuples that should be in the index initially, just like
> today. But at the end it would modify the index bloom filter by &= it
> with the complement of the heap bloom filter. If any bits are left set
> in the index bloom filter, we go back through the index once more and
> locate index tuples that have at least some matching bits in the index
> bloom filter (we cannot expect all of the bits from each of the hash
> functions used by the bloom filter to still be matches).
>
> From here we can do some kind of lookup for maybe-not-matching index
> tuples that we locate. Make sure that they point to an LP_DEAD line
> item in the heap or something. Make sure that they have the same
> values as the heap tuple if they're still retrievable (i.e. if we
> haven't pruned the heap tuple away already).

This approach sounds very good to me, but beyond the scope of what I'm planning for this release cycle.

>> This to me seems too conservative. The result is that by default we
>> check only tables, not indexes. I don't think that's going to be what
>> users want. I don't know whether they want the heapallindexed or
>> rootdescend behaviors for index checks, but I think they want their
>> indexes checked. Happy to hear opinions from actual users on what they
>> want; this is just me guessing that you've guessed wrong. :-)
>
> My thoughts on these two options:
>
> * I don't think that users will ever want rootdescend verification.
>
> That option exists now because I wanted to have something that relied
> on the uniqueness property of B-Tree indexes following the Postgres 12
> work. I didn't add retail index tuple deletion, so it seemed like a
> good idea to have something that makes the same assumptions that it
> would have to make. To validate the design.
>
> Another factor is that Alexander Korotkov made the basic
> bt_index_parent_check() tests a lot better for Postgres 13. This
> undermined the practical argument for using rootdescend verification.

The latest version of the patch has rootdescend off by default, but a switch to turn it on.  The documentation for that
switchin doc/src/sgml/pgamcheck.sgml summarizes your comments: 

+       This form of verification was originally written to help in the
+       development of btree index features.  It may be of limited or even of no
+       use in helping detect the kinds of corruption that occur in practice.
+       In any event, it is known to be a rather expensive check to perform.

For my own self, I don't care if rootdescend is an option in pg_amcheck.  You and Robert expressed somewhat different
opinions,and I tried to split the difference.  I'm happy to go a different direction if that's what the consensus is. 

> Finally, note that bt_index_parent_check() was always supposed to be
> something that was to be used only when you already knew that you had
> big problems, and wanted absolutely thorough verification without
> regard for the costs. This isn't the common case at all. It would be
> reasonable to not expose anything from bt_index_parent_check() at all,
> or to give it much less prominence. Not really sure of what the right
> balance is here myself, so I'm not insisting on anything. Just telling
> you what I know about it.

This still needs work.  Currently, there is a switch to turn off index checking, with the checks on by default.  But
thereis no switch controlling which kind of check is performed (bt_index_check vs. bt_index_parent_check).  Making
mattersmore complicated, selecting both rootdescend and bt_index_check wouldn't make sense, as there is no rootdescend
optionon that function.  So users would need multiple flags to turn on various options, with some flag combinations
drawingan error about the flags not being mutually compatible.  That's doable, but people may not like that interface. 

> * heapallindexed is kind of expensive, but valuable. But the extra
> check is probably less likely to help on the second or subsequent
> index on a table.

There is a switch for enabling this.  It is off by default.

> It might be worth considering an option that only uses it with only
> one index: Preferably the primary key index, failing that some unique
> index, and failing that some other index.

It might make sense for somebody to submit this for a later release.  I don't have any plans to work on this during
thisrelease cycle. 

>> I'm not very convinced by the decision to
>> override the user's decision about heapallindexed either.
>
> I strongly agree.

I have removed the override.

>
>> Maybe I lack
>> imagination, but that seems pretty arbitrary. Suppose there's a giant
>> index which is missing entries for 5 million heap tuples and also
>> there's 1 entry in the table which has an xmin that is less than the
>> pg_clas.relfrozenxid value by 1. You are proposing that because I have
>> the latter problem I don't want you to check for the former one. But
>> I, John Q. Smartuser, do not want you to second-guess what I told you
>> on the command line that I wanted. :-)
>
> Even if your user is just average, they still have one major advantage
> over the architects of pg_amcheck: actual knowledge of the problem in
> front of them.

There is a switch for skipping index checks on corrupt tables.  By default, the indexes will be checked.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company






pgsql-hackers by date:

Previous
From: Simon Riggs
Date:
Subject: Re: WIP: System Versioned Temporal Table
Next
From: Matthias van de Meent
Date:
Subject: Re: [PATCH] Simple progress reporting for COPY command