Re: Amcheck verification of GiST and GIN - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: Amcheck verification of GiST and GIN
Date
Msg-id 7849efb9-af39-491e-a931-6ea5d1fe8f23@enterprisedb.com
Whole thread Raw
In response to Re: Amcheck verification of GiST and GIN  ("Andrey M. Borodin" <x4mmm@yandex-team.ru>)
Responses Re: Amcheck verification of GiST and GIN
Re: Amcheck verification of GiST and GIN
List pgsql-hackers
Hi,

On 7/9/24 08:36, Andrey M. Borodin wrote:
> 
> 
>> On 5 Jul 2024, at 17:27, Andrey M. Borodin <x4mmm@yandex-team.ru> wrote:
>>
>> There’s one more problem in pg_amcheck’s GiST verification. We must
>> check that amcheck is 1.5+ and use GiST verification only in that
>> case …
> 
> Done. I’ll set the status to “Needs review”.
> 

I realized amcheck GIN/GiST support would be useful for testing my
patches adding parallel builds for these index types, so I decided to
take a look at this and do an initial review today.

Attached is a patch series with a extra commits to keep the review
comments and patches adjusting the formatting by pgindent (the patch
seems far enough for this).

Let me quickly go through the review comments:

1) Not sure I like 'amcheck.c' very much, I'd probably go with something
like 'verify_common.c' to match naming of the other files. But it's just
nitpicking and I can live with it.

2) amcheck_lock_relation_and_check seems to be the most important
function, yet there's no comment explaining what it does :-(

3) amcheck_lock_relation_and_check still has a TODO to add the correct
name of the AM

4) Do we actually need amcheck_index_mainfork_expected as a separate
function, or could it be a part of index_checkable?

5) The comment for heaptuplespresent says "debug counter" but that does
not really explain what it's for. (I see verify_nbtree has the same
comment, but maybe let's improve that.)

6) I'd suggest moving the GISTSTATE + blocknum fields to the beginning
of GistCheckState, it seems more natural to start with "generic" fields.

7) I'd adjust the gist_check_parent_keys_consistency comment a bit, to
explain what the function does first, and only then explain how.

8) We seem to be copying PageGetItemIdCareful() around, right? And the
copy in _gist.c still references nbtree - I guess that's not right.

9) Why is the GIN function called gin_index_parent_check() and not
simply gin_index_check() as for the other AMs?

10) The debug in gin_check_posting_tree_parent_keys_consistency triggers
assert when running with client_min_messages='debug5', it seems to be
accessing bogus item pointers.

11) Why does it add pg_amcheck support only for GiST and not GIN?


That's all for now. I'll add this to the stress-testing tests of my
index build patches, and if that triggers more issues I'll report those.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attachment

pgsql-hackers by date:

Previous
From: Greg Sabino Mullane
Date:
Subject: Send duration output to separate log files
Next
From: Fujii Masao
Date:
Subject: Re: Add a GUC check hook to ensure summarize_wal cannot be enabled when wal_level is minimal