>> I completely agree that checking uniqueness requires looking at the heap, but I don't agree that every caller of bt_index_check on an index wants that particular check to be performed. There are multiple ways in which an index might be corrupt, and Peter wrote the code to only check some of them by default, with options to expand the checks to other things. This is why heapallindexed is optional. If you don't want to pay the price of checking all entries in the heap against the btree, you don't have to.
>>
>> I've got the idea and revised the patch accordingly. Thanks!
>> Pfa v4 of a patch. I've added an optional argument to allow uniqueness checks for the unique indexes.
>> Also, I added a test variant to make them work on 32-bit systems. Unfortunately, converting the regression test to TAP would be a pain for me. Hope it can be used now as a 2-variant regression test for 32 and 64 bit systems.
>>
>> Thank you for your consideration!
>>
>> --
>> Best regards,
>> Pavel Borisov
>>
>> Postgres Professional: http://postgrespro.com
>> <v4-0001-Make-amcheck-checking-UNIQUE-constraint-for-btree.patch>
>
> Looking over v4, here are my review comments...
Mark and Peter, big thanks for your ideas!
I had little time to work on this feature until recently, but finally, I've elaborated v5 patch (PFA)
It contains the following improvements, most of which are based on your consideration:
- Amcheck tests are reworked into TAP-tests with "break the warranty" by comparison function changes in the opclass instead of pg_index update. Mark, again thanks for the sample!
- Added new --checkunique option into pg_amcheck.
- Added documentation and tests for amcheck and pg_amcheck new functions.
- Results are output at ERROR log level like it is done in the other amcheck tests.
- Rare case of inability to check due to the first entry on a leaf page being both: (1) equal to the last one on the previous page and (2) deleted in the heap, is demoted to DEBUG1 log level. In this, I folowed Peter's consideration that amcheck should do its best to check, but can not always verify everything. The case is expected to be extremely rare.
- Made pages connectivity based on btpo_next (corrected a bug in the code, I found meanwhile)
- If snapshot is already taken in heapallindexed case, reuse it for unique constraint check.
The patch is pgindented and rebased on the current PG master code.
I'd like to re-attach the patch v5 to the upcoming CF if you don't mind.
I also want to add that some customers face index uniqueness violations more often than I've expected, and I hope this check could also help some other PostgreSQL customers.
Your further considerations are welcome as always!