Home > mailing lists

amcheck support for BRIN indexes - Mailing list pgsql-hackers

From	Arseniy Mukhin
Subject	amcheck support for BRIN indexes
Date	April 22 12:05:46
Msg-id	CAE7r3MKZkMwuEeFU5M73Jk3yp+2CBr41a3tfvb3CHym+ysAVEA@mail.gmail.com Whole thread Raw
List	pgsql-hackers

Tree view

Hi,

It's not for v18, just wanted to share with the community and register
it in the upcoming Commitfest 2025-07.

Here is the patch with amcheck support for BRIN indexes.

Patch uses amcheck common infrastructure that was introduced in [1].
It works and I deleted all the code that
I copied from btree check at first. Great.

During the check we hold ShareUpdateExclusiveLock, so we don't block
regular reads/inserts/updates
and the same time range summarizations/desummarizations are impossible
during the check which simplifies check logic.
While checking we do ereport(ERROR) on the first issue, the same way
as btree, gin checks do.

There are two parts:

First part is 'index structure check':
1) Meta page check
2) Revmap check. Walk revmap and check every valid revmap item points
to the index tuple with the expected range blkno,
and index tuple is consistent with the tuple description. Also it's
not documented, but known from the brin code that
for every empty range we should have allnulls = true, hasnulls =
false. So this is also checked here.
3) Regular pages check. Walk regular pages and check that every index
tuple has a corresponding revmap item that points to it.
We don't check index tuple structures here, because it was already
done in 2 steps.

Regular pages check is optional. Orphan index tuple errors in this
check doesn't necessary mean that index is corrupted,
but AFAIS brin is not supposed to have such orphan index tuples now,
so if we encounter one than probably something
wrong with the index.

Second part is 'all consistent check':
We check all heap tuples are consistent with the index. It's the most
expensive part and it's optional.
Here we call consistent functions for every heap tuple. Also here we
check that fields like 'has_nulls', 'all_nulls',
'empty_range' are consistent with what we see in the heap.

There are two patch files:

0001-brin-refactoring.patch

It's just two tiny changes in the brin code.
1) For index tuple structure check we need to know how on-disk index
tuples look like.
Function that lets you get it 'brtuple_disk_tupdesc' is internal. This
patch makes it extern.
2) Create macros for BRIN_MAX_PAGES_PER_RANGE. For the meta page check.

0002-amechek-brin-support.patch

It's check functionality itself + regression tests + amcheck extension updates.

Some open questions:

1) How to get the correct strategy number for the scan key in "all
heap consistent" check. The consistent function wants
a scan key, and to generate it we have to pick a strategy number. We
can't just use the same strategy number for all
indexes because its meaning differs from opclass to opclass (for
example equal strategy number is 1 in Bloom
and 3 in min_max). We also can't pick an operator and use it for every
check, because opclasses don't have any requirements
about what operators they should support. The solution was to let user
to define check operator
(parameter consistent_operator_names). It's an array as we can have a
multicolumn index. We can use '=' as default check
operator, because AFAIS '=' operator is supported by all core
opclasses except 'box_inclusion_ops', and IMHO it's the
most obvious candidate for such a check. So if param
'consistent_operator_names' is an empty array (param default value),
then for all attributes we use operator '='. In most cases operator
'=' does the job and you don't need to worry about
consistent_operator_names param. Not sure about it, what do you think?

2) The current implementation of "all heap consistent" uses the scan
of the entire heap. So if we have a lot of
unsummarized ranges, we do a lot of wasted work because we can't use
the tuples that belong to the unsummarized ranges.
Instead of having one scan for the entire heap, we can walk the
revmap, take only the summarized ranges, and
scan only the pages that belong to those ranges. So we have one scan
per range. This approach helps us avoid touching
those heap tuples that we can't use for index checks. But I'm not sure
if we should to worry about that because every
autovacuum summarizes all the unsummarized ranges, so don't expect a
lot of unsummarized ranges on average.

TODO list:

1) add TAP tests
2) update SGML docs for amcheck (think it's better to do after patch
is reviewed and more or less finalized)
3) pg_amcheck integration

Big thanks to Tomas Vondra for the first patch idea and initial review.

[1] https://www.postgresql.org/message-id/45AC9B0A-2B45-40EE-B08F-BDCF5739D1E1%40yandex-team.ru

Best regards,
Arseniy Mukhin

Attachment

pgsql-hackers by date:

From: Ashutosh Bapat
Date: 22 April, 11:55:27
Subject: Re: bug: virtual generated column can be partition key

From: Yugo Nagata
Date: 22 April, 12:10:06
Subject: Allow to collect statistics on virtual generated columns

amcheck support for BRIN indexes - Mailing list pgsql-hackers

Attachment

Previous

Next