Re: Amcheck verification of GiST and GIN - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: Amcheck verification of GiST and GIN
Msg-id CAH2-WzndEBSGDWSARP2=CUeFU5WWp+qtHtfK-bchMraWO1YJ9Q@mail.gmail.com
In response to Re: Amcheck verification of GiST and GIN  (Andrey Borodin <amborodin86@gmail.com>)
List pgsql-hackers
On Sun, Mar 19, 2023 at 4:00 PM Andrey Borodin <amborodin86@gmail.com> wrote:
> After several attempts to corrupt GiST with this 0.000001 epsilon
> adjustment tolerance I think GiST indexing of points is valid.
> Because intersection for search purposes is determined with the same epsilon!
> So it's kind of odd
> postgres=# select point(0.0000001,0)~=point(0,0);
> ?column?
> ----------
>  t
> (1 row)
> , yet the index works correctly.

I think that it's okay, provided that we can assume deterministic
behavior in the code that forms new index tuples. Within nbtree,
operator classes like numeric_ops are supported by heapallindexed
verification, without any requirement for special normalization code
to make it work correctly as a special case. This is true even though
operator classes such as numeric_ops have similar "equality is not
equivalence" issues, which also come up in other areas (e.g., nbtree
deduplication, which must call support routine 4 during CREATE INDEX
[1]).
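
To make the "equality is not equivalence" point concrete, here is a
minimal Python sketch (not PostgreSQL code). It assumes a per-coordinate
tolerance of 0.000001, as quoted above, and uses Python's Decimal as a
rough analogy for numeric; the helpers fuzzy_eq() and binary_image() are
hypothetical names invented for illustration. Values can compare as
equal while their binary images differ, and that is fine for
fingerprinting as long as the same input always yields the same bytes.

```python
import struct
from decimal import Decimal

EPSILON = 1.0e-06  # tolerance quoted upthread; assumed to apply per coordinate


def fuzzy_eq(a, b):
    """Hypothetical stand-in for an epsilon-tolerant point equality operator."""
    return abs(a[0] - b[0]) <= EPSILON and abs(a[1] - b[1]) <= EPSILON


def binary_image(p):
    """Hypothetical stand-in for the bytes that would end up in an index tuple."""
    return struct.pack("<dd", p[0], p[1])


p1 = (0.0000001, 0.0)
p2 = (0.0, 0.0)

# Equal according to the operator, yet not the same bytes.
equal_by_operator = fuzzy_eq(p1, p2)
same_bytes = binary_image(p1) == binary_image(p2)

# What fingerprinting needs: the same input always gives the same bytes.
deterministic = binary_image(p1) == binary_image((0.0000001, 0.0))

# Rough analogy for numeric: equal values with different display scale.
numeric_equal = Decimal("1.0") == Decimal("1.00")
numeric_same_text = str(Decimal("1.0")) == str(Decimal("1.00"))
```

Here equal_by_operator and numeric_equal are true while same_bytes and
numeric_same_text are false. Only the deterministic property matters to
the fingerprinting approach.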

The important principle is that amcheck must always be able to produce
a consistent fingerprintable binary output given the same input (the
same heap tuple/Datum array). This must work across all operator
classes that play by the rules for GiST operator classes. We *can*
tolerate some variation here. In fact, we *have* to tolerate a
little variation of this kind in order to deal with differences in
TOAST input state (the same datum may or may not be stored compressed,
for example), but I hope that that's the only complicating factor
here, for GiST (as it is for nbtree). Note that verify_nbtree.c
already relies on the fact that index_form_tuple() uses palloc0()
(not plain palloc()), so that any padding bytes are reliably zeroed.

I think that there is a decent chance that it just wouldn't make sense
for an operator class author to ever do something that we need to
worry about. I'm pretty sure that it's just the TOAST thing. But it's
worth thinking about carefully.
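
For the TOAST case, the kind of normalization I have in mind can be
sketched as follows. This is a Python illustration under stated
assumptions, not the verify_nbtree.c code: the one-byte storage tag, the
store() and normalize() helpers, and the use of zlib and SHA-256 are all
hypothetical stand-ins. The idea is only that two physical
representations of one logical value must be brought to a single
canonical form before fingerprinting.

```python
import hashlib
import zlib

RAW, COMPRESSED = b"\x00", b"\x01"  # hypothetical 1-byte storage tags


def store(payload, compress):
    """Return one of two physical representations of the same logical value."""
    if compress:
        return COMPRESSED + zlib.compress(payload)
    return RAW + payload


def normalize(stored):
    """Undo compression so that equal logical values have equal bytes."""
    tag, body = stored[:1], stored[1:]
    if tag == COMPRESSED:
        body = zlib.decompress(body)
    return RAW + body


def fingerprint(image):
    """Stand-in for hashing the datum image into a Bloom filter."""
    return hashlib.sha256(image).hexdigest()


value = b"x" * 4000
in_heap = store(value, compress=True)    # e.g. compressed at insert time
in_index = store(value, compress=False)  # e.g. stored uncompressed

raw_match = fingerprint(in_heap) == fingerprint(in_index)
normalized_match = (
    fingerprint(normalize(in_heap)) == fingerprint(normalize(in_index))
)
```

Without normalization raw_match is false; after normalization
normalized_match is true. Any GiST-specific variation beyond this would
need its own canonical form, which is what I hope we can avoid.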

[1] https://www.postgresql.org/docs/devel/btree-support-funcs.html
--
Peter Geoghegan


