Re: Amcheck verification of GiST and GIN - Mailing list pgsql-hackers
From | Peter Geoghegan |
---|---|
Subject | Re: Amcheck verification of GiST and GIN |
Date | |
Msg-id | CAH2-WzkUHW7HfOwkwbQ56UzjGC970=1b+vTwRJUp74zP_RPQZw@mail.gmail.com Whole thread Raw |
In response to | Re: Amcheck verification of GiST and GIN (Peter Geoghegan <pg@bowt.ie>) |
Responses |
Re: Amcheck verification of GiST and GIN
Re: Amcheck verification of GiST and GIN |
List | pgsql-hackers |
On Thu, Mar 16, 2023 at 4:48 PM Peter Geoghegan <pg@bowt.ie> wrote: > Some feedback on the GiST patch: I see that the Bloom filter that's used to implement heapallindexed verification fingerprints index tuples that are formed via calls to gistFormTuple(), without any attempt to normalize-away differences in TOAST input state. In other words, there is nothing like verify_nbtree.c's bt_normalize_tuple() function involved in the fingerprinting process. Why is that safe, though? See the "toast_bug" test case within contrib/amcheck/sql/check_btree.sql for an example of how inconsistent TOAST input state confused verify_nbtree.c's heapallindexed verification (before bugfix commit eba775345d). I'm concerned about GiST heapallindexed verification being buggy in exactly the same way, or in some way that is roughly analogous. I do have some concerns about there being analogous problems that are unique to GiST, since GiST is an AM that gives opclass authors many more choices than B-Tree opclass authors have. In particular, I wonder if heapallindexed verification needs to account for how GiST compression might end up breaking heapallindexed. I refer to the "compression" implemented by GiST support routine 3 of GiST opclasses. The existence of GiST support routine 7, the "same" routine, also makes me feel a bit squeamish about heapallindexed verification -- the existence of a "same" routine hints at some confusion about "equality versus equivalence" issues. In more general terms: heapallindexed verification works by fingerprinting index tuples during the index verification stage, and then performing Bloom filter probes in a separate CREATE INDEX style heap-matches-index stage (obviously). There must be some justification for our assumption that there can be no false positive corruption reports due only to a GiST opclass (either extant or theoretical) that follows the GiST contract, and yet allows an inconsistency to arise that isn't really index corruption. This justification won't be easy to come up with, since the GiST contract was not really designed with these requirements in mind. But...we should try to come up with something. What are the assumptions underlying heapallindexed verification for GiST? It doesn't have to be provably correct or anything, but it should at least be empirically falsifiable. Basically, something that says: "Here are our assumptions, if we were wrong in making these assumptions then you could tell that we made a mistake because of X, Y, Z". It's not always clear when something is corrupt. Admittedly I have much less experience with GiST than other people, which likely includes you (Andrey). I am likely missing some context around the evolution of GiST. Possibly I'm making a big deal out of something without it being unhelpful. Unsure. Here is an example of the basic definition of correctness being unclear, in a bad way: Is a HOT chain corrupt when its root LP_REDIRECT points to an LP_DEAD item, or does that not count as corruption? I'm pretty sure that the answer is ambiguous even today, or was ambiguous until recently, at least. Hopefully the verify_heapam.c HOT chain verification patch will be committed, providing us with a clear *definition* of HOT chain corruption -- the definition itself may not be the easy part. On a totally unrelated note: I wonder if we should be checking that internal page tuples have 0xffff as their offset number? Seems like it'd be a cheap enough cross-check. -- Peter Geoghegan
pgsql-hackers by date: