Re: Expanding HOT updates for expression and partial indexes - Mailing list pgsql-hackers
| From | Matthias van de Meent |
|---|---|
| Subject | Re: Expanding HOT updates for expression and partial indexes |
| Date | |
| Msg-id | CAEze2WjbVc1XfyfGCF_bHV_2V3Gk_bu6P2SX-YygsDqCEnnCEg@mail.gmail.com Whole thread Raw |
| In response to | Re: Expanding HOT updates for expression and partial indexes (Greg Burd <greg@burd.me>) |
| List | pgsql-hackers |
On Sat, 22 Nov 2025, 22:30 Greg Burd, <greg@burd.me> wrote: > Thanks for pointing out the oversight for index-oriented scans (IOS), > you're right that the code in v22 doesn't handle that correctly. I'll > fix that. I still think that indexes that don't support IOS can and > should use the type-specific equality checks. This opens the door to > HOT with custom types that have unusual equality rules (see BSON). Do you have specific examples why it would be safe to default to "unusual equality rules" for generally any index's data ingestion needs? Why e.g. BSON must always be compared with their special equality test (and not datumIsEqual), and why IOS-less indexes in general are never going to distinguish between binary distinct but btree-equal values, and why exact equality is the special case here? I understand that you want to maximize optimization for specific workloads that you have in mind, but lacking evidence to the contrary I am really not convinced that your workloads are sufficiently generalizable that they can (and should) be the baseline for these new HOT rules: I have not yet seen good arguments why we could relax "datum equality" to "type equality" without potentially breaking existing indexes. HOT was implemented quite conservatively to make sure that there are no issues where changed values are not reflected in indexes: each indexed TID represents a specific and unchanging set of indexed values, in both the key and non-key attributes of indexes. If a value changes however so slightly, that may be a cause for indexes to treat it differently, and thus HOT must not be used. Aside: The amsummarizing optimization gets around that check by realizing the TID itself isn't really indexed, so the rules can be relaxed around that, but it still needs to go through the effort to update the summarizing indexes if the relevant attributes were ever so slightly updated. This patch right now wants to change these rules and behaviour of HOT in two ways: 1.) Instead of only testing attributes mentioned by indexed expressions for changes, it wants to test the output of the indexed expressions. I would consider this to be generally safe, as long as the expressions comply with the rules we have for indexed expressions. [Which, if not held, would break IOS and various other things, too, so relying on these rules isn't new or special]. 2.) Instead of datumIsEqual, it (by default) wants to do equality checks as provided by the type's default btree opclass' = operator. I have not seen evidence that this is safe. I have even explained with an example that IOS will return distinctly wrong results if only btree's = operator is used to determine if HOT can be applied, and that doesn't even begin to cover the issues related to indexes that may handle data differently from Btree. I also don't want indexes that at some point in the future invent support for IOS to return subtly incorrect results due to HOT checks that depended on the output of a previous version's amcanreturn output. So, IMV, tts_attr_equal is just a rather expensive version of datumIsEqual: the fast path cases would've been handled by datumIsEqual at least as fast (without a switch() statement with 18 specific cases and a default branch); if there is no btree operator it'll still default to btree compare, and if not then if the slow path uses correctly implemented compare operators (for HOT, and potentially all other possible indexes), then these would have an output that is indistinguishable from datumIsEqual, with the only difference the address and performance of the called function and a lot of added catalog lookups. All together, I think it's best to remove the second component of the changes to the HOT rules (changing the type of matching done for indexed values with tts_attr_compare) from this patchset. If you believe this should be added regardless, I think it's best discussed separately in its own thread and patchset -- it should be relatively easy to introduce in both current and future versions of this code, and (if you're correct and this is safe) it would have some benefits even when committed on its own. Kind regards, Matthias van de Meent
pgsql-hackers by date: