Re: Expanding HOT updates for expression and partial indexes - Mailing list pgsql-hackers
| From | Greg Burd |
|---|---|
| Subject | Re: Expanding HOT updates for expression and partial indexes |
| Date | |
| Msg-id | 092E6AFE-81DA-4869-91C7-8F9F5A7541E5@greg.burd.me Whole thread Raw |
| In response to | Re: Expanding HOT updates for expression and partial indexes (Greg Burd <greg@burd.me>) |
| List | pgsql-hackers |
On Nov 19 2025, at 1:00 pm, Greg Burd <greg@burd.me> wrote: > > On Nov 16 2025, at 1:53 pm, Greg Burd <greg@burd.me> wrote: > >> 0004 - Enable HOT updates for expression and partial indexes >> >> This finally gets us back to where this project started, but on much >> more firm ground than before because we're not going to >> self-deadlock. >> The idea has grown from a small function into something larger, but only >> out of necessity. >> >> In this patch I add ExecWhichIndexesRequireUpdates() in execIndexing.c >> which implements (c) finding the set of attributes that force new index >> updates. This set can be very different from the modified indexed >> attributes. We know that some attributes are not equal to their >> previous versions, but does that mean that the index that references >> that attribute needs a new index tuple? It may, or it may not. Here's >> the comment on that function that explains: >> >> /* >> * ExecWhichIndexesRequireUpdates >> * >> * Determine which indexes need updating given modified indexed attributes. >> * This function is a companion to ExecCheckIndexedAttrsForChanges(). >> On the >> * surface, they appear similar but they are doing two very different things. >> * >> * For a standard index on a set of attributes this is the >> intersection of >> * the mix_attrs and the index attrs (key, expression, but not predicate). >> * >> * For expression indexes and indexes which implement the amcomparedatums() >> * index AM API we'll need to form index datum and compare each >> attribute to >> * see if any actually changed. >> * >> * For expression indexes the result of the expression might not change >> at all, >> * this is common with JSONB columns which require expression indexes >> and where >> * it is commonplace to index a field within a document and have >> updates that >> * generally don't update that field. >> * >> * Partial indexes won't trigger index tuples when the old/new tuples >> are both >> * outside of the predicate range. >> * >> * For nbtree the amcomparedatums() API is critical as it requires >> that key >> * attributes are equal when they memcmp(), which might not be the >> case when >> * using type-specific comparison or factoring in collation which >> might make >> * an index case insensitive. >> * >> * All of this is to say that the goal is for the executor to know, >> ahead of >> * calling into the table AM for the update and before calling into >> the index >> * AM for inserting new index tuples, which attributes at a minimum will >> * necessitate a new index tuple. >> * >> ... >> */ > > Attached are rebased (d5b4f3a6d4e) patches with the only changes > happening in the last patch in the series. > > 0004 - Enable HOT updates for expression and partial indexes > > I was never happy with the dual functions > ExecCheckIndexedAttrsForChanges() and ExecWhichIndexesRequireUpdates(), > it felt like too much overhead and duplication of effort. While > updating my tests, adding a few cases, I found that there was also a > flaw in the logic. So, time to rewrite and combine them. > > What did I discover? Before the logic was to find the set of modified > indexed attributes then review all the indexes for changed attributes > using FormIndexDatum() and comparing before/after to see if expressions > really changed the value to be indexed or not. The first pass didn't > take into account expressions, the second did. So, an expression index > over JSONB data wouldn't extract and test the field within the document, > it was just comparing the entire document before/after using the jsonb > comparison function, no bueno. > > This approach wraps both functions into one somewhat simplified > function. The logic is basically, iterate over the indexes reviewing > indexed attributes for changes. Along the way we call into the new > index AM's comparison function when present, otherwise we find and use > the proper type-specific comparison function for the datum. At the end > of the function we have our Bitmapset of attributes that should trigger > new index tuples. > >> What's left undone? >> >> * I need to check code coverage so that I might > > I did this and it was quite good, I'll do it again for this new series > but it's nice to see that the tests are exercising the vast majority of > the code paths. > >> * create tests covering all the new cases > > I think the coverage is good, maybe even redundant or overly complex > in places. > >> * update the README.HOT documentation, wiki, etc. > > Soon, I hope to have this approach solid and under review before > solidifying the docs. > >> * performance... > > Still as yet unmeasured, I know that there is more work per-update to > perform these checks, so some overhead, but I don't know if that > overhead is more than before with HeapDetermineColumnsInfo() and > index_unchanged_by_update(). Those two functions did essentially the > same thing, only with binary comparison (datumIsEqual()). I need to > measure that. What about doing all this work outside of the buffer lock > in heap_update()? Surely that'll give back a bit or at least add to > concurrency. Forming index tuples a few extra times and evaluating the > expressions 3 times rather than 1 is going to hurt, I think I can come > up with a way to cache the formed datum and use it later on, but is that > worth it? Complex expressions, yes. Also, what about expressions that > expect to be executed once... and now are 3x? That's what forced my > update to the insert-conflict-specconflict.out test, but AFAICT there is > no way to test if an expression's value is going to change on update > without exercising it once for the old tuple and once for the new tuple. > Even if it were possible for an index to provide the key it might have > changed after the expression evaluation (as is the case in hash), so I > don't think this is avoidable. Maybe that's reason enough to add a > reloption to disable the expression evaluation piece of this? Given > that it might create a logic or performance regression. The flip side > is the potential to use the HOT path, that's a real savings. > > One concerning thing is that nbtree's assumption that key attributes for > TIDs must use binary comparison for equality. This means that for our > common case (heap/btree) there is more work per-update than before, > which is why I need to measure. I could look into eliminating the > nbtree requirement, I don't understand it too well as yet by I believe > that on page split there is an attempt to deduplicate TIDs into a > TIDBitmap and the test for when that's possible is datumIsEqual(). If > that were the same as in this new code, possibly evening using > tts_attr_equal(), then... I don't know, I'll have to investigate. Chime > in here if you can educate me on this one. :) > > best. > > -greg Doh! I forgot to commit the fixed regression test expected output before formatting the patch set, here it is. -greg
Attachment
pgsql-hackers by date: