Re: Expanding HOT updates for expression and partial indexes - Mailing list pgsql-hackers
From | Greg Burd |
---|---|
Subject | Re: Expanding HOT updates for expression and partial indexes |
Date | |
Msg-id | F3B5A0EB-9240-4235-B62D-9531CC1CD3C6@burd.me |
In response to | Re: Expanding HOT updates for expression and partial indexes (Jeff Davis <pgsql@j-davis.com>) |
Responses | Re: Expanding HOT updates for expression and partial indexes |
List | pgsql-hackers |
> On Oct 9, 2025, at 3:27 PM, Jeff Davis <pgsql@j-davis.com> wrote:
>
> On Tue, 2025-10-07 at 17:36 -0400, Greg Burd wrote:
>> After reviewing how updates work in the executor, I discovered that
>> during execution the new tuple slot is populated with the information
>> from ExecBuildUpdateProjection() and the old tuple, but that most
>> importantly for this use case that function created a bitmap of the
>> modified columns (the columns specified in the update). This bitmap
>> isn't the same as the one produced by HeapDetermineColumnsInfo() as
>> the latter excludes attributes that are not changed after testing
>> equality with the helper function heap_attr_equals() where as the
>> former will include attributes that appear in the update but are the
>> same value as before. This, happily, is immaterial for the purposes
>> of my function ExecExprIndexesRequireUpdates() which simply needs to
>> check to see if index tuples generated are unchanged. So I had all I
>> needed to run the checks ahead of acquiring the lock on the buffer.
>
> You're still calling ExecExprIndexesRequireUpdates() from within
> heap_update(). Can't you do that inside of ExecUpdatePrologue() or
> thereabouts?

Hey Jeff,

I'm trying to knit this into the executor layer, but that is tricky
because the concept of HOT is very heap-specific, so the executor
should be ignorant of the heap's specific needs (right?).

Right now, I am considering adding a step in ExecUpdatePrologue() just
after opening the indexes. The idea I'm toying with is to have a new
function on all TupleTableSlots that examines the before/after slots
for an update and the set of updated attributes, and returns a
Bitmapset of the changed attributes that overlap with indexes and so
should trigger index updates in ExecUpdateEpilogue().
That way, for heap we'd have something like:

    Bitmapset *
    tts_heap_getidxattr(ResultRelInfo *info,
                        TupleTableSlot *updated,
                        TupleTableSlot *existing,
                        Bitmapset *updated_attrs)
    {
        /*
         * some combo of HeapDetermineColumnsInfo() and
         * ExecExprIndexesRequireUpdates()
         *
         * returns the set of indexed attrs that this update changed
         */
    }

So, attributes only referenced by expressions where the expression
produces the same value for the updated and existing slots would be
removed from the set. Interestingly, summarizing indexes that don't
overlap with changed attributes won't be updated (and that's a good
thing).

The problem is that this doesn't yet account for what is about to
happen in ExecUpdateAct() when it calls into heap_update(). That's
where heap tries to fit the new tuple onto the same page. That might
be possible even with large tuples thanks to TOAST, but it's impossible
to say before getting into that function with the page locked.

So, for updates we include the modified_attrs in the UpdateContext,
which is available to heap_update(). If the heap code decides to go
HOT, great: unset all attributes in modified_attrs except any that are
only summarizing. If the heap can't go HOT, fine: add the indexed attrs
back into modified_attrs, which should trigger all indexes to be
updated.

This gets rid of the TU_UpdateIndexes enum and allows only modified
summarizing indexes to be updated on the HOT path. Two additional
benefits, IMO.

At least, that's what I'm trying out now.

-greg

> Regards,
> Jeff Davis
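
The modified_attrs handling described in the message can be sketched in
standalone C. This is only an illustration under stated assumptions, not
PostgreSQL code: AttrSet is a hypothetical 64-bit stand-in for Bitmapset,
and attr_bit(), adjust_modified_attrs() and index_needs_update() are
made-up names that do not exist in the tree or in any posted patch. It
shows the two outcomes after heap_update() decides whether the update
can be HOT.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a Bitmapset: one bit per attribute (0..63). */
typedef uint64_t AttrSet;

/* Build a one-element set for the given attribute number. */
static AttrSet
attr_bit(unsigned attno)
{
    assert(attno < 64);
    return (AttrSet) 1 << attno;
}

/*
 * Hypothetical sketch of the adjustment made once the heap knows whether
 * the update went HOT.
 *
 * modified_attrs:         indexed attributes this update actually changed
 * all_idx_attrs:          attributes referenced by any index
 * summarizing_only_attrs: attributes referenced only by summarizing indexes
 */
static AttrSet
adjust_modified_attrs(AttrSet modified_attrs,
                      AttrSet all_idx_attrs,
                      AttrSet summarizing_only_attrs,
                      bool went_hot)
{
    if (went_hot)
    {
        /* HOT: keep only attributes that are summarizing-only. */
        return modified_attrs & summarizing_only_attrs;
    }

    /* Not HOT: add every indexed attribute back into the set. */
    return modified_attrs | all_idx_attrs;
}

/* An index needs a new entry if it references any attribute in the set. */
static bool
index_needs_update(AttrSet index_attrs, AttrSet modified_attrs)
{
    return (index_attrs & modified_attrs) != 0;
}
```

With a btree index on attribute 1 and two summarizing indexes on
attributes 2 and 3, an update that changes only attribute 2 would, on
the HOT path, touch just the summarizing index on attribute 2. On the
non-HOT path all three indexes would be updated, which is the behavior
the message describes as replacing the TU_UpdateIndexes enum.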