Re: Expanding HOT updates for expression and partial indexes - Mailing list pgsql-hackers

From Greg Burd
Subject Re: Expanding HOT updates for expression and partial indexes
Date
Msg-id F3B5A0EB-9240-4235-B62D-9531CC1CD3C6@burd.me
In response to Re: Expanding HOT updates for expression and partial indexes  (Jeff Davis <pgsql@j-davis.com>)
Responses Re: Expanding HOT updates for expression and partial indexes
List pgsql-hackers
> On Oct 9, 2025, at 3:27 PM, Jeff Davis <pgsql@j-davis.com> wrote:
>
> On Tue, 2025-10-07 at 17:36 -0400, Greg Burd wrote:
>> After reviewing how updates work in the executor, I discovered that
>> during execution the new tuple slot is populated with the information
>> from ExecBuildUpdateProjection() and the old tuple, but that most
>> importantly for this use case that function created a bitmap of the
>> modified columns (the columns specified in the update).  This bitmap
>> isn't the same as the one produced by HeapDetermineColumnsInfo() as
>> the latter excludes attributes that are not changed after testing
>> equality with the helper function heap_attr_equals() where as the
>> former will include attributes that appear in the update but are the
>> same value as before.  This, happily, is immaterial for the purposes
>> of my function ExecExprIndexesRequireUpdates() which simply needs to
>> check to see if index tuples generated are unchanged.  So I had all I
>> needed to run the checks ahead of acquiring the lock on the buffer.
>
> You're still calling ExecExprIndexesRequireUpdates() from within
> heap_update(). Can't you do that inside of ExecUpdatePrologue() or
> thereabouts?

Hey Jeff,

I'm trying to knit this into the executor layer, but that is tricky: HOT
is a very heap-specific concept, and the executor should be ignorant of
the heap's specific needs (right?). Right now, I am considering adding a
step in ExecUpdatePrologue() just after opening the indexes.

The idea I'm toying with is to have a new function on all TupleTableSlots
that examines the before/after slots for an update and the set of updated
attributes and returns a Bitmapset of the changed attributes that overlap
with indexes and so should trigger index updates in ExecUpdateEpilogue().

That way for heap we'd have something like:
Bitmapset *
tts_heap_getidxattr(ResultRelInfo *info,
                    TupleTableSlot *updated,
                    TupleTableSlot *existing,
                    Bitmapset *updated_attrs)
{
    /*
     * Some combo of HeapDetermineColumnsInfo() and
     * ExecExprIndexesRequireUpdates().
     *
     * Returns the set of indexed attrs that this update changed.
     */
}

So, attributes only referenced by expressions where the expression
produces the same value for the updated and existing slots would be
removed from the set.
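
To make the filtering rule concrete, here is a minimal standalone sketch. It is a toy model, not PostgreSQL code: attribute sets are plain 64-bit masks instead of Bitmapsets, and ToyRow, ToyIndex, toy_getidxattr() and toy_tens_of_col0() are all hypothetical names invented for illustration. It only shows the intended result: an attribute stays in the returned set if some index that references it would actually get a different index value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model: an attribute set is a 64-bit mask, not a PostgreSQL Bitmapset. */
typedef uint64_t AttrSet;
#define ATTR(n) ((AttrSet) 1 << (n))
#define TOY_NCOLS 8

/* Hypothetical stand-in for a tuple slot: a few integer columns. */
typedef struct ToyRow { int cols[TOY_NCOLS]; } ToyRow;

/* Hypothetical index description; expr == NULL means a plain column index. */
typedef struct ToyIndex
{
    AttrSet attrs;                   /* columns the index references */
    int   (*expr)(const ToyRow *);   /* index expression, or NULL */
} ToyIndex;

/* Example expression: indexes col0 / 10, so 41 and 45 map to the same key. */
int
toy_tens_of_col0(const ToyRow *row)
{
    return row->cols[0] / 10;
}

/* Return the indexed attributes whose change affects at least one index. */
AttrSet
toy_getidxattr(const ToyIndex *indexes, size_t nindexes,
               const ToyRow *existing, const ToyRow *updated,
               AttrSet updated_attrs)
{
    AttrSet changed = 0;
    AttrSet result = 0;

    assert(existing != NULL && updated != NULL);

    /* Step 1: drop attributes named in the UPDATE whose value is unchanged. */
    for (int i = 0; i < TOY_NCOLS; i++)
        if ((updated_attrs & ATTR(i)) && existing->cols[i] != updated->cols[i])
            changed |= ATTR(i);

    /* Step 2: keep a changed attribute only if some index really differs. */
    for (size_t i = 0; i < nindexes; i++)
    {
        AttrSet hit = indexes[i].attrs & changed;

        if (hit == 0)
            continue;       /* index does not reference a changed column */
        if (indexes[i].expr != NULL &&
            indexes[i].expr(existing) == indexes[i].expr(updated))
            continue;       /* expression yields the same value: no update */
        result |= hit;
    }
    return result;
}
```

In this toy, changing col0 from 41 to 45 with an index on col0 / 10 yields an empty set, while changing it to 52 yields a set containing col0. An attribute that is also referenced by a plain (non-expression) index stays in the set regardless of what the expression index says.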

Interestingly, summarizing indexes that don't overlap with changed
attributes won't be updated (and that's a good thing).

The problem is that we're not yet accounting for what is about to happen
in ExecUpdateAct() when it calls into heap_update().  That's where the
heap tries to fit the new tuple onto the same page.  That might be
possible even with large tuples thanks to TOAST, but it's impossible to
say before getting into this function with the page locked.

So, for updates we include the modified_attrs in the UpdateContext,
which is available to heap_update().  If the heap code decides to
go HOT, great: unset all attributes in modified_attrs except any
that are only summarizing.  If the heap can't go HOT, fine: add
the indexed attrs back into modified_attrs, which should trigger all
indexes to be updated.

This gets rid of the TU_UpdateIndexes enum and allows only modified
summarizing indexes to be updated on the HOT path.  Two additional
benefits IMO.

at least, that's what I'm trying out now,

-greg

> Regards,
> Jeff Davis



