Thread: Re: Expanding HOT updates for expression and partial indexes

Re: Expanding HOT updates for expression and partial indexes

From
Matthias van de Meent
Date:
On Thu, 6 Feb 2025 at 23:24, Burd, Greg <gregburd@amazon.com> wrote:
>
> Attached find a patch that expands the cases where heap-only tuple (HOT) updates are possible without changing the
basicsemantics of HOT. This is accomplished by examining expression indexes for changes to determine if indexes require
updatingor not. A similar approach is taken for partial indexes, the predicate is evaluated and, in some cases, HOT
updatesare allowed. Even with this patch if any index is changed, all indexes are updated. Only in cases where none are
modifiedwill this patch allow the HOT path. 

So, effectively this disables the amsummarizing-based optimizations of
https://postgr.es/c/19d8e2308 ? That sounds like a bad degradation in
behaviour.

> I’m also aware of PHOT [4] and WARM [5] which allow for updating some, but not all indexes while remaining on the HOT
updatepath, this patch does not attempt to accomplish that. 
>
> [...] This opens the door to future improvements by providing a way to pass a bitmap of modified indexes along to be
addressedby something similar to the PHOT/WARM logic. 

<sidetrack>

I have serious doubts about the viability of any proposal working to
implement PHOT/WARM in PostgreSQL, as they seem to have an inherent
nature of fundamentally breaking the TID lifecycle:
We won't be able to clean up dead-to-everyone TIDs that were
PHOT-updated, because some index Y may still rely on it, and we can't
remove the TID from that same index Y because there is still a live
PHOT/WARM tuple later in the chain whose values for that index haven't
changed since that dead-to-everyone tuple, and thus this PHOT/WARM
tuple is the one pointed to by that index.
For HOT, this isn't much of an issue, because there is just one TID
that's impacted (and it only occupies a single LP slot, with
LP_REDIRECT). However, with PHOT/WARM, you'd relatively easily be able
to fill a page with TIDs (or even full tuples) you can't clean up with
VACUUM until the moment a the PHOT/WARM/HOT chain is broken (due to
UPDATE leaving the page or the final entry getting DELETE-d).

Unless we are somehow are able to replace the TIDs in indexes from
"intermediate dead PHOT" to "base TID"/"latest TID" (either of which
is probably also problematic for indexes that expect a TID to appear
exactly once in the index at any point in time) I don't think the
system is viable if we maintain only a single data structure to
contain all dead TIDs. If we had a datastore for dead items per index,
that'd be more likely to work, but it also would significantly
increase the memory overhead of vacuuming tables.

</sidetrack>

> I have a few concerns with the patch, things I’d greatly appreciate your thoughts on:
>
> First, I pass an EState along the update path to enable running the checks in heapam, this works but leaves me
feelingas if I violated separation of concerns. If there is a better way to do this let me know or if you think the
costof creating one in the execIndexing.c ExecIndexesRequiringUpdates() is okay that’s another possibility. 

I think that doesn't have to be bad.

> Third, there is overhead to this patch, it is no longer a single simple bitmap test to choose HOT or not in
heap_update().

Why can't it mostly be that simple in simple cases?

I mean, it's clear that "updated indexed column's value == non-HOT
update". And that to determine whether an updated *projected* column's
value (i.e., expression index column's value) was actually updated we
need to calculate the previous and current index value, thus execute
the projection twice. But why would we have significant additional
overhead if there are no expression indexes, or when we can know by
bitmap overlap that the only interesting cases are summarizing
indexes?

I would've implemented this with (1) two new bitmaps, one each for
normal and summarizing indexes, each containing which columns are
exclusively used in expression indexes (and which should thus be used
to trigger the (comparatively) expensive recalculation).

Then, I'd maintain a (cached) list of unique projections/expressions
found in indexes, so that 30 indexes on e.g.
((mycolumn::jsonb)->>'metadata') only extend to 1 check for
differences, rather than 30. The "new" output of these expression
evaluations would be stored to be used later as index datums, reducing
the number of per-expression evaluations down to 2 at most, rather
than 2+1 when the index needs an insertion but the expression itself
wasn't updated.

So, it'd be something like (pseudocode):

if (bms_overlap(updated_columns, hotblocking))
    /* if columns only indexed through expressions were updated, do
expensive stuff. Otherwise, it's a normal non-HOT update. */
    if (bms_subset_compare(updated_columns, hot_expression_columns) in
(BMS_EQUAL, BMS_SUBSET1))
        expensive check for expression changes + populate index column data
    else
        normal_update
else if (bms_overlap(updated_columns, summarizing))
    /* same as above for hotblocking, but now summarizing */
    if (bms_subset_compare(updated_columns, sum_expression_columns) in
(BMS_EQUAL, BMS_SUBSET1))
        expensive check for summarized expression changes + populate
summarized index column data
    else
        summarizing_update
else
    hot_update

Note that it is relatively expensive to do check whether any one index
needs to be updated. It's generally cheaper to do all those checks at
once, where possible; using one or 2 more bitmaps would be sufficient.

Also note that this approach doesn't update specific summarizing
indexes, just all of them or none. I think that "update only
summarizing indexes that were updated" should be a separate patch from
"check if indexed expressions' values changed", potentially in the
patchset, but not as part of the main bulk.

> Fourth, I’d like to know which version the community prefers (v3 or v4).  I think v4 moves the code in a direction
thatis cleaner overall, but you may disagree.  I realize that the way I use the modified_indexes bitmapset is a tad
overloaded(NULL means all indexes should be updated, otherwise only update the indexes in the set which may be
all/some/noneof the indexes) and that may violate the principal of least surprise but I feel that it is better than the
TU_UpdateIndexesenum in the code today. 

I would be hesitant to let table AMs decide which indexes to update at
that precision. Note that this API would allow the AM to update only
(say) the PK index and no other indexes, which is not allowed to
happen if index consistentcy is required (which it is).

----->8-----

Do you have any documentation on the approaches used, and the specific
differences between v3 and v4? I don't see much of that in your
initial mail, and the patches themselves also don't show much of that
in their details. I'd like at least some documentation of the new
behaviour in src/backend/access/heap/README.HOT at some point before
this got marked as RFC in the commitfest app, though preferably sooner
rather than later.

Kind regards,

Matthias van de Meent
Neon (https://neon.tech)



Re: Expanding HOT updates for expression and partial indexes

From
"Burd, Greg"
Date:
Apologies for not being clear, this preserves the current behavior for summarizing indexes allowing for HOT updates
whilealso updating the index.  No degradation here that I’m aware of, indeed the tests that ensure that behavior are
unchangedand pass.
 

-greg

> On Feb 10, 2025, at 12:17 PM, Matthias van de Meent <boekewurm+postgres@gmail.com> wrote:
> 
> So, effectively this disables the amsummarizing-based optimizations of
> https://postgr.es/c/19d8e2308 ? That sounds like a bad degradation in
> behaviour.



Re: Expanding HOT updates for expression and partial indexes

From
"Burd, Greg"
Date:

> On Feb 10, 2025, at 12:17 PM, Matthias van de Meent <boekewurm+postgres@gmail.com> wrote:
> 
>> 
>> I have a few concerns with the patch, things I’d greatly appreciate your thoughts on:
>> 
>> First, I pass an EState along the update path to enable running the checks in heapam, this works but leaves me
feelingas if I violated separation of concerns. If there is a better way to do this let me know or if you think the
costof creating one in the execIndexing.c ExecIndexesRequiringUpdates() is okay that’s another possibility.
 
> 
> I think that doesn't have to be bad.

Meaning that the approach I’ve taken is okay with you?

>> Third, there is overhead to this patch, it is no longer a single simple bitmap test to choose HOT or not in
heap_update().
> 
> Why can't it mostly be that simple in simple cases?

It can remain that simple in the cases you mention.  In relcache the hot blocking attributes are nearly the same, only
thesummarizing attributes are removed.  The first test then in heap_update() is for overlap with the modified set.
Whenthere is none, the update will proceed on the HOT path.
 

The presence of a summarizing index is determined in ExecIndexesRequiringUpdates() in execIndexing.c, so a slightly
longercode path but not much new overhead.
 

> I mean, it's clear that "updated indexed column's value == non-HOT
> update". And that to determine whether an updated *projected* column's
> value (i.e., expression index column's value) was actually updated we
> need to calculate the previous and current index value, thus execute
> the projection twice. But why would we have significant additional
> overhead if there are no expression indexes, or when we can know by
> bitmap overlap that the only interesting cases are summarizing
> indexes?

You’re right, there’s not a lot of new overhead in that case except what happens in ExecIndexesRequiringUpdates() to
scanover the list of IndexInfo.  It is really only when there are many expressions/predicates requiring examination
thatthere is any significant cost to this approach AFAICT (but if you see something please point it out).
 

> I would've implemented this with (1) two new bitmaps, one each for
> normal and summarizing indexes, each containing which columns are
> exclusively used in expression indexes (and which should thus be used
> to trigger the (comparatively) expensive recalculation).

That was one where I started, over time that became harder to work as the bitmaps contain the union of index attributes
forthe table not per-column.  Now there is one bitmap to cover the broadest case and then a function to find the
modifiedset of indexes where each is examined against bitmaps that contain only attributes specific to the index in
question. This helped in cases where there were both expression and non-expression indexes on the same attribute.
 

> Then, I'd maintain a (cached) list of unique projections/expressions
> found in indexes, so that 30 indexes on e.g.
> ((mycolumn::jsonb)->>'metadata') only extend to 1 check for
> differences, rather than 30.

An optimization to avoid rechecking isn’t a bad idea.  I wonder how hard it would be to surface the field
(->>’metadata’)from the index expression to track for redundancy, I’ll have to look into that.
 

> The "new" output of these expression
> evaluations would be stored to be used later as index datums, reducing
> the number of per-expression evaluations down to 2 at most, rather
> than 2+1 when the index needs an insertion but the expression itself
> wasn't updated.

Not reforming the new index tuples is also an interesting optimization.  I wonder how that can be passed from within
heapam’scall into a function in execIndexing up into nodeModifiyTable and back down into execIndexing and on to the
indexaccess method?  I’ll have to think about that, ideas welcome.
 

> So, it'd be something like (pseudocode):
> 
> if (bms_overlap(updated_columns, hotblocking))
>    /* if columns only indexed through expressions were updated, do
> expensive stuff. Otherwise, it's a normal non-HOT update. */
>    if (bms_subset_compare(updated_columns, hot_expression_columns) in
> (BMS_EQUAL, BMS_SUBSET1))
>        expensive check for expression changes + populate index column data
>    else
>        normal_update
> else if (bms_overlap(updated_columns, summarizing))
>    /* same as above for hotblocking, but now summarizing */
>    if (bms_subset_compare(updated_columns, sum_expression_columns) in
> (BMS_EQUAL, BMS_SUBSET1))
>        expensive check for summarized expression changes + populate
> summarized index column data
>    else
>        summarizing_update
> else
>    hot_update
> 
> Note that it is relatively expensive to do check whether any one index
> needs to be updated. It's generally cheaper to do all those checks at
> once, where possible; using one or 2 more bitmaps would be sufficient.
> 
> Also note that this approach doesn't update specific summarizing
> indexes, just all of them or none. I think that "update only
> summarizing indexes that were updated" should be a separate patch from
> "check if indexed expressions' values changed", potentially in the
> patchset, but not as part of the main bulk.
> 
>> Fourth, I’d like to know which version the community prefers (v3 or v4).  I think v4 moves the code in a direction
thatis cleaner overall, but you may disagree.  I realize that the way I use the modified_indexes bitmapset is a tad
overloaded(NULL means all indexes should be updated, otherwise only update the indexes in the set which may be
all/some/noneof the indexes) and that may violate the principal of least surprise but I feel that it is better than the
TU_UpdateIndexesenum in the code today.
 
> 
> I would be hesitant to let table AMs decide which indexes to update at
> that precision. Note that this API would allow the AM to update only
> (say) the PK index and no other indexes, which is not allowed to
> happen if index consistentcy is required (which it is).

Interesting, thanks for the feedback.  I’ll think on this a bit more and provide more detail with the next update.

> ----->8-----
> 
> Do you have any documentation on the approaches used, and the specific
> differences between v3 and v4? I don't see much of that in your
> initial mail, and the patches themselves also don't show much of that
> in their details. I'd like at least some documentation of the new
> behaviour in src/backend/access/heap/README.HOT at some point before
> this got marked as RFC in the commitfest app, though preferably sooner
> rather than later.

Good point, I should have updated README.HOT with the initial patchset.  I’ll jump on that and update ASAP.

> Kind regards,
> 
> Matthias van de Meent
> Neon (https://neon.tech)

thanks for the thoughtful reply.

-greg


Re: Expanding HOT updates for expression and partial indexes

From
Nathan Bossart
Date:
On Mon, Feb 10, 2025 at 06:17:42PM +0100, Matthias van de Meent wrote:
> I have serious doubts about the viability of any proposal working to
> implement PHOT/WARM in PostgreSQL, as they seem to have an inherent
> nature of fundamentally breaking the TID lifecycle:
> We won't be able to clean up dead-to-everyone TIDs that were
> PHOT-updated, because some index Y may still rely on it, and we can't
> remove the TID from that same index Y because there is still a live
> PHOT/WARM tuple later in the chain whose values for that index haven't
> changed since that dead-to-everyone tuple, and thus this PHOT/WARM
> tuple is the one pointed to by that index.
> For HOT, this isn't much of an issue, because there is just one TID
> that's impacted (and it only occupies a single LP slot, with
> LP_REDIRECT). However, with PHOT/WARM, you'd relatively easily be able
> to fill a page with TIDs (or even full tuples) you can't clean up with
> VACUUM until the moment a the PHOT/WARM/HOT chain is broken (due to
> UPDATE leaving the page or the final entry getting DELETE-d).
> 
> Unless we are somehow are able to replace the TIDs in indexes from
> "intermediate dead PHOT" to "base TID"/"latest TID" (either of which
> is probably also problematic for indexes that expect a TID to appear
> exactly once in the index at any point in time) I don't think the
> system is viable if we maintain only a single data structure to
> contain all dead TIDs. If we had a datastore for dead items per index,
> that'd be more likely to work, but it also would significantly
> increase the memory overhead of vacuuming tables.

I share your concerns, but I don't think things are as dire as you suggest.
For example, perhaps we put a limit on how long a PHOT chain can be, or
maybe we try to detect update patterns that don't work well with PHOT.
Another option could be to limit PHOT updates to only when the same set of
indexed columns are updated or when <50% of the indexed columns are
updated.  These aren't fully fleshed-out ideas, of course, but I am at
least somewhat optimistic we could find appropriate trade-offs.

-- 
nathan



Re: Expanding HOT updates for expression and partial indexes

From
Matthias van de Meent
Date:
On Tue, 11 Feb 2025 at 00:20, Nathan Bossart <nathandbossart@gmail.com> wrote:
>
> On Mon, Feb 10, 2025 at 06:17:42PM +0100, Matthias van de Meent wrote:
> > I have serious doubts about the viability of any proposal working to
> > implement PHOT/WARM in PostgreSQL, as they seem to have an inherent
> > nature of fundamentally breaking the TID lifecycle:
> > [... concerns]
>
> I share your concerns, but I don't think things are as dire as you suggest.
> For example, perhaps we put a limit on how long a PHOT chain can be, or
> maybe we try to detect update patterns that don't work well with PHOT.
> Another option could be to limit PHOT updates to only when the same set of
> indexed columns are updated or when <50% of the indexed columns are
> updated.  These aren't fully fleshed-out ideas, of course, but I am at
> least somewhat optimistic we could find appropriate trade-offs.

Yes, there are methods which could limit the overhead. But I'm not
sure there are cheap-enough designs which would make PHOT a
universally good choice (i.e. not tunable with guc/table option),
considering its significantly larger un-reclaimable storage overhead
vs HOT.

Kind regards,

Matthias van de Meent.



Re: Expanding HOT updates for expression and partial indexes

From
Matthias van de Meent
Date:
On Mon, 10 Feb 2025 at 19:15, Burd, Greg <gregburd@amazon.com> wrote:
>
> Apologies for not being clear, this preserves the current behavior for summarizing indexes allowing for HOT updates
whilealso updating the index.  No degradation here that I’m aware of, indeed the tests that ensure that behavior are
unchangedand pass. 

Looking at the code again, while it does indeed preserve the current
behaviour, it doesn't actually improve the behavior for summarizing
indexes when that would be expected.

Example:

CREATE INDEX hotblocking ON mytab USING btree((att1->'data'));
CREATE INDEX summarizing ON mytab USING BRIN(att2);
UPDATE mytab SET att1 = att1 || '{"check": "mate"}';

In v3 (same code present in v4), I notice that in the above case we
hit the "indexed attribute updated" path (hotblocking indeed indexes
the updated attribute att1), go into ExecIndexesRequiringUpdates, and
mark index 'summarizing' as 'needs an update', even though no
attribute of that index has a new value. Then we notice that
att1->'data' hasn't changed, and so we don't need to update the
'hotblocking' index, but we do update the (unchanged) 'summarizing'
index.

This indicates that in practice (with this version of the patch) this
will improve the HOT applicability situation while summarizing indexes
don't really gain a benefit from this - they're always updated when
any indexed column is updated, even if we could detect that there were
no changes to any indexed values.

Actually, you could say we find ourselves in the counter-intuitive
situation that the addition of the 'hotblocking' index whose value
were not updated now caused index insertions into summarizing indexes.

Kind regards,

Matthias van de Meent
Neon (https://neon.tech)



Re: Expanding HOT updates for expression and partial indexes

From
"Burd, Greg"
Date:
Matthias,

Thanks for the in-depth review, you are correct and I appreciate you uncovering that oversight with summarizing
indexes. I’ll add a test case and modify the logic to prevent updates to unchanged summarizing indexes by testing their
attributesagainst the modified set while keeping the HOT optimization when only summarizing indexes are changed.
 

thanks for finding this,

-greg


> On Feb 11, 2025, at 4:40 PM, Matthias van de Meent <boekewurm+postgres@gmail.com> wrote:
> 
> On Mon, 10 Feb 2025 at 19:15, Burd, Greg <gregburd@amazon.com> wrote:
>> 
>> Apologies for not being clear, this preserves the current behavior for summarizing indexes allowing for HOT updates
whilealso updating the index.  No degradation here that I’m aware of, indeed the tests that ensure that behavior are
unchangedand pass.
 
> 
> Looking at the code again, while it does indeed preserve the current
> behaviour, it doesn't actually improve the behavior for summarizing
> indexes when that would be expected.
> 
> Example:
> 
> CREATE INDEX hotblocking ON mytab USING btree((att1->'data'));
> CREATE INDEX summarizing ON mytab USING BRIN(att2);
> UPDATE mytab SET att1 = att1 || '{"check": "mate"}';
> 
> In v3 (same code present in v4), I notice that in the above case we
> hit the "indexed attribute updated" path (hotblocking indeed indexes
> the updated attribute att1), go into ExecIndexesRequiringUpdates, and
> mark index 'summarizing' as 'needs an update', even though no
> attribute of that index has a new value. Then we notice that
> att1->'data' hasn't changed, and so we don't need to update the
> 'hotblocking' index, but we do update the (unchanged) 'summarizing'
> index.
> 
> This indicates that in practice (with this version of the patch) this
> will improve the HOT applicability situation while summarizing indexes
> don't really gain a benefit from this - they're always updated when
> any indexed column is updated, even if we could detect that there were
> no changes to any indexed values.
> 
> Actually, you could say we find ourselves in the counter-intuitive
> situation that the addition of the 'hotblocking' index whose value
> were not updated now caused index insertions into summarizing indexes.
> 
> Kind regards,
> 
> Matthias van de Meent
> Neon (https://neon.tech)


Re: Expanding HOT updates for expression and partial indexes

From
Matthias van de Meent
Date:
On Thu, 13 Feb 2025 at 19:46, Burd, Greg <gregburd@amazon.com> wrote:
>
> Attached find an updated patchset v5 that is an evolution of v4.
>
> Changes v4 to v5 are:
> * replaced GUC with table reloption called "expression_checks" (open to other name ideas)
> * minimal documentation updates to README.HOT to address changes
> * avoid, when possible, the expensive path that requires evaluating an estate using bitmaps
> * determines the set of summarized indexes requiring updates, only updates those
> * more tests in heap_hot_updates.sql (perhaps too many...)
> * rebased to master, formatted, and make check-world passes

Thank you for the update. Below some comments, in no particular order.

-----

I'm not a fan of how you replaced TU_UpdateIndexes with a bitmap. It
seems unergonomic and a waste of performance.

In HEAD, we don't do any expensive computations for the fast path of
"indexed attribute updated" - at most we do the bitmap compare and
then set a pointer. With this patch, the way we signal that is by
allocating a bitmap of size O(n_indexes). That's potentially quite
expensive (given 1000s of indexes), and definitely more expensive than
only a pointer assignment.
In HEAD, we have a clear indication of which classes of indexes to
update, with TU_UpdateIndexes. With this patch, we have to derive that
from the (lack of) bits in the bitmap that might be output by the
table_update procedure.

I think we can do with an additional parameter for which indexes would
be updated (or store that info in the parameter which also will hold
EState et al). I think it's cheaper that way, too - only when
update_indexes could be TU_SUMMARIZING we might need the exact
information for which indexes to insert new tuples into, and it only
really needs to be sized to the number of summarizing indexes (usually
small/nonexistent, but potentially huge).

-----

I think your patch design came from trying to include at least two
distinct optimizations:
  1) to make the HOT-or-not check include whether the expressions of
indexes were updated, and
  2) to only insert index tuples into indexes that got updated values
when table_tuple_update returns update_indexes=TU_Summarizing.

While they touch similar code (clearly seen here), I think those
should be implemented in different patches. For (1), the current API
surface is good enough when the EState is passed down. For (2), you'll
indeed also need an additional argument we can use to fill with the
right summarizing indexes, but I don't think that can nor should
replace the function of TU_UpdateIndexes.

If you agree with my observation of those being distinct
optimizations, could you split this patch into parts (but still within
the same series) so that these are separately reviewable?

-----

I notice that ExecIndexesRequiringUpdates() does work on all indexes,
rather than just indexes relevant to this exact phase of checking. I
think that is a waste of time, so if we sort the indexes in order of
[hotblocking without expressions, hotblocking with expressions,
summarizing], then (with stored start/end indexes) we can save time in
cases where there are comparatively few of the types we're not going
to look at.

As an extreme example: we shouldn't do the (comparatively) expensive
work evaluating expressions to determine which of 1000s of summarizing
indexes has been updated when we're still not sure if we can apply HOT
at all.

(Sidenote: Though, arguably, we could be smarter by skipping index
insertions into unmodified summarizing indexes altogether regardless
of HOT status, as long as the update is on the same page - but that's
getting ahead of ourselves and not relevant to this discussion.)

-----

I noticed you've disabled any passing of "HOT or not" in the
simple_update cases, and have done away with the various checks that
are in place to prevent corruption. I don't think that's a great idea,
it's quite likely to cause bugs.

-----

You're extracting type info from the opclass, to use in
datum_image_eq(). Couldn't you instead use the index relation's
TupleDesc and its stored attribute information instead? That saves us
from having to do further catalog lookups during execution. I'm also
fairly sure that that information is supposed to be a more accurate
representation of attributes' expression output types than the
opclass' type information (though, they probably should match).

-----

The operations applied in ExecIndexesRequiringUpdates partially
duplicate those done in index_unchanged_by_update. Can we (partially)
unify this, and pass which indexes were updated through the IndexInfo,
rather than the current bitmap?

-----

I don't see a good reason to add IndexInfo to Relation, by way of
rd_indexInfoList. It seems like an ad-hoc way of passing data around,
and I don't think that's the right way.





>>>> I mean, it's clear that "updated indexed column's value == non-HOT
>>>> update". And that to determine whether an updated *projected* column's
>>>> value (i.e., expression index column's value) was actually updated we
>>>> need to calculate the previous and current index value, thus execute
>>>> the projection twice. But why would we have significant additional
>>>> overhead if there are no expression indexes, or when we can know by
>>>> bitmap overlap that the only interesting cases are summarizing
>>>> indexes?
>>
>> See the attached approach. Evaluation of the expressions only has to
>> happen if there are any HOT-blocking attributes which are exclusively
>> hot-blockingly indexed through expressions, so if the updated
>> attribute numbers are a subset of hotblockingexprattrs. (substitute
>> hotblocking with summarizing for the summarizing approach)
>
> I believe I've incorporated the gist of your idea in this v5 patch, let me know if I missed something.

Seems about accurate.

>>>> I would've implemented this with (1) two new bitmaps, one each for
>>>> normal and summarizing indexes, each containing which columns are
>>>> exclusively used in expression indexes (and which should thus be used
>>>> to trigger the (comparatively) expensive recalculation).
>>>
>>> That was one where I started, over time that became harder to work as the bitmaps contain the union of index
attributesfor the table not per-column. 
>>
>> I think it's fairly easy to create, though.
>>
>>> Now there is one bitmap to cover the broadest case and then a function to find the modified set of indexes where
eachis examined against bitmaps that contain only attributes specific to the index in question.  This helped in cases
wherethere were both expression and non-expression indexes on the same attribute. 
>>
>> Fair, but do we care about one expression index on (attr1->>'data')'s
>> value *not* changing when an index on (attr1) exists and attr1 has
>> changed? That index on att1 would block HOT updates regardless of the
>> (lack of) changes to the (att1->>'data') index, so doing those
>> expensive calculations seems quite wasteful.
>
> Agreed, when both a non-expression and an expression index exist on the same attribute then the expression checks are
unnecessaryand should be avoided.  In this v5 patchset this case becomes two checks of bitmaps (first hot_attrs, then
exclusivelyexp_attrs) before proceeding with a non-HOT update. 



> > So, in my opinion, we should also keep track of those attributes only
> > included in expressions of indexes, and that's fairly easy: see
> > attached prototype.diff.txt (might need some work, the patch was
> > drafted on v16's codebase, but the idea is clear).
>
> Thank you for your patch, I've included and expanded it.