Re: Expanding HOT updates for expression and partial indexes - Mailing list pgsql-hackers

From Matthias van de Meent
Subject Re: Expanding HOT updates for expression and partial indexes
Date
Msg-id CAEze2WjJqjyJk-7XBtiYBovJMg5tLZkS5MGii56oCOa1Z+MKWQ@mail.gmail.com
Whole thread Raw
In response to Re: Expanding HOT updates for expression and partial indexes  ("Burd, Greg" <gregburd@amazon.com>)
List pgsql-hackers
On Wed, 5 Mar 2025 at 18:21, Burd, Greg <gregburd@amazon.com> wrote:
>
> Hello,
>
> I've rebased and updated the patch a bit.  The biggest change is that the performance penalty measured with v1 of
thispatch is essentially gone in v10.  The overhead was due to re-creating IndexInfo information unnecessarily, which I
foundexisted in the estate.  I've added a few fields in IndexInfo that are not populated by default but necessary when
checkingexpression indexes, those fields are populated on demand and only once limiting their overhead. 

This review is based on a light reading of patch v10. I have not read
all 90kB, and am unlikely to finish a full review soon:

> * assumes estate->es_result_relations[0] is the ResultRelInfo being updated

I'm not sure that's a valid assumption. I suspect it might be false in
cases of nested updates, like

$ UPDATE table1 SET value = other.value FROM (UPDATE table2 SET value
= 2 ) other WHERE other.id = table1.id;

If this table1 or table2 has expression indexes I suspect it may
result in this assertion failing (but I haven't spun up a server with
the patch).
Alternatively, please also check that it doesn't break if any of these
two tables is partitioned with multiple partitions (and/or has
expression indexes, etc.).

> * uses ri_IndexRelationInfo[] from within estate rather than re-creating it

As I mentioned above, I think it's safer to pass the known-correct RRI
(known by callers of table_tuple_update) down the stack.

> * augments IndexInfo only when needed for testing expressions and only once

ExecExpressionIndexesUpdated seems to always loop over all indexes,
always calling AttributeIndexInfo which always updates the fields in
the IndexInfo when the index has only !byval attributes (e.g. text,
json, or other such varlena types). You say it happens only once, have
I missed something?

I'm also somewhat concerned about the use of typecache lookups on
index->rd_opcintype[i], rather than using
TupleDescCompactAttr(index->rd_att, i); the latter of which I think
should be faster, especially when multiple wide indexes are scanned
with various column types. In hot loops of single-tuple update
statements I think this may make a few 0.1%pt difference - not a lot,
but worth considering.

> * only creates a local old/new TupleTableSlot when not present in estate

I'm not sure it's safe for us to touch that RRI's tupleslots.

> * retains existing summarized index HOT update logic

Great, thanks!

Kind regards,

Matthias van de Meent
Neon (https://neon.tech)



pgsql-hackers by date:

Previous
From: Jeff Davis
Date:
Subject: Re: Update Unicode data to Unicode 16.0.0
Next
From: Melanie Plageman
Date:
Subject: Re: Log connection establishment timings