Re: Expanding HOT updates for expression and partial indexes - Mailing list pgsql-hackers

From Greg Burd
Subject Re: Expanding HOT updates for expression and partial indexes
Date
Msg-id CF007F8C-5FCA-4FC5-A7EC-282F8C9D413C@greg.burd.me
Whole thread Raw
List pgsql-hackers
On Jul 2 2025, at 2:10 pm, Greg Burd <greg@burd.me> wrote:

> The goal is to allow HOT updates under two new conditions:
> * when an indexed expression has not changed
> * when possible for a partial index

This is still true. :)

This patch has languished for nearly 2+ months at v17.  Why?  Primarily
due to feedback that although the idea had merit, there was a critical
flaw in the approach that made it a non-starter.  The flaw was that I'd
been executing expressions while holding both the pin and a lock on the
buffer, which is not a great idea (self dead lock, etc.). This was
pointed out to me (thanks Robert Haas!) and so I needed to re-think my approach.

I put the patch aside for a while, then this past week at PGConf.dev/NYC
I heard interest from a few people (Jeff Davis, Nathan Bossart) who
encouraged me to move the code executing the expressions to just before
acquiring the lock but after pinning the buffer. The theory being that
my new code using the old/new tts to form and test the index tuples
resulting from executing expressions was using the resultsRelInfo struct
created during plan execution, not the information found on the page,
and so was safe without the lock.

This proved tricky because I had been using the modified_attrs and
expr_attrs as a test to avoid exercising expressions when unnecessary. 
Calling HeapDetermineColumnsInfo() outside the buffer lock to get
modified_attrs proved to be a problem as it examines an oldtup that is
cobbled together from the elements on the page, requiring the lock I was
trying to avoid.

After reviewing how updates work in the executor, I discovered that
during execution the new tuple slot is populated with the information
from ExecBuildUpdateProjection() and the old tuple, but that most
importantly for this use case that function created a bitmap of the
modified columns (the columns specified in the update).  This bitmap
isn't the same as the one produced by HeapDetermineColumnsInfo() as the
latter excludes attributes that are not changed after testing equality
with the helper function heap_attr_equals() where as the former will
include attributes that appear in the update but are the same value as
before.  This, happily, is immaterial for the purposes of my function
ExecExprIndexesRequireUpdates() which simply needs to check to see if
index tuples generated are unchanged.  So I had all I needed to run the
checks ahead of acquiring the lock on the buffer.

So, this led to v18 (attached), which passes test wold including a
number of new tests for the various corner cases relative to HOT updates
for expressions.

There is much room for improvement, and your suggestions are welcome.

I'll find time to quantify the benefit of this patch for the targeted
use cases and to ensure that all other cases see no regressions.

I need to review the tests I've added to ensure that they are the
minimal set required, that they communicate effectively their purpose,
etc. For now, it's more shotgun than scalpel, .... I'll get to it.

I added a reloption "expression_checks" to disable this new code path. 
Good idea or bad precedent?

In execIndexing I special case for IsolationIsSerializable() and I can't
remember why now but I do recall one isolation test failing... I'll
check on this and get back to the thread.  Or maybe you know why that

From small things like naming to larger questions like hijacking the
modified columns bitmapset for use later in the heap, I need feedback
and guidance.

I'm using UpdateContext for estate/resultRelInfo and I've added to that
lockmode, these change the table am API a tad, I'm open to better ideas
that accomplish the same.

I'd like not to build, then rebuild index tuples for these expressions
but I can't think of a way to do that without a palloc(), this is
avoided today.


Example:

CREATE TABLE t (
    x INT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    y JSONB
);
CREATE INDEX t_age_idx ON t (((y->>'age')::int));

INSERT INTO t (y)
VALUES ('{"name": "John", "age": 30, "city": "New York"}');

-- Update an indexed field in the JSONB document, should not be HOT
UPDATE t
SET y = jsonb_set(y, '{age}', '31')
WHERE x = 1;

SELECT pg_stat_force_next_flush();
SELECT relname, n_tup_upd, n_tup_hot_upd FROM pg_stat_user_tables
WHERE relname = 't';

-- Update a non-indexed field in the JSONB document, new HOT update
UPDATE t
SET y = jsonb_set(y, '{city}', '"Boston"')
WHERE x = 1;

SELECT pg_stat_force_next_flush();
SELECT relname, n_tup_upd, n_tup_hot_upd FROM pg_stat_user_tables
WHERE relname = 't';


This patch is far from "commit ready", but it does accomplish the
$subject and pass tests.

I look forward to any/all constructive feedback.

best.

-greg

Attachment

pgsql-hackers by date:

Previous
From: "Matheus Alcantara"
Date:
Subject: Re: Optimize LISTEN/NOTIFY
Next
From: Peter Smith
Date:
Subject: Re: [PROPOSAL] Termination of Background Workers for ALTER/DROP DATABASE