Re: Expanding HOT updates for expression and partial indexes - Mailing list pgsql-hackers
From | Greg Burd |
---|---|
Subject | Re: Expanding HOT updates for expression and partial indexes |
Date | |
Msg-id | CF007F8C-5FCA-4FC5-A7EC-282F8C9D413C@greg.burd.me Whole thread Raw |
List | pgsql-hackers |
On Jul 2 2025, at 2:10 pm, Greg Burd <greg@burd.me> wrote: > The goal is to allow HOT updates under two new conditions: > * when an indexed expression has not changed > * when possible for a partial index This is still true. :) This patch has languished for nearly 2+ months at v17. Why? Primarily due to feedback that although the idea had merit, there was a critical flaw in the approach that made it a non-starter. The flaw was that I'd been executing expressions while holding both the pin and a lock on the buffer, which is not a great idea (self dead lock, etc.). This was pointed out to me (thanks Robert Haas!) and so I needed to re-think my approach. I put the patch aside for a while, then this past week at PGConf.dev/NYC I heard interest from a few people (Jeff Davis, Nathan Bossart) who encouraged me to move the code executing the expressions to just before acquiring the lock but after pinning the buffer. The theory being that my new code using the old/new tts to form and test the index tuples resulting from executing expressions was using the resultsRelInfo struct created during plan execution, not the information found on the page, and so was safe without the lock. This proved tricky because I had been using the modified_attrs and expr_attrs as a test to avoid exercising expressions when unnecessary. Calling HeapDetermineColumnsInfo() outside the buffer lock to get modified_attrs proved to be a problem as it examines an oldtup that is cobbled together from the elements on the page, requiring the lock I was trying to avoid. After reviewing how updates work in the executor, I discovered that during execution the new tuple slot is populated with the information from ExecBuildUpdateProjection() and the old tuple, but that most importantly for this use case that function created a bitmap of the modified columns (the columns specified in the update). This bitmap isn't the same as the one produced by HeapDetermineColumnsInfo() as the latter excludes attributes that are not changed after testing equality with the helper function heap_attr_equals() where as the former will include attributes that appear in the update but are the same value as before. This, happily, is immaterial for the purposes of my function ExecExprIndexesRequireUpdates() which simply needs to check to see if index tuples generated are unchanged. So I had all I needed to run the checks ahead of acquiring the lock on the buffer. So, this led to v18 (attached), which passes test wold including a number of new tests for the various corner cases relative to HOT updates for expressions. There is much room for improvement, and your suggestions are welcome. I'll find time to quantify the benefit of this patch for the targeted use cases and to ensure that all other cases see no regressions. I need to review the tests I've added to ensure that they are the minimal set required, that they communicate effectively their purpose, etc. For now, it's more shotgun than scalpel, .... I'll get to it. I added a reloption "expression_checks" to disable this new code path. Good idea or bad precedent? In execIndexing I special case for IsolationIsSerializable() and I can't remember why now but I do recall one isolation test failing... I'll check on this and get back to the thread. Or maybe you know why that From small things like naming to larger questions like hijacking the modified columns bitmapset for use later in the heap, I need feedback and guidance. I'm using UpdateContext for estate/resultRelInfo and I've added to that lockmode, these change the table am API a tad, I'm open to better ideas that accomplish the same. I'd like not to build, then rebuild index tuples for these expressions but I can't think of a way to do that without a palloc(), this is avoided today. Example: CREATE TABLE t ( x INT GENERATED ALWAYS AS IDENTITY PRIMARY KEY, y JSONB ); CREATE INDEX t_age_idx ON t (((y->>'age')::int)); INSERT INTO t (y) VALUES ('{"name": "John", "age": 30, "city": "New York"}'); -- Update an indexed field in the JSONB document, should not be HOT UPDATE t SET y = jsonb_set(y, '{age}', '31') WHERE x = 1; SELECT pg_stat_force_next_flush(); SELECT relname, n_tup_upd, n_tup_hot_upd FROM pg_stat_user_tables WHERE relname = 't'; -- Update a non-indexed field in the JSONB document, new HOT update UPDATE t SET y = jsonb_set(y, '{city}', '"Boston"') WHERE x = 1; SELECT pg_stat_force_next_flush(); SELECT relname, n_tup_upd, n_tup_hot_upd FROM pg_stat_user_tables WHERE relname = 't'; This patch is far from "commit ready", but it does accomplish the $subject and pass tests. I look forward to any/all constructive feedback. best. -greg
Attachment
pgsql-hackers by date: