Re: Expanding HOT updates for expression and partial indexes - Mailing list pgsql-hackers
| From | Greg Burd |
|---|---|
| Subject | Re: Expanding HOT updates for expression and partial indexes |
| Date | |
| Msg-id | 9bb9bdd6-e1fe-48fe-837d-4d0289396f1c@app.fastmail.com Whole thread Raw |
| In response to | Re: Expanding HOT updates for expression and partial indexes (Jeff Davis <pgsql@j-davis.com>) |
| Responses |
Re: Expanding HOT updates for expression and partial indexes
|
| List | pgsql-hackers |
On Wed, Feb 25, 2026, at 4:03 PM, Jeff Davis wrote:
> On Mon, 2026-02-23 at 14:23 -0500, Greg Burd wrote:
>> Hello.
>>
>> Attached is a new patch set that fixes a few issues identified in the
>> last set.
>>
>> 0001 - creates a new way to identify the set of attributes both
>> modified by the update and referenced by one or more indexes on the
>> target relation being updated. This patch keeps the
>> HeapDetermineColumnsInfo() path within heap_update() for calls from
>> simple_heap_update() when modified_attrs_valid is set to false. I'm
>> not a huge fan of this, but it does serve as a way to illustrate a
>> minimal set of changes easing review a bit.
>>
>> 0002 - splits out the top portion of heap_update() into both
>> heapam_tuple_update() and simple_heap_update(), adds a few helper
>> functions and tries to reduce repeated code. The goal here was to
>> remove some of the mess related to the various bitmaps used to make
>> decisions during the update.
>
> IIUC, a minimal version of this patch set might be:
>
> * add 'mix_attrs' bitmap to API for table_tuple_update
> * have executor calculate the bitmap, using the old slot to see if
> expression results have changed
> * have simple_heap_update calculate the bitmap using heap_fetch to get
> the old tuple (would be a redundant pin, but not sure if that's a
> problem or not)
>
> And leave the rest mostly unchanged.
>
> Did I miss something? If not, it would be nice to see such a minimal
> patch and/or understand why we don't follow that approach.
Hey Jeff, thanks for sticking with me on this journey. :)
I think your approach makes sense, here's a summary of what's attached (v30) and at the bottom of this email are some
earlyperformance measurements.
* in the executor
* identify the mix_attrs
* one new argument to table_tuple_update( ..., mix_attrs, ...)
* heapam_tuple_update( ..., mix_attrs, ...)
* calculates hot_allowed using mix_attrs
* calculates lockmode using key_attrs and mix_attrs
* two new arguments to heap_update(..., mix_attrs, hot_allowed, ...)
* on return determines what to do with TU_UpdateIndexes
* heap_update( ..., mix_attrs, hot_allowed, ... )
* takes buffer lock
* calculates rep_id_key_req, passes that to ExtractReplicaId()
* if newbuf==buffer && hot_allowed -> HOT
* releases buffer lock
* simple_heap_update( ... no changes to API ... )
* now needs to compare old/new tuples *BEFORE* calling heap_update()
* uses heap_fetch() to turn otid -> oldtuple
* calls HeapUpdateModIdxAttrs()
* calculates lockmode
* calculates if hot is allowed
* calls into heap_update(..., mix_attrs, hot_allowed, ...)
* on return determines what to do with TU_UpdateIndexes
* renamed HeapDetermineColumnsInfo() to HeapUpdateModIdxAttrs()
* removed logic related to rep_id_key_req, that is in heap_update()
> Regards,
> Jeff Davis
There are a pair of functions now for finding "mix_attrs" that replace the singular HeapDetermineColumnsInfo()
function:
ExecUpdateModIdxAttrs()
HeapUpdateModIdxAttrs()
These do essentially the same thing, only with different available information and where the latter is called within
thecontext of a buffer lock.
In ExecUpdateModIdxAttrs() we compare two TupleTableSlots, the existing and the plan slot, using a new helper function
ExecCompareSlotAttrs(). This gives us the "mix_attrs" (modified indexed attributes) bitmap. In this function we have
theResultRelInfo and EState so it is possible to use the ExecGetAllUpdatedCols() function to potentially reduce the set
ofattributes we need to check for changes. The function only reviews indexed attributes that also exist in that set,
whichled to an interesting discovery... see below.
In HeapUpdateModIdxAttrs() we start with an old TID and a HeapTuple, so first we need to fetch that old HeapTuple so we
cancompare old/new datum and find any modified indexed attributes.
A new function HeapUpdateHotAllowable() is used in heapam_tuple_update() and simple_heap_update() encapsulating that
logicin one place including the "only summarized" test. Heap will use the HOT path if that function returns true and
thetuple fits on the same buffer page. No logic changes, just moved the decision making around a bit.
A new function HeapUpdateDetermineLockMode() is used to choose exclusive or shared lock mode ahead of calling into
heap_update(). Again, same logic as before.
It turns out that ExecGetAllUpdatedCols() doesn't get all updated columns as the name advertises. It finds all the
attributes(columns) that were mentioned in the UPDATE statement or any triggers that will fire during the update, but
thatoverlooks any attributes changed within before-row triggers that invoke functions which call heap_modify_tuple().
Thishappens when tsvector_update_trigger() is called in tsearch.sql, the code modifies an indexed attribute not
mentionedin the UPDATE or triggers. I've fixed this oversight and to me this makes sense, but tell me if you disagree.
generated_virtual.sql and updatable_views.sql had tests where the scan order of the tuples on the pages seems to now be
non-deterministic. I've updated those tests to ensure stability. AFAICT my changes in this patch should not change any
HOTdecision or any replica identity key WAL logging decision, but somehow they uncovered this instability. Or there is
abug, but I've not spotted that as yet. Feel free to point out the obvious if you do. :)
Just to be clear, this patch doesn't include any of $subject. In tests I've not measured any performance regressions,
andthat's not surprising as the sum total computational effort is nearly identical before/after the patch. Yes, the
patchmoves some computation outside the buffer lock on the heap page and that might open the door to more concurrency
orslightly different behavior when updates are highly contentious. There may be more occasions where TU_Updated is
returned,or some speed improvements when updating more than one row at a time.
My hope is to get this into a shape where we're comfortable with these changes and it can be committed even though none
of$subject is achieved because it does lay some ground work for those future HOT expanding and WARM/PHOT enabling ideas
I'vebeen working on.
Things on my TODO list, short term:
* Re-introduce the index AM's new function to allow indexes to play a role in when they require new index entries
* I'm not a fan of TU_UpdateIndexes, it's *very* heap-specific, I'd like to eliminate this
Longer term, so as to return to working on $subject:
* Allow types to indicate that they maintain "sub-attributes" that might be used to form index key datum
* Identify in the executor for each attribute SET if a) it has sub-attributes, and if so b) does the new value for the
attributechange any sub-attribute that is used to form index keys
* With the previous two ideas I think we can safely re-introduce HOT for expressions without re-evaluating the
expressionsand comparing index datum (read: without the overhead I've measured in the past)
* PHOT or WARM or <other nifty name here>, teach heap how to only update changed indexes (rather than the all or
nothingapproach we have today)
I look forward to community feedback.
best.
-greg
----------------- PERFORMANCE TEST RESULTS
DISCLAIMER: "claude" and I worked on the perf-cf5556-v30.sh script, as I'm sure is apparent. I think people call it
"vibecoding" when you try to contain the enthusiasm of your friendly LLM and direct it toward some goal. IME it's like
tryingto control a room full of dangerously knowledgeable and overly eager to please kindergarten-aged parrots. I
admitto needing more time to review the script, the test cases, and the results to fully explore these changes and
validatethat they actually measure something meaningful. If you find something silly or a glaring mistake, go easy on
me(and "claude") but do let me (us?) know.
$ ./perf-cf5556-v30.sh
Checking for running PostgreSQL instances...
✓ No other PostgreSQL instances running
╔════════════════════════════════════════════════════════════════════╗
║ CF-5556 PERFORMANCE TEST SUITE
╚════════════════════════════════════════════════════════════════════╝
Configuration:
Test duration : 60s
Clients / Jobs : 8 / 4
Results directory : /tmp/cf5556-perf-results/20260226_150623
Setup extensions : NO
Test extensions : NO
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
BUILDING AND TESTING
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Baseline: d0833fdae7e (origin/master)
Patches: 1 patch(es) to test cumulatively
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
VERSION: baseline (d0833fdae7e)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Building PostgreSQL...
✓ PostgreSQL built
Starting server...
✓ Server started
shared_preload_libraries: pg_stat_statements
Setting up test databases...
Creating driver_license table (100k rows, 5 BTREE indexes)...
✓ driver_license ready (100000 rows)
Creating t_jsonb table (10k rows, 3 BTREE expression indexes)...
✓ t_jsonb ready (10k rows)
Creating t_gin table (10k rows, GIN index — control)...
✓ t_gin ready (10k rows, GIN — control)
Running isolated tests (60s each)...
license_write_single TPS: 69816.475176 Lat: 0.115ms
jsonb_write_single TPS: 56571.238721 Lat: 0.141ms
jsonb_write_batch TPS: 3606.918699 Lat: 2.218ms
gin_write_single TPS: 64040.989986 Lat: 0.125ms
pgbench_tpcb-like TPS: 20133.238199 Lat: 0.397ms
pgbench_simple-update TPS: 19219.239741 Lat: 0.416ms
Running concurrent read/write tests...
Running concurrent test: 2 writers + 6 readers...
jsonb_2w_6r Write: 14776.018733 TPS Read: 80627.112253 TPS
Write: 0.135 ms Read: 0.074 ms
Running concurrent test: 4 writers + 4 readers...
jsonb_4w_4r Write: 29399.436764 TPS Read: 52688.734439 TPS
Write: 0.136 ms Read: 0.076 ms
Running concurrent test: 6 writers + 2 readers...
jsonb_6w_2r Write: 43151.037340 TPS Read: 25295.378828 TPS
Write: 0.139 ms Read: 0.079 ms
Running concurrent test: 2 writers + 6 readers...
license_2w_6r Write: 18891.968944 TPS Read: 74113.652071 TPS
Write: 0.106 ms Read: 0.081 ms
Running concurrent test: 4 writers + 4 readers...
license_4w_4r Write: 37305.015382 TPS Read: 48489.296785 TPS
Write: 0.107 ms Read: 0.082 ms
Running concurrent test: 6 writers + 2 readers...
license_6w_2r Write: 54403.687584 TPS Read: 23519.461792 TPS
Write: 0.110 ms Read: 0.085 ms
Stopping server...
✓ Server stopped
fatal: a branch named 'cf-5556-test-all-patches' already exists
Applying all 1 patches cumulatively...
Applying v20260226b-0001-Idenfity-modified-indexed-attributes-in-t.patch...
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
VERSION: patched (526c2a8733d)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Building PostgreSQL...
✓ PostgreSQL built
Starting server...
✓ Server started
shared_preload_libraries: pg_stat_statements
Setting up test databases...
Creating driver_license table (100k rows, 5 BTREE indexes)...
✓ driver_license ready (100000 rows)
Creating t_jsonb table (10k rows, 3 BTREE expression indexes)...
✓ t_jsonb ready (10k rows)
Creating t_gin table (10k rows, GIN index — control)...
✓ t_gin ready (10k rows, GIN — control)
Running isolated tests (60s each)...
license_write_single TPS: 70093.895595 Lat: 0.114ms
jsonb_write_single TPS: 56751.907107 Lat: 0.141ms
jsonb_write_batch TPS: 4141.086856 Lat: 1.932ms
gin_write_single TPS: 63845.491951 Lat: 0.125ms
pgbench_tpcb-like TPS: 19911.229480 Lat: 0.402ms
pgbench_simple-update TPS: 19840.566625 Lat: 0.403ms
Running concurrent read/write tests...
Running concurrent test: 2 writers + 6 readers...
jsonb_2w_6r Write: 14821.571968 TPS Read: 81057.390470 TPS
Write: 0.135 ms Read: 0.074 ms
Running concurrent test: 4 writers + 4 readers...
jsonb_4w_4r Write: 29428.063408 TPS Read: 52533.626129 TPS
Write: 0.136 ms Read: 0.076 ms
Running concurrent test: 6 writers + 2 readers...
jsonb_6w_2r Write: 43204.958598 TPS Read: 25301.939523 TPS
Write: 0.139 ms Read: 0.079 ms
Running concurrent test: 2 writers + 6 readers...
license_2w_6r Write: 18958.924353 TPS Read: 74548.095482 TPS
Write: 0.105 ms Read: 0.080 ms
Running concurrent test: 4 writers + 4 readers...
license_4w_4r Write: 37185.369146 TPS Read: 48580.299936 TPS
Write: 0.108 ms Read: 0.082 ms
Running concurrent test: 6 writers + 2 readers...
license_6w_2r Write: 54461.228141 TPS Read: 23692.388873 TPS
Write: 0.110 ms Read: 0.084 ms
Stopping server...
✓ Server stopped
╔════════════════════════════════════════════════════════════════════╗
║ RESULTS SUMMARY
╚════════════════════════════════════════════════════════════════════╝
═══════════════════════════════════════════════════════════════════════════════════
ISOLATED WORKLOAD COMPARISON (Patched vs Baseline)
═══════════════════════════════════════════════════════════════════════════════════
Table Workload Baseline TPS Patched TPS Δ%
───────────────────────────────────────────────────────────────────────────────────
gin write_single 64041.0 63845.5 -0.3%
jsonb write_batch 3606.9 4141.1 +14.8%
jsonb write_single 56571.2 56751.9 +0.3%
license write_single 69816.5 70093.9 +0.4%
pgbench simple-update 19219.2 19840.6 +3.2%
pgbench tpcb-like 20133.2 19911.2 -1.1%
───────────────────────────────────────────────────────────────────────────────────
═══════════════════════════════════════════════════════════════════════════════════
CONCURRENT WORKLOAD ANALYSIS (Write Pressure Impact on Reads)
═══════════════════════════════════════════════════════════════════════════════════
Table Write:Read Base Write Patch Write Base Read Patch Read
───────────────────────────────────────────────────────────────────────────────────
jsonb 2w_6r 14776.0 14821.6 80627.1 81057.4
license 2w_6r 18892.0 18958.9 74113.7 74548.1
jsonb 4w_4r 29399.4 29428.1 52688.7 52533.6
license 4w_4r 37305.0 37185.4 48489.3 48580.3
jsonb 6w_2r 43151.0 43205.0 25295.4 25301.9
license 6w_2r 54403.7 54461.2 23519.5 23692.4
───────────────────────────────────────────────────────────────────────────────────
Output files:
/tmp/cf5556-perf-results/20260226_150623/results.txt (raw results)
/tmp/cf5556-perf-results/20260226_150623/*_server.log (server startup/error logs)
/tmp/cf5556-perf-results/20260226_150623/*_setup.log (database setup logs)
/tmp/cf5556-perf-results/20260226_150623/*_build.log (build logs)
/tmp/cf5556-perf-results/20260226_150623/*_*.txt (pgbench output)
/tmp/cf5556-perf-results/20260226_150623/*_*.sql (test queries)
Cleaning up...
✓ Cleanup complete
Attachment
pgsql-hackers by date: