Re: Expanding HOT updates for expression and partial indexes - Mailing list pgsql-hackers

From Nathan Bossart
Subject Re: Expanding HOT updates for expression and partial indexes
Date
Msg-id Z6qJydc0BNF8AGPt@nathan
In response to Re: Expanding HOT updates for expression and partial indexes  (Matthias van de Meent <boekewurm+postgres@gmail.com>)
Responses Re: Expanding HOT updates for expression and partial indexes
List pgsql-hackers
On Mon, Feb 10, 2025 at 06:17:42PM +0100, Matthias van de Meent wrote:
> I have serious doubts about the viability of any proposal working to
> implement PHOT/WARM in PostgreSQL, as they seem to have an inherent
> nature of fundamentally breaking the TID lifecycle:
> We won't be able to clean up dead-to-everyone TIDs that were
> PHOT-updated, because some index Y may still rely on it, and we can't
> remove the TID from that same index Y because there is still a live
> PHOT/WARM tuple later in the chain whose values for that index haven't
> changed since that dead-to-everyone tuple, and thus this PHOT/WARM
> tuple is the one pointed to by that index.
> For HOT, this isn't much of an issue, because there is just one TID
> that's impacted (and it only occupies a single LP slot, with
> LP_REDIRECT). However, with PHOT/WARM, you'd relatively easily be able
> to fill a page with TIDs (or even full tuples) you can't clean up with
> VACUUM until the moment the PHOT/WARM/HOT chain is broken (due to
> UPDATE leaving the page or the final entry getting DELETE-d).
> 
> Unless we are somehow able to replace the TIDs in indexes from
> "intermediate dead PHOT" to "base TID"/"latest TID" (either of which
> is probably also problematic for indexes that expect a TID to appear
> exactly once in the index at any point in time) I don't think the
> system is viable if we maintain only a single data structure to
> contain all dead TIDs. If we had a datastore for dead items per index,
> that'd be more likely to work, but it also would significantly
> increase the memory overhead of vacuuming tables.

I share your concerns, but I don't think things are as dire as you suggest.
For example, perhaps we put a limit on how long a PHOT chain can be, or
maybe we try to detect update patterns that don't work well with PHOT.
Another option could be to allow PHOT updates only when the same set of
indexed columns is updated or when <50% of the indexed columns are
updated.  These aren't fully fleshed-out ideas, of course, but I am at
least somewhat optimistic we could find appropriate trade-offs.
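
To make the trade-offs above a bit more concrete, here is a rough toy model
of how such a decision could be combined. It is only a sketch and not
PostgreSQL code: the names (UpdateContext, choose_update_kind), the chain
limit of 8, and the way the "same set" rule is tracked are all assumptions
made for illustration. It also ignores the usual requirement that the new
tuple fit on the same page.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional

# Hypothetical tuning knobs; neither exists in PostgreSQL.
MAX_PHOT_CHAIN_LEN = 8        # assumed cap on PHOT chain length
MAX_UPDATED_FRACTION = 0.5    # the "<50% of the indexed columns" rule


@dataclass(frozen=True)
class UpdateContext:
    """Toy description of one UPDATE (illustrative, not a real struct)."""
    indexed_cols: FrozenSet[str]   # columns covered by at least one index
    updated_cols: FrozenSet[str]   # columns whose values change
    # Indexed columns changed by the previous PHOT update in this chain,
    # or None if the chain has no PHOT update yet.
    prev_changed: Optional[FrozenSet[str]]
    chain_len: int                 # current length of the update chain


def choose_update_kind(ctx: UpdateContext) -> str:
    """Return "HOT", "PHOT", or "NON_HOT" for this update."""
    changed = ctx.updated_cols & ctx.indexed_cols

    # No indexed column changed: an ordinary HOT update is possible.
    if not changed:
        return "HOT"

    # Cap the chain so dead-but-unremovable TIDs cannot pile up forever.
    if ctx.chain_len >= MAX_PHOT_CHAIN_LEN:
        return "NON_HOT"

    # Allow PHOT if the same indexed columns changed as last time ...
    same_set = ctx.prev_changed is not None and changed == ctx.prev_changed
    # ... or if fewer than half of the indexed columns changed.
    small_fraction = len(changed) / len(ctx.indexed_cols) < MAX_UPDATED_FRACTION

    return "PHOT" if (same_set or small_fraction) else "NON_HOT"
```

With four indexed columns, an update touching one of them would be treated
as PHOT under this model, while one touching three of them would fall back
to a regular update unless the previous PHOT update in the chain changed
exactly the same three columns.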

-- 
nathan


