On Wed, Sep 9, 2020 at 11:02 AM Andres Freund <andres@anarazel.de> wrote:
> Obviously lookup in such a more complicated structure isn't free. Nor is
> building it. So we'd need some heuristics about when to do so. It'd
> probably be OK to occasionally look at the age of the oldest in-progress
> statement, to infer the time for old snapshots based on that. And then
> we could have a GUC that says when it's worth doing more complicated
> lookups.
>
> I don't have a handle on how to deal with the ctid chaining for
> intermediate row versions.

I wonder if it makes sense to add the feature to nbtree first. This
side-steps the practical problem of having to figure out ctid
chaining. You can prove the idea in a relatively low-risk way, while
still creating significant value for users.

We can reason about versioning within indexes without having
authoritative information about it close at hand. The trick is to
delay everything until it looks like we'll have to split the page. I
can see very substantial reductions in the index bloat caused by
non-HOT updates with the deduplication-deletion patch I've been
working on:
https://postgr.es/m/CAH2-WznjQQwSPSxBiZ6wQyBiKqFmfdjOdeqp28otqw551T7jcQ@mail.gmail.com
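
To make the "delay until split" idea concrete, here is a toy
standalone model. This is only a sketch: FakeLeafPage, prune_dead,
and friends are names I made up for illustration, not real nbtree
structures. The point is just that the cleanup pass runs only at the
moment where the sole remaining alternative is a page split:

/*
 * Toy standalone model of "delay cleanup until a split is imminent".
 * None of these types are real nbtree structures.
 */
#include <stdbool.h>
#include <stdio.h>

#define LEAF_CAPACITY 8

typedef struct
{
    int  key;       /* indexed value */
    int  heap_tid;  /* stand-in for an ItemPointer */
    bool dead;      /* version already known dead (think LP_DEAD hints) */
} FakeIndexTuple;

typedef struct
{
    FakeIndexTuple items[LEAF_CAPACITY];
    int            nitems;
} FakeLeafPage;

/* Remove tuples already known dead; returns the number removed */
static int
prune_dead(FakeLeafPage *page)
{
    int j = 0, removed = 0;

    for (int i = 0; i < page->nitems; i++)
    {
        if (page->items[i].dead)
            removed++;
        else
            page->items[j++] = page->items[i];
    }
    page->nitems = j;
    return removed;
}

/*
 * Insert one tuple.  The cleanup pass runs only when the page is full,
 * i.e. when the only alternative is a page split.  Returns false when
 * a split really would be needed.
 */
static bool
insert_tuple(FakeLeafPage *page, FakeIndexTuple tup)
{
    if (page->nitems == LEAF_CAPACITY && prune_dead(page) == 0)
        return false;
    page->items[page->nitems++] = tup;
    return true;
}

int
main(void)
{
    FakeLeafPage page = { .nitems = 0 };

    /* Fill the page with successive versions of two logical rows;
     * the six oldest versions are already known dead */
    for (int i = 0; i < LEAF_CAPACITY; i++)
        insert_tuple(&page, (FakeIndexTuple) {.key = i % 2,
                                              .heap_tid = i,
                                              .dead = (i < 6)});

    /* This insert would split the page, except that pruning frees space */
    if (insert_tuple(&page, (FakeIndexTuple) {.key = 1, .heap_tid = 99,
                                              .dead = false}))
        printf("split avoided; %d items remain\n", page.nitems);
    return 0;
}
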
Importantly, the patch tends to bound the number of distinct index
entries *per logical row* in the presence of non-HOT updates (at least
for indexes that are not logically modified by the update). ISTM that
this is what really matters. The patch doesn't just reduce index bloat
-- it greatly caps the number of heap pages that any given primary key
point lookup query will have to visit. So in practice we eagerly kill
precisely the garbage index tuples that would have caused us the most
trouble without the new mechanism in place.
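
Here is another toy model of that point (again, the structures are
invented for illustration, not PostgreSQL code): each index entry for
a key can force a heap page visit during a point lookup, so the cost
of the lookup tracks the entry count per logical row, not the logical
row count:

/*
 * Toy model of why bounding index entries per logical row matters: a
 * point lookup must consider one heap visit per matching index entry,
 * so lookup cost tracks the entry count, not the logical row count.
 */
#include <stdio.h>

typedef struct
{
    int key;        /* indexed value */
    int heap_block; /* heap page this entry points at */
} FakeIndexEntry;

/* Count the heap visits an index scan on one key has to make */
static int
point_lookup_heap_visits(const FakeIndexEntry *entries, int n, int key)
{
    int visits = 0;

    for (int i = 0; i < n; i++)
        if (entries[i].key == key)
            visits++;   /* each matching entry forces a heap visit */
    return visits;
}

int
main(void)
{
    /* One logical row, five versions scattered across heap pages */
    FakeIndexEntry bloated[] = {{1, 10}, {1, 11}, {1, 12}, {1, 13}, {1, 14}};
    /* Same logical row once the old versions have been deleted */
    FakeIndexEntry bounded[] = {{1, 14}};

    printf("heap visits, bloated index: %d\n",
           point_lookup_heap_visits(bloated, 5, 1));
    printf("heap visits, bounded index: %d\n",
           point_lookup_heap_visits(bounded, 1, 1));
    return 0;
}
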
Avoiding a page split is a big win, so we can probably justify
relatively expensive lookup calls to make this work. This seems
especially likely to be true because we can probably ratchet up to
the expensive approach only when we see that it will win. If a unique
index has 10 entries for one value (10 versions), which is inevitable
once HOT pruning fails, then how many of those 10 do we really need?
The answer is perhaps just 2 or 3, maybe 99%+ of the time. I strongly
suspect that
preventing page splits is more valuable than opportunistic heap
pruning (unless the pruning itself prevents page splits, which it may
or may not). Logically unnecessary page splits have obvious lasting
consequences, tend to be highly correlated with one another, and
affect performance in ways that are non-linear and likely very harmful
in the real world.
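
One last toy model, of the "how many of those 10 do we really need?"
intuition. The xmax and oldest_active_xmin fields below are simplified
stand-ins for real visibility information, not the actual rules; the
point is just that only the versions some still-running snapshot could
need have to survive:

/*
 * Toy model: out of several index entries for one logical row, only
 * the versions some snapshot might still see need to be kept.
 */
#include <stdbool.h>
#include <stdio.h>

typedef struct
{
    int key;
    int xmax;   /* 0 = current version; else XID that superseded it */
} FakeVersion;

/* A version is garbage once no snapshot that could see it remains */
static bool
version_is_garbage(FakeVersion v, int oldest_active_xmin)
{
    return v.xmax != 0 && v.xmax < oldest_active_xmin;
}

int
main(void)
{
    /* One logical row in a unique index, after five non-HOT updates */
    FakeVersion versions[] = {
        {42, 101}, {42, 102}, {42, 103}, {42, 104}, {42, 105}, {42, 0},
    };
    int oldest_active_xmin = 104;   /* oldest still-running snapshot */
    int nversions = sizeof(versions) / sizeof(versions[0]);
    int surviving = 0;

    for (int i = 0; i < nversions; i++)
        if (!version_is_garbage(versions[i], oldest_active_xmin))
            surviving++;

    /* Prints "3 of 6": just the versions a snapshot might still need */
    printf("index entries kept for the row: %d of %d\n",
           surviving, nversions);
    return 0;
}
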
--
Peter Geoghegan