On Mon, Dec 16, 2024 at 2:17 PM Peter Geoghegan <pg@bowt.ie> wrote:
> Maybe it's a good idea, but right now it poses a similar risk to my
> scenario involving a random, isolated SELECT FOR SHARE that happens to
> affect some random tuple on a cold/frozen page. Of course, this
> wouldn't be all that hard to fix.

BTW, one alternative to a "retry cleanup lock acquisition 3 times"
heuristic is a version of freezing/pruning that doesn't require a
cleanup lock at all.

The only reason that we require a cleanup lock is to make it safe to
defragment the page, to free tuple space. Nothing stops you from
inventing a variant of pruning/freezing that works just like regular
pruning/freezing, but without attempting to free up tuple space --
thus obviating the need for a cleanup lock. This process could still
remove dead tuples by (say) setting their xmin to InvalidTransactionId
-- that ought to still be safe.
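
To make the distinction concrete, here is a toy model (plain Python, not
PostgreSQL code). ToyTuple, ToyPage, prune_with_defrag and prune_in_place
are all made-up names for illustration, and the page layout is deliberately
simplified (tuples are packed from offset 0). The only thing it is meant to
show is that the in-place variant never moves tuple data, so it has no
reason to insist on being the sole pin holder:

```python
from dataclasses import dataclass
from typing import Callable, List

INVALID_XID = 0  # stand-in for InvalidTransactionId (assumption of this toy model)


@dataclass
class ToyTuple:
    xmin: int    # inserting transaction ID
    offset: int  # byte offset of the tuple's data within the page
    length: int  # size of the tuple's data in bytes


@dataclass
class ToyPage:
    tuples: List[ToyTuple]
    pin_count: int = 1  # pins on the buffer, including our own


def prune_with_defrag(page: ToyPage, is_dead: Callable[[ToyTuple], bool]) -> int:
    """Regular pruning: drop dead tuples, then compact the survivors.

    Compaction changes tuple offsets, so it is only allowed when nobody
    else holds a pin (the toy analogue of holding a cleanup lock).
    """
    if page.pin_count != 1:
        raise RuntimeError("cleanup lock unavailable: page is pinned by others")
    survivors = [t for t in page.tuples if not is_dead(t)]
    removed = len(page.tuples) - len(survivors)
    next_offset = 0
    for t in survivors:
        t.offset = next_offset  # tuple data moves: this is the unsafe part
        next_offset += t.length
    page.tuples = survivors
    return removed


def prune_in_place(page: ToyPage, is_dead: Callable[[ToyTuple], bool]) -> int:
    """Hypothetical variant: mark dead tuples without freeing their space.

    Only the xmin field is overwritten; offsets and lengths are untouched,
    so concurrent pin holders never see tuple data move.
    """
    marked = 0
    for t in page.tuples:
        if t.xmin != INVALID_XID and is_dead(t):
            t.xmin = INVALID_XID
            marked += 1
    return marked
```

In this sketch, prune_in_place succeeds on a page with pin_count > 1, where
prune_with_defrag refuses to run. The space held by the marked tuples stays
allocated until some later pass that does manage to get the cleanup lock.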

I don't think that problems with acquiring cleanup locks happen all
that often -- they're probably very rare. But when they do happen they
can be extremely serious -- they can even cause outages. And so
there's a lot to be said for a design that removes those sorts of
risks (or that at least significantly ameliorates them). It gives you
the freedom to come up with better high-level designs that don't have
to worry about not acquiring cleanup locks.

--
Peter Geoghegan