Re: Disabling Heap-Only Tuples - Mailing list pgsql-hackers

From Robert Haas
Subject Re: Disabling Heap-Only Tuples
Date
Msg-id CA+TgmoYc7Y0DndOx1AcuzeMcOzLK6aTqX1V++0PsVLptiyA-Ug@mail.gmail.com
Whole thread Raw
In response to Re: Disabling Heap-Only Tuples  (Alvaro Herrera <alvherre@alvh.no-ip.org>)
Responses Re: Disabling Heap-Only Tuples
Re: Disabling Heap-Only Tuples
List pgsql-hackers
On Tue, Sep 19, 2023 at 12:30 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
> I was thinking something vaguely like "a table size that's roughly what
> an optimal autovacuuming schedule would leave the table at" assuming 0.2
> vacuum_scale_factor.  You would determine the absolute minimum size for
> the table given the current live tuples in the table, then add 20% to
> account for a steady state of dead tuples and vacuumed space.  So it's
> not 1.2x of the "current" table size at the time the local_update_limit
> feature is installed, but 1.2x of the optimal table size.

Right, that would be great. And honestly if that's something we can
figure out, then why does the parameter even need to be an integer
instead of a Boolean? If the system knows the optimal table size, then
the user can just say "try to compact this table" and need not say to
what size. The 1.2 multiplier is probably situation dependent and
maybe the multiplier should indeed be a configuration parameter, but
we would be way better off if the absolute size didn't need to be.

> This makes me think that maybe the logic needs to be a little more
> complex to avoid the problem you describe: if an UPDATE is prevented
> from being HOT because of this setting, but then it goes and consults
> FSM and it gives the update a higher block number than the tuple's
> current block (or it fails to give a block number at all so it is forced
> to extend the relation), then the update should give up on that strategy
> and use a HOT update after all.  (I have not read the actual patch;
> maybe it already does this?  It sounds kinda obvious.)

+1 to all of that. Anything we can do to reduce the chance of the
parameter doing the opposite of what it's intended to do is, IMHO,
really, really valuable. If you're in the situation where you really
need something like this, you're probably having a pretty bad day
already.

Just to be more clear about my position, I don't think that having
some kind of a feature along these lines is a bad idea. I do think
that this is one of those cases where the perfect is the enemy of the
good, and we can fall into the trap of saying that since we can't do
the perfect thing let's not do anything at all. At the same time, just
because we need to do something doesn't mean we should do exactly the
first thing that anybody thought up, or that we shouldn't try as hard
as we can to mitigate the downsides. If we add something like this I
bet it will get a lot of use. Even a minor improvement to the design
that removes one pitfall of many could turn out to help a lot of
people. If we could get to the point where most people have a positive
user experience without too much effort, this could turn out to be one
of the most impactful features in years.

> Having to set AEL is not nice for sure, but wouldn't
> ShareUpdateExclusiveLock be sufficient?  We have a bunch of reloptions
> for which that is sufficient.

Hmm, yeah, I think you're right.

--
Robert Haas
EDB: http://www.enterprisedb.com



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: dikkop seems unhappy because of openssl stuff (FreeBSD 14-BETA1)
Next
From: Andres Freund
Date:
Subject: Re: Disabling Heap-Only Tuples