Re: Disabling Heap-Only Tuples - Mailing list pgsql-hackers

From Robert Haas
Subject Re: Disabling Heap-Only Tuples
Msg-id CA+TgmoaEZom6b5Jhp8dcr5czvM_d6gsK8bAYf-0bjdswNfTziA@mail.gmail.com
In response to Re: Disabling Heap-Only Tuples  (Alvaro Herrera <alvherre@alvh.no-ip.org>)
Responses Re: Disabling Heap-Only Tuples
List pgsql-hackers
On Tue, Sep 19, 2023 at 6:26 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
> Second, I think we should make it auto-reset.  That is, have the user
> set some value; later, when some condition triggers (say, the table size
> is 1.2x the limit value you configured), then the local_update_limit is
> automatically removed from the table options.  From that point onwards,
> the table is operated normally.

That's an interesting idea. It would require taking AEL on the table.
And also, what do you mean by 1.2x the limit value? Is that supposed
to be a >= condition or a <= condition? It can't really be a >=
condition, because you wouldn't set it in the first place unless the
table were significantly bigger than it could be. But if it's a <= condition
it doesn't really protect you from hosing yourself. You just have to
insert a bit more data before enough of the bloat gets removed, and
now the table just bloats infinitely and probably rather quickly. The
correct value of the setting depends on the amount of real data
(non-bloat) in the table, not the actual table size.
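
To make the two readings of the "1.2x the limit" condition concrete, here is a minimal sketch in Python. It is not PostgreSQL code: the function name, the percentage parameter, and the page counts are all hypothetical and chosen only for illustration.

```python
def auto_reset_fires(table_pages: int, limit_pages: int, op: str,
                     factor_pct: int = 120) -> bool:
    """Return True if the hypothetical auto-reset condition holds.

    Sizes are in pages. Integer arithmetic keeps the boundary case exact.
    """
    lhs = table_pages * 100
    rhs = limit_pages * factor_pct
    if op == ">=":
        return lhs >= rhs
    if op == "<=":
        return lhs <= rhs
    raise ValueError(f"unknown comparison: {op!r}")


# Hypothetical numbers: a table bloated to 500 pages, limit set to 100 pages.
bloated, limit = 500, 100

# Reading 1 (>=): the condition already holds the moment the option is set.
print(auto_reset_fires(bloated, limit, ">="))  # True

# Reading 2 (<=): false while bloated, true once the table has shrunk enough.
print(auto_reset_fires(bloated, limit, "<="))  # False
print(auto_reset_fires(120, limit, "<="))      # True

# If live data has meanwhile grown to 130 pages, the table cannot get
# smaller than that, so the <= condition never fires.
print(auto_reset_fires(130, limit, "<="))      # False
```

Under the >= reading the option would be removed as soon as it was set. Under the <= reading it stays in force whenever the live data outgrows 1.2x the limit, which is the case described above.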

> The point here is that third-party tools such as pg_repack or pg_squeeze
> exist, which work in a way we don't like, yet we offer no alternative.
> This proposal is a mechanism that essentially replaces those tools with
> a simple in-core feature, without having to include the tool itself in
> core.

I agree that it would be nice to have something in core that can be
used to help with this problem, but this feature isn't the same thing
as pg_repack or pg_squeeze, either. In some ways, it's better, because
it can shrink the table without rewriting it, which is very desirable.
But in other ways, it's worse, and the fact that it can backfire
spectacularly if you set the wrong value seems like one big
way that it is a lot worse. If there is a way that we can make this a
mode that you activate for a table, and the system calculates and
updates the threshold, I think that would actually be a pretty good
feature. It would be tricky to use it to recover from acute
emergencies, because it doesn't actually do anything until updates
happen, but you could use it for that in a pinch. And even without
that it would be useful if you have a table that is sometimes very
large and sometimes very small and you want to get the space back from
the OS when it is in the small phase of its lifecycle.
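
As a rough illustration of what "the system calculates and updates the threshold" could look like, the sketch below derives a limit from an estimate of the live (non-bloat) data rather than from the current table size. Everything in it is an assumption made for illustration: the function name, the inputs, the 90% fill figure, and the 20% headroom are hypothetical, and per-page and per-tuple overhead is ignored. Only the 8192-byte page size reflects PostgreSQL's default.

```python
PAGE_SIZE = 8192  # PostgreSQL's default block size in bytes


def estimate_limit_pages(live_tuples: int, avg_tuple_bytes: int,
                         fill_pct: int = 90, headroom_pct: int = 20) -> int:
    """Hypothetical auto-tuned limit, in pages.

    The limit is based on live data only, so bloat in the table does not
    inflate it. Page and tuple header overhead is deliberately ignored.
    """
    live_bytes = live_tuples * avg_tuple_bytes
    usable_per_page = PAGE_SIZE * fill_pct // 100
    # Ceiling division: pages needed if live data were densely packed.
    packed_pages = -(-live_bytes // usable_per_page)
    # Add headroom so ordinary growth does not immediately exceed the limit.
    return -(-packed_pages * (100 + headroom_pct) // 100)


# Hypothetical table: one million live rows of about 100 bytes each.
print(estimate_limit_pages(1_000_000, 100))  # 16278
```

The current table size is deliberately not an input. A system that recomputed this figure as the live row count changed would raise the limit when real data grows, which avoids the failure mode of a fixed, hand-set value.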

But without any kind of auto-tuning, in my opinion, it's a fairly poor
feature. Sure, some people will get use out of it, if they're
sufficiently knowledgeable and sufficiently determined. But I think
for most people in most situations, it will be a struggle.

--
Robert Haas
EDB: http://www.enterprisedb.com