Re: Disabling Heap-Only Tuples - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Disabling Heap-Only Tuples
Msg-id 20230921223335.tumif47d25z5gx6t@awork3.anarazel.de
In response to Re: Disabling Heap-Only Tuples  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
Hi,

On 2023-09-19 14:50:13 -0400, Robert Haas wrote:
> On Tue, Sep 19, 2023 at 12:56 PM Andres Freund <andres@anarazel.de> wrote:
> > Yea, a setting like what's discussed here seems, uh, not particularly useful
> > for achieving the goal of compacting tables.  I don't think guiding this
> > through SQL makes a lot of sense. For decent compaction you'd want to scan the
> > table backwards, and move rows from the end to earlier, but stop once
> > everything is filled up. You can somewhat do that from SQL, but it's going to
> > be awkward and slow.  I doubt you even want to use the normal UPDATE WAL
> > logging.
> >
> > I think having explicit compaction support in VACUUM or somewhere similar
> > would make sense, but I don't think the proposed GUC is a useful stepping
> > stone.
> 
> I think there's a difference between wanting to compact instantly and
> wanting to compact over time. I think that this kind of thing is
> reasonably well-suited to the latter, if we can engineer away the
> cases where it backfires.
> 
> But I know people will try to use it for instant compaction too, and
> there it's worth remembering why we removed old-style VACUUM FULL. The
> main problem is that it was mind-bogglingly slow.

I think some of the slowness was implementation related, rather than
fundamental. But more importantly, storage was something entirely different
back then than it is now.


> The other really bad problem is that it caused massive index bloat. I think
> any system that's based on moving around my tuples right now to make my
> table smaller right now is likely to have similar issues.

I think the problem of exploding WAL usage exists both for compaction being
done in VACUUM (or a dedicated command) and being done by backends. I think to
make using a facility like this realistic, you really need some form of rate
limiting, regardless of when compaction is performed. Even leaving WAL volume
aside, naively doing on-update compaction will cause lots of additional
contention on early FSM pages.
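
To illustrate the rate-limiting point, here is a toy sketch in Python. It is
not PostgreSQL code: the table is modelled as a list of pages, each page a
list of tuples, and the names compact_step, truncate_empty_tail, capacity and
max_moves are hypothetical. It scans backwards from the end, moves tuples into
the earliest pages with free space, stops once source and target meet, and
caps the number of moves per call. Indexes, WAL, visibility and locking are
all ignored.

```python
def compact_step(pages, capacity, max_moves):
    """Move at most max_moves tuples from tail pages into earlier free slots.

    pages: list of lists, one inner list of tuples per page (toy model).
    capacity: maximum number of tuples per page (assumed uniform).
    Returns the number of tuples moved in this call.
    """
    moved = 0
    dst = 0                 # earliest page that may still have free space
    src = len(pages) - 1    # scan backwards from the end of the table
    while moved < max_moves and dst < src:
        if len(pages[dst]) >= capacity:
            dst += 1        # target page is full, try the next one
        elif not pages[src]:
            src -= 1        # source page is already empty, step backwards
        else:
            pages[dst].append(pages[src].pop())
            moved += 1
    return moved


def truncate_empty_tail(pages):
    """Drop trailing empty pages; this is where the space is reclaimed."""
    while pages and not pages[-1]:
        pages.pop()
    return pages
```

Calling compact_step repeatedly with a small max_moves spreads the work over
time; a real implementation would presumably budget by WAL volume or I/O
rather than by tuple count.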


> In the case where you're trying to compact gradually, I think there
> are potentially serious issues with index bloat, but only potentially.
> It seems like there are reasonable cases where it's fine.

> Specifically, if you have relatively few indexes per table, relatively
> few long-running transactions, and all tuples get updated on a
> semi-regular basis, I'm thinking that you're more likely to win than
> lose.

Maybe - but are you going to have a significant bloat issue in that case?
Sure, if the updates update most of the table, you're going to - but then
on-update compaction won't really be needed either, since you're going to run
out of space on pages on a regular basis.

Greetings,

Andres Freund


