Hi,

On 2023-09-19 14:50:13 -0400, Robert Haas wrote:
> On Tue, Sep 19, 2023 at 12:56 PM Andres Freund <andres@anarazel.de> wrote:
> > Yea, a setting like what's discussed here seems, uh, not particularly useful
> > for achieving the goal of compacting tables. I don't think guiding this
> > through SQL makes a lot of sense. For decent compaction you'd want to scan the
> > table backwards, and move rows from the end to earlier, but stop once
> > everything is filled up. You can somewhat do that from SQL, but it's going to
> > be awkward and slow. I doubt you even want to use the normal UPDATE WAL
> > logging.
> >
> > I think having explicit compaction support in VACUUM or somewhere similar
> > would make sense, but I don't think the proposed GUC is a useful stepping
> > stone.
>
> I think there's a difference between wanting to compact instantly and
> wanting to compact over time. I think that this kind of thing is
> reasonably well-suited to the latter, if we can engineer away the
> cases where it backfires.
>
> But I know people will try to use it for instant compaction too, and
> there it's worth remembering why we removed old-style VACUUM FULL. The
> main problem is that it was mind-bogglingly slow.

I think some of the slowness was implementation related, rather than
fundamental. But more importantly, storage was something entirely different
back then than it is now.

> The other really bad problem is that it caused massive index bloat. I think
> any system that's based on moving around my tuples right now to make my
> table smaller right now is likely to have similar issues.

I think the problem of exploding WAL usage exists both for compaction being
done in VACUUM (or a dedicated command) and being done by backends. I think to
make using a facility like this realistic, you really need some form of rate
limiting, regardless of when compaction is performed. Even leaving WAL volume
aside, naively doing on-update compaction will cause lots of additional
contention on early FSM pages.

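To make the rate-limiting point concrete, here is a toy sketch in Python. It
is not PostgreSQL code: the page model, the function names (compact,
truncate_empty_tail) and the flat per-move WAL cost are all assumptions made
purely for illustration. It scans from the end of the table, moves tuples into
free space on earlier pages, and stops either when the earlier pages are full
or when the WAL budget for the round is used up.

```python
def compact(pages, capacity, wal_budget, wal_cost_per_move=1):
    """Toy model of rate-limited compaction (hypothetical, not PostgreSQL).

    pages is a list of lists; each inner list holds the tuples on one page.
    Tuples are moved from the tail of the table into free space on earlier
    pages. Returns the number of tuples moved in this round.
    """
    moved = 0
    dst = 0                    # earliest page that may still have free space
    src = len(pages) - 1       # scan backwards from the end of the table
    while dst < src:
        if len(pages[dst]) >= capacity:
            dst += 1           # target page is full, try the next one
            continue
        if not pages[src]:
            src -= 1           # source page is empty, step backwards
            continue
        if wal_budget < wal_cost_per_move:
            break              # assumed rate limit hit; resume in a later round
        pages[dst].append(pages[src].pop())
        wal_budget -= wal_cost_per_move
        moved += 1
    return moved


def truncate_empty_tail(pages):
    """Drop trailing empty pages, as a truncation step would after compaction."""
    while pages and not pages[-1]:
        pages.pop()
    return pages
```

With a generous budget the tail is emptied in one round; with a small budget
only part of the work happens and the rest is left for later rounds, which is
the behaviour a rate limit is meant to give regardless of who does the moving.
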
> In the case where you're trying to compact gradually, I think there
> are potentially serious issues with index bloat, but only potentially.
> It seems like there are reasonable cases where it's fine.
> Specifically, if you have relatively few indexes per table, relatively
> few long-running transactions, and all tuples get updated on a
> semi-regular basis, I'm thinking that you're more likely to win than
> lose.

Maybe - but are you going to have a significant bloat issue in that case?
Sure, if the updates update most of the table, you are going to - but then
on-update compaction won't really be needed either, since you're going to run
out of space on pages on a regular basis.

Greetings,
Andres Freund