Home > mailing lists

Re: Decoupling antiwraparound autovacuum from special rules around auto cancellation - Mailing list pgsql-hackers

From	Robert Haas
Subject	Re: Decoupling antiwraparound autovacuum from special rules around auto cancellation
Date	January 17, 2023 22:11:07
Msg-id	CA+TgmoY7hnNnCZUcaeznbEu1FwS87v6zwjNzjjVjbskdz9ff9Q@mail.gmail.com Whole thread Raw
In response to	Re: Decoupling antiwraparound autovacuum from special rules around auto cancellation (Peter Geoghegan <pg@bowt.ie>)
Responses	Re: Decoupling antiwraparound autovacuum from special rules around auto cancellation
List	pgsql-hackers

Tree view

On Tue, Jan 17, 2023 at 3:08 PM Peter Geoghegan <pg@bowt.ie> wrote:
> If you assume that there is chronic undercounting of dead tuples
> (which I think is very common), ...

Why do you think that?

> How many dead heap-only tuples are equivalent to one LP_DEAD item?
> What about page-level concentrations, and the implication for
> line-pointer bloat? I don't have a good answer to any of these
> questions myself.

Seems a bit pessimistic. If we had unlimited resources and all
operations were infinitely fast, the optimal strategy would be to
vacuum after every insert, update, or delete. But in reality, that
would be prohibitively expensive, so we're making a trade-off.
Batching together cleanup for many table modifications reduces the
amortized cost of cleaning up after one such operation very
considerably. That's critical. But if we batch too much together, then
the actual cleanup doesn't happen soon enough to keep us out of
trouble.

If we had an oracle that could provide us with perfect information,
we'd ask it, among other things, how much work will be required to
vacuum right now, and how much benefit would we get out of doing so.
The dead tuple count is related to the first question. It's not a
direct, linear relationship, but it's not completely unrelated,
either. Maybe we could refine the estimates by gathering more or
different statistics than we do now, but ultimately it's always going
to be a trade-off between getting the work done sooner (and thus maybe
preventing table growth or a wraparound shutdown) and being able to do
more work at once (and thus being more efficient). The current system
set of counters predates HOT and the visibility map, so it's not
surprising if needs updating, but if you're argue that the whole
concept is just garbage, I think that's an overreaction.

-- 
Robert Haas
EDB: http://www.enterprisedb.com

pgsql-hackers by date:

From: Peter Geoghegan
Date: 17 January 2023, 22:03:54
Subject: Re: Update comments in multixact.c

From: Tom Lane
Date: 17 January 2023, 22:13:58
Subject: Re: pgsql: Doc: add XML ID attributes to and tags.

Re: Decoupling antiwraparound autovacuum from special rules around auto cancellation - Mailing list pgsql-hackers

Previous

Next