Re: Why is INSERT-driven autovacuuming based on pg_class.reltuples? - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: Why is INSERT-driven autovacuuming based on pg_class.reltuples?
Date
Msg-id CAH2-WznRu8B-kCMeJXfwosU=b+jpH9keuW6Uj8yspHH8ENZ_xA@mail.gmail.com
Whole thread Raw
In response to Why is INSERT-driven autovacuuming based on pg_class.reltuples?  (Peter Geoghegan <pg@bowt.ie>)
Responses Re: Why is INSERT-driven autovacuuming based on pg_class.reltuples?
List pgsql-hackers
On Thu, Jan 27, 2022 at 12:20 PM Peter Geoghegan <pg@bowt.ie> wrote:
> Both VACUUM and ANALYZE update pg_class.reltuples. But this code seems
> to assume that it's only something that VACUUM can ever do. Why
> wouldn't we expect a plain ANALYZE to have actually been the last
> thing to update pg_class.reltuples for an append-only table? Wouldn't
> that lead to less frequent (perhaps infinitely less frequent)
> vacuuming for an append-only table, relative to the documented
> behavior of autovacuum_vacuum_insert_scale_factor?

PgStat_StatTabEntry.inserts_since_vacuum will continue to grow and
grow as more tuples are inserted, until VACUUM actually runs, no
matter what. That largely explains why this bug was missed before now:
it's inevitable that inserts_since_vacuum will become large at some
point -- even large relative to a bogus scaled
pg_class.reltuples-at-ANALYZE threshold (unless ANALYZE hasn't been
run since the last VACUUM, in which case pg_class.reltuples will be at
the expected value anyway). And so we'll eventually get to the point
where so many unvacuumed inserted tuples have accumulated that an
insert-driven autovacuum still takes place.

In practice these delayed insert-driven autovacuum operations will
usually happen without *ludicrous* delay (relative to the documented
behavior). Even still, the autovacuum schedule for append-only tables
will often be quite wrong. (Anti-wraparound VACUUMs probably made the
bug harder to notice as well, of course.)

-- 
Peter Geoghegan



pgsql-hackers by date:

Previous
From: Justin Pryzby
Date:
Subject: Re: Write visibility map during CLUSTER/VACUUM FULL
Next
From: Andres Freund
Date:
Subject: Re: A test for replay of regression tests