Thread: Why is INSERT-driven autovacuuming based on pg_class.reltuples?

Why is INSERT-driven autovacuuming based on pg_class.reltuples?

From
Peter Geoghegan
Date:
Commit b07642dbcd ("Trigger autovacuum based on number of INSERTs")
taught autovacuum to run in response to INSERTs, which is now
typically the dominant factor that drives vacuuming for an append-only
table -- a very useful feature, certainly.

This is driven by the following logic from autovacuum.c:

        vacinsthresh = (float4) vac_ins_base_thresh +
vac_ins_scale_factor * reltuples;
        ...

        /* Determine if this table needs vacuum or analyze. */
        *dovacuum = force_vacuum || (vactuples > vacthresh) ||
            (vac_ins_base_thresh >= 0 && instuples > vacinsthresh);

I can see why the nearby, similar vacthresh and anlthresh variables
(not shown here) are scaled based on pg_class.reltuples -- that makes
sense. But why should we follow that example here, with vacinsthresh?

Both VACUUM and ANALYZE update pg_class.reltuples. But this code seems
to assume that it's only something that VACUUM can ever do. Why
wouldn't we expect a plain ANALYZE to have actually been the last
thing to update pg_class.reltuples for an append-only table? Wouldn't
that lead to less frequent (perhaps infinitely less frequent)
vacuuming for an append-only table, relative to the documented
behavior of autovacuum_vacuum_insert_scale_factor?

-- 
Peter Geoghegan



Re: Why is INSERT-driven autovacuuming based on pg_class.reltuples?

From
Peter Geoghegan
Date:
On Thu, Jan 27, 2022 at 12:20 PM Peter Geoghegan <pg@bowt.ie> wrote:
> Both VACUUM and ANALYZE update pg_class.reltuples. But this code seems
> to assume that it's only something that VACUUM can ever do. Why
> wouldn't we expect a plain ANALYZE to have actually been the last
> thing to update pg_class.reltuples for an append-only table? Wouldn't
> that lead to less frequent (perhaps infinitely less frequent)
> vacuuming for an append-only table, relative to the documented
> behavior of autovacuum_vacuum_insert_scale_factor?

PgStat_StatTabEntry.inserts_since_vacuum will continue to grow and
grow as more tuples are inserted, until VACUUM actually runs, no
matter what. That largely explains why this bug was missed before now:
it's inevitable that inserts_since_vacuum will become large at some
point -- even large relative to a bogus scaled
pg_class.reltuples-at-ANALYZE threshold (unless ANALYZE hasn't been
run since the last VACUUM, in which case pg_class.reltuples will be at
the expected value anyway). And so we'll eventually get to the point
where so many unvacuumed inserted tuples have accumulated that an
insert-driven autovacuum still takes place.

In practice these delayed insert-driven autovacuum operations will
usually happen without *ludicrous* delay (relative to the documented
behavior). Even still, the autovacuum schedule for append-only tables
will often be quite wrong. (Anti-wraparound VACUUMs probably made the
bug harder to notice as well, of course.)

-- 
Peter Geoghegan



Re: Why is INSERT-driven autovacuuming based on pg_class.reltuples?

From
Laurenz Albe
Date:
On Thu, 2022-01-27 at 12:20 -0800, Peter Geoghegan wrote:
> I can see why the nearby, similar vacthresh and anlthresh variables
> (not shown here) are scaled based on pg_class.reltuples -- that makes
> sense. But why should we follow that example here, with vacinsthresh?

What would you suggest instead?  pg_stat_all_tables.n_live_tup?

Yours,
Laurenz Albe




Re: Why is INSERT-driven autovacuuming based on pg_class.reltuples?

From
Peter Geoghegan
Date:
On Thu, Jan 27, 2022 at 11:22 PM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
> What would you suggest instead?  pg_stat_all_tables.n_live_tup?

I'm not sure, except that I assume that it'll have to come from the
statistics collector, not from pg_class. I think that this bug stemmed
from the fact that vac_update_relstats() is used by both VACUUM and
ANALYZE to store information in pg_class, while both *also* use
pgstat_report_vacuum()/pgstat_report_analyze() to store closely
related information at the same point.  There is rather a lot of
implicit information here. I recall a few other bugs that also seemed
related to this messiness with statistics and pg_class.


--
Peter Geoghegan



Re: Why is INSERT-driven autovacuuming based on pg_class.reltuples?

From
Justin Pryzby
Date:
On Thu, Jan 27, 2022 at 01:59:38PM -0800, Peter Geoghegan wrote:
> On Thu, Jan 27, 2022 at 12:20 PM Peter Geoghegan <pg@bowt.ie> wrote:
> > Both VACUUM and ANALYZE update pg_class.reltuples. But this code seems
> > to assume that it's only something that VACUUM can ever do. Why
> > wouldn't we expect a plain ANALYZE to have actually been the last
> > thing to update pg_class.reltuples for an append-only table? Wouldn't
> > that lead to less frequent (perhaps infinitely less frequent)
> > vacuuming for an append-only table, relative to the documented
> > behavior of autovacuum_vacuum_insert_scale_factor?
> 
> PgStat_StatTabEntry.inserts_since_vacuum will continue to grow and
> grow as more tuples are inserted, until VACUUM actually runs, no
> matter what. That largely explains why this bug was missed before now:
> it's inevitable that inserts_since_vacuum will become large at some
> point -- even large relative to a bogus scaled
> pg_class.reltuples-at-ANALYZE threshold (unless ANALYZE hasn't been
> run since the last VACUUM, in which case pg_class.reltuples will be at
> the expected value anyway). And so we'll eventually get to the point
> where so many unvacuumed inserted tuples have accumulated that an
> insert-driven autovacuum still takes place.

Maybe I'm missed your point, but I think this may not rise to the level of
being a "bug".

If we just vacuumed an insert-only table, and then insert some more, and
autoanalyze runs and updates reltuples, what's wrong with vac_ins_scale_factor
* reltuples + vac_ins_base_thresh ?

You're saying reltuples should be the number of tuples at the last vacuum
instead of the most recent value from either vacuum or analyze ?

It's true that the vacuum threshold may be hit later than if reltuples hadn't
been updated by ANALYZE.  If that's what you're referring to, that's the
behavior of scale factor in general.  If a table is growing in size at a
constant rate, analyze will run at decreasing frequency.  With the default 20%
scale factor, it'll first run at 1.2x the table's initial size, then at 1.44
(not 1.4), then at 1.728 (not 1.6), then at 2.0736 (not 1.8).  That's not
necessarily desirable, but it's not necessarily wrong, either.  If your table
doubles in size, you might have to adjust these things.  Maybe there should be
another knob allowing perfect, "geometric" (or other) frequency, but the
behavior is not new in this patch.

We talked about that here.
https://www.postgresql.org/message-id/flat/20200305172749.GK684%40telsasoft.com#edac59123843f9f8e1abbc2b570c76f1

With the default values, analyze happens after 10% growth, and vacuum happens
after 20% (which ends up being 22% of the initial table size).  The goal of
this patch was to have inserts trigger autovacuum *at all*.  This behavior may
be subtle or non-ideal, but not a problem?  The opposite thing could also
happen - *vacuum* could update reltuples, causing the autoanalyze threshold to
be hit a bit later.  

-- 
Justin



Re: Why is INSERT-driven autovacuuming based on pg_class.reltuples?

From
David Rowley
Date:
On Fri, 28 Jan 2022 at 09:20, Peter Geoghegan <pg@bowt.ie> wrote:
> Both VACUUM and ANALYZE update pg_class.reltuples. But this code seems
> to assume that it's only something that VACUUM can ever do.

Like Justin I'm also not quite following what the problem is here.
pg_class.reltuples is only used to estimate how many tuples the scale
factor is likely to be.  It does not matter if that was set by ANALYZE
or VACUUM, it's simply an estimate.

I quoted the text above as I get the idea that you've gotten the wrong
end of the stick about how this works.  reltuples is just used to
estimate what the number of tuples for the insert threshold is based
on the scale factor. It does not matter if that was estimated by
VACUUM or ANALYZE.

If ANALYZE runs and sets pg_class.reltuples to 1 million, then we
insert 500k tuples, assuming a 0 vacuum_ins_threshold and a
vacuum_ins_scale_factor of 0.2, then we'll want to perform a vacuum as
"vac_ins_base_thresh + vac_ins_scale_factor * reltuples" will come out
at 200k.  auto-vacuum will then trigger and update reltuples hopefully
to some value around 1.5 million, then next time it'll take 300k
tuples to trigger an insert vacuum.

I'm not quite following where the problem is with that.  (Of course
with the exception of the fact that ANALYZE and VACUUM have different
methods how they decide what to set pg_class.reltuples to.  That's not
a new problem)

David



Re: Why is INSERT-driven autovacuuming based on pg_class.reltuples?

From
David Rowley
Date:
On Mon, 31 Jan 2022 at 17:28, David Rowley <dgrowleyml@gmail.com> wrote:
> If ANALYZE runs and sets pg_class.reltuples to 1 million, then we
> insert 500k tuples, assuming a 0 vacuum_ins_threshold and a
> vacuum_ins_scale_factor of 0.2, then we'll want to perform a vacuum as
> "vac_ins_base_thresh + vac_ins_scale_factor * reltuples" will come out
> at 200k.  auto-vacuum will then trigger and update reltuples hopefully
> to some value around 1.5 million, then next time it'll take 300k
> tuples to trigger an insert vacuum.

If we wanted a more current estimate for the number of tuples in a
relation then we could use reltuples / relpages *
RelationGetNumberOfBlocks(r).  However, I still don't see why an
INSERT driven auto-vacuums are a particularly special case. ANALYZE
updating the reltuples estimate had an effect on when auto-vacuum
would trigger for tables that generally grow in the number of live
tuples but previously only (i.e before insert vacuums existed)
received auto-vacuum attention due to UPDATEs/DELETEs.

I suppose the question is, what is autovacuum_vacuum_scale_factor
meant to represent?  Our documents claim:

> Specifies a fraction of the table size to add to autovacuum_vacuum_threshold when deciding whether to trigger a
VACUUM.The default is 0.2 (20% of table size). This parameter can only be set in the postgresql.conf file or on the
servercommand line; but the setting can be overridden for individual tables by changing table storage parameters. 

Nothing there seems to indicate the scale is based on the historical
table size when the table was last vacuumed/analyzed, so you could
claim that the 3 usages of relpages when deciding if the table should
be vacuumed and/or analyzed are all wrong and should take into account
RelationGetNumberOfBlocks too.

I'm not planning on doing anything to change any of this unless I see
some compelling argument that what's there is wrong.

David



Re: Why is INSERT-driven autovacuuming based on pg_class.reltuples?

From
Peter Geoghegan
Date:
On Tue, Feb 1, 2022 at 3:02 PM David Rowley <dgrowleyml@gmail.com> wrote:
> If we wanted a more current estimate for the number of tuples in a
> relation then we could use reltuples / relpages *
> RelationGetNumberOfBlocks(r).  However, I still don't see why an
> INSERT driven auto-vacuums are a particularly special case. ANALYZE
> updating the reltuples estimate had an effect on when auto-vacuum
> would trigger for tables that generally grow in the number of live
> tuples but previously only (i.e before insert vacuums existed)
> received auto-vacuum attention due to UPDATEs/DELETEs.

Sorry for the delay in my response; I've been travelling.

I admit that I jumped the gun here -- I now believe that it's
operating as originally designed. And that the design is reasonable.
It couldn't hurt to describe the design in a little more detail in the
user docs, though.

> Nothing there seems to indicate the scale is based on the historical
> table size when the table was last vacuumed/analyzed, so you could
> claim that the 3 usages of relpages when deciding if the table should
> be vacuumed and/or analyzed are all wrong and should take into account
> RelationGetNumberOfBlocks too.

The route of my confusion might interest you. vacuumlazy.c's call to
vac_update_relstats() provides relpages and reltuples values that are
very accurate relative to the VACUUM operations view of things
(relative to its OldestXmin, which uses the size of the table at the
start of the VACUUM, not the end). When the vac_update_relstats() call
is made, the same accurate-relative-toOldestXmin values might actually
*already* be out of date relative to the physical table as it is at
the end of the same VACUUM.

The fact that these accurate values could be out of date like this is
practically inevitable for a certain kind of table. But that might not
matter at all, if they were later *interpreted* in a way that took
this into account later on -- context matters.

For some reason I made the leap to thinking that everybody else
believed the same thing, too, and that the intention with the design
of the insert-driven autovacuum stuff was to capture that. But that's
just not the case, at all. I apologize for the confusion.

BTW, this situation with the relstats only being accurate "relative to
the last VACUUM's OldestXmin" is *especially* important with
new_dead_tuples values, which are basically recently dead tuples
relative to OldestXmin. In the common case where there are very few
recently dead tuples, we have an incredibly distorted "sample" of dead
tuples (it tells us much more about VACUUM implementation details than
the truth of what's going on in the table). So that also recommends
"relativistically interpreting" the values later on. This specific
issue has even less to do with autovacuum_vacuum_scale_factor than the
main point, of course. I agree with you that the insert-driven stuff
isn't a special case.

--
Peter Geoghegan