On Fri, 20 Mar 2020 at 11:17, Andres Freund <andres@anarazel.de> wrote:
> I think there's too much "reinventing" autovacuum scheduling in a
> "local" insert-only manner happening in this thread. And as far as I can
> tell additionally only looking at a somewhat narrow slice of insert only
> workloads.
I understand your concern and you might be right. However, I think the
main reason that the default settings for the new threshold and scale
factor have deviated this far from the existing settings is the
example of a large insert-only table that receives inserts of 1 row
per xact. If we were to copy the existing settings, then once that
table reached 1 billion rows it would only become eligible for an
insert-vacuum after a further 200 million tuples/xacts, which does not
help the situation, since an anti-wraparound vacuum would be
triggering by then anyway.
I'm unsure if it will help with the discussion, but I put together a
quick and dirty C program to show when a table will become eligible
for an auto-vacuum with a given scale_factor and threshold.
$ gcc -O2 vacuum.c -o vacuum
$ ./vacuum
Syntax ./vacuum <scale_factor> <threshold> <maximum table size in rows>
$ ./vacuum 0.01 10000000 100000000000 | tail -n 1
Vacuum 463 at 99183465731 reltuples, 991915456 inserts
$ ./vacuum 0.2 50 100000000000 | tail -n 1
Vacuum 108 at 90395206733 reltuples, 15065868288 inserts
So, yes, certainly, there are more than four times as many vacuums
(463 vs 108) on an insert-only table of 100 billion rows with the
proposed settings vs the defaults for the existing scale_factor and
threshold. However, at the tail end of the first run there, we were
close to a billion rows (991,915,456) between vacuums. Is that too
excessive?
I'm sharing this in the hope that it'll make it easy to experiment
with settings which we can all agree on.
For a 1 billion row table, the proposed settings give us 69 vacuums
and the standard settings 83.