Hi,
Right now the defaults for autovacuum cost limiting are so low that they
regularly cause problems for our users. It's not exactly obvious to users
that pretty much any installation above a couple of gigabytes needs to
change autovacuum_vacuum_cost_delay &
autovacuum_vacuum_cost_limit/vacuum_cost_limit. Anti-wraparound/full-table
vacuums in particular basically take forever with the default settings.
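To put rough numbers on "forever", here's a back-of-the-envelope
calculation (python, just for the arithmetic), assuming the stock
defaults of vacuum_cost_limit = 200, autovacuum_vacuum_cost_delay =
20ms, vacuum_cost_page_miss = 10, vacuum_cost_page_dirty = 20 and 8kB
pages:

# Rough ceiling on autovacuum I/O under the (assumed) default cost
# settings; adjust the numbers if your configuration differs.
cost_limit      = 200    # vacuum_cost_limit, inherited by autovacuum
cost_delay_ms   = 20     # autovacuum_vacuum_cost_delay
page_miss_cost  = 10     # vacuum_cost_page_miss
page_dirty_cost = 20     # vacuum_cost_page_dirty
page_size       = 8192   # bytes

rounds_per_sec  = 1000 / cost_delay_ms         # 50 work/sleep cycles per second
credits_per_sec = rounds_per_sec * cost_limit  # 10000 cost units per second

read_mb_s  = credits_per_sec / page_miss_cost  * page_size / 1024**2  # ~7.8 MB/s, all misses
dirty_mb_s = credits_per_sec / page_dirty_cost * page_size / 1024**2  # ~3.9 MB/s, all dirtied

days_to_read_10tb = 10 * 1024**2 / read_mb_s / 86400                  # ~15 days
print(read_mb_s, dirty_mb_s, days_to_read_10tb)

That's less than 8 MB/s of reads in the best case, i.e. on the order of
two weeks just to read a 10 TB table.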
On the other hand we don't want a database of a couple of hundred
megabytes to be vacuumed as fast as possible and trash the poor tiny
system. So we can't just massively increase the limits by default,
although I believe some adjustment of the defaults would be appropriate
anyway.
I wonder if it makes sense to compute the delays/limits in relation to
either the cluster or the relation size. If you have a 10 TB table, you
obviously don't want to scan it at a few megabytes per second, which is
what the default settings give you. With that in mind we could just go
for something like the autovacuum_*_scale_factor settings. But e.g. for
partitioned workloads with hundreds of tables in the couple-gigabyte
range that wouldn't work that well.
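Just to make the scale-factor idea a bit more concrete, a minimal
sketch (hypothetical knob names, made-up numbers, not a proposal for
actual GUCs):

# Purely hypothetical scale-factor style scaling of the cost limit;
# names, formula and constants are made up for illustration only.
def scaled_cost_limit(rel_size,
                      base_limit=200,          # today's vacuum_cost_limit default
                      reference_size=1 << 30,  # 1 GB: at/below this keep the base limit
                      scale_factor=0.5,
                      max_limit=10000):
    # grow the per-relation cost budget with relation size, with a ceiling
    if rel_size <= reference_size:
        return base_limit
    growth = 1 + scale_factor * (rel_size / reference_size - 1)
    return min(int(base_limit * growth), max_limit)

print(scaled_cost_limit(10 * 2**40))  # 10 TB table: hits the 10000 ceiling
print(scaled_cost_limit(2 * 2**30))   # one 2 GB partition: only 300

The 10 TB case looks sane, but each individual 2 GB partition stays
close to the base limit, even if there are hundreds of them and the
cluster as a whole is far behind.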
Computing the speed in relation to the cluster/database size somehow is
probably possible, but I wonder how we can do that without constantly
re-computing something relatively expensive.
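The obvious dodge would be to measure rarely and otherwise use a cached
value; a trivial sketch of what I mean (the measure() callback stands
in for whatever walks the data directory, nothing PostgreSQL specific
here):

# Amortize the expensive size lookup: re-measure only every max_age
# seconds, reuse the cached value in between.
import time

class CachedSize:
    def __init__(self, measure, max_age=600.0):
        self.measure = measure      # the expensive callback
        self.max_age = max_age      # seconds between re-measurements
        self.value = None
        self.measured_at = 0.0

    def get(self):
        now = time.monotonic()
        if self.value is None or now - self.measured_at > self.max_age:
            self.value = self.measure()   # expensive part, done rarely
            self.measured_at = now
        return self.value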
Thoughts?
- Andres