On Thu, Aug 11, 2016 at 5:39 AM, Jeff Janes <jeff.janes@gmail.com> wrote:
> I wanted to create a new relopt named something like
> autovacuum_vacuum_pagevisible_factor which would cause autovacuum to
> vacuum a table once less than a certain fraction of the relation's
> pages are marked allvisible.
Interesting idea.
> 1) One issue is that pg_class.relpages and pg_class.relallvisible are
> themselves only updated by vacuum/analyze. In the absence of manual
> vacuum or analyze, this means that if the new criterion uses those
> field, it could only kick in after an autoanalyze has already been
> done, which means that autovacuum_vacuum_pagevisible_factor could not
> meaningfully be set lower than autovacuum_analyze_scale_factor.
>
> Should relallvisible be moved/copied from pg_class to
> pg_stat_all_tables, so that it is maintained by the stats collector?
> Or should the autovacuum worker just walk the vm of every table with a
> defined autovacuum_vacuum_pagevisible_factor each time it is launched
> to get an up-to-date count that way?
relation_needs_vacanalyze has access to Form_pg_class, so it is not a
problem to use the value of relallvisible there to decide if a
vacuum/analyze should be run.
> 2) Should there be a guc in addition to the relopt? I can't think of
> a reason why I would want to set this globally, so I'm happy with just
> a relopt. If it were set globally, it would sure increase the cost
> for scanning the vm for each table once each naptime.
Having a GUC is useful to enforce the default behavior of tables that
do not have this parameter directly set with ALTER TABLE.
> 3) Should there be a autovacuum_vacuum_pagevisible_threshold? The
> other settings have both a factor and a threshold. I've never
> understand what the point of the threshold settings is, but presumably
> there is a point to them. Does that reason also apply to keeping vm
> tuned up?
Having both a threshold and a scale would make the most sense to me.
It may be difficult for the lambda user to tune those parameters using
a number of relation pages. An alternative would be to define those
values in kB, like 32MB worth of pages are marked all visible for
example.
--
Michael