Re: new autovacuum criterion for visible pages - Mailing list pgsql-hackers

From Michael Paquier
Subject Re: new autovacuum criterion for visible pages
Date
Msg-id CAB7nPqT5MMdJmBiuxk=B7pAPym5JiQQ5WjPq5s_qupZVhyyiHw@mail.gmail.com
Whole thread Raw
In response to new autovacuum criterion for visible pages  (Jeff Janes <jeff.janes@gmail.com>)
Responses Re: new autovacuum criterion for visible pages  (Michael Paquier <michael.paquier@gmail.com>)
List pgsql-hackers
On Thu, Aug 11, 2016 at 5:39 AM, Jeff Janes <jeff.janes@gmail.com> wrote:
> I wanted to create a new relopt named something like
> autovacuum_vacuum_pagevisible_factor which would cause autovacuum to
> vacuum a table once less than a certain fraction of the relation's
> pages are marked allvisible.

Interesting idea.

> 1) One issue is that pg_class.relpages and pg_class.relallvisible are
> themselves only updated by vacuum/analyze.  In the absence of manual
> vacuum or analyze, this means that if the new criterion uses those
> field, it could only kick in after an autoanalyze has already been
> done, which means that autovacuum_vacuum_pagevisible_factor could not
> meaningfully be set lower than autovacuum_analyze_scale_factor.
>
> Should relallvisible be moved/copied from pg_class to
> pg_stat_all_tables, so that it is maintained by the stats collector?
> Or should the autovacuum worker just walk the vm of every table with a
> defined autovacuum_vacuum_pagevisible_factor each time it is launched
> to get an up-to-date count that way?

relation_needs_vacanalyze has access to Form_pg_class, so it is not a
problem to use the value of relallvisible there to decide if a
vacuum/analyze should be run.

> 2) Should there be a guc in addition to the relopt?  I can't think of
> a reason why I would want to set this globally, so I'm happy with just
> a relopt.  If it were set globally, it would sure increase the cost
> for scanning the vm for each table once each naptime.

Having a GUC is useful to enforce the default behavior of tables that
do not have this parameter directly set with ALTER TABLE.

> 3) Should there be a autovacuum_vacuum_pagevisible_threshold?  The
> other settings have both a factor and a threshold.  I've never
> understand what the point of the threshold settings is, but presumably
> there is a point to them.  Does that reason also apply to keeping vm
> tuned up?

Having both a threshold and a scale would make the most sense to me.
It may be difficult for the lambda user to tune those parameters using
a number of relation pages. An alternative would be to define those
values in kB, like 32MB worth of pages are marked all visible for
example.
-- 
Michael



pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: regression test for extended query protocol
Next
From: Michael Paquier
Date:
Subject: Re: Assertion failure in REL9_5_STABLE