Re: autoanalyze criteria - Mailing list pgsql-general

From Stefan Andreatta
Subject Re: autoanalyze criteria
Date
Msg-id 5128BFF4.7030406@synedra.com
Whole thread Raw
In response to Re: autoanalyze criteria  (Jeff Janes <jeff.janes@gmail.com>)
Responses Re: autoanalyze criteria  (Jeff Janes <jeff.janes@gmail.com>)
Re: autoanalyze criteria  (Alban Hertroys <haramrae@gmail.com>)
List pgsql-general
On 02/22/2013 06:27 PM, Jeff Janes wrote:
> On Fri, Feb 22, 2013 at 5:57 AM, Stefan Andreatta
> <s.andreatta@synedra.com <mailto:s.andreatta@synedra.com>> wrote:
>
>     Hi,
>
>     If I understand
>     http://www.postgresql.org/docs/9.1/static/routine-vacuuming.html
>     correctly, the autovacuum threshold in could be estimated like
>     this in PostgreSQL 9.1:
>
>     SELECT pg_stat_user_tables.relname,
>          pg_stat_user_tables.n_dead_tup,
>          CAST(current_setting('autovacuum_vacuum_threshold') AS bigint)
>              + (CAST(current_setting('autovacuum_vacuum_scale_factor')
>     AS numeric)
>                 * pg_class.reltuples) AS av_threshold
>      FROM pg_stat_user_tables
>          JOIN pg_class on pg_stat_user_tables.relid = pg_class.oid
>      ORDER BY 1;
>
>     If pg_stat_user_tables.n_dead_tup exceeds av_threshold autovacuum
>     should kick in. Obviously, that does rely on up-to-date
>     statistics. Is that how it is actually done?
>
>
> Pretty much, yes.  With the caveat that table storage settings can
> override the global settings.
>
>
>     2nd question: because pg_stat_user_tables.n_dead_tup is itself
>     estimated by ANALYZE it cannot be used as a criterion for the next
>     autoanalyze run, I think.
>
>
> n_dead_tup is updated by the stats collector, not by ANALYZE.
>
Thanks Jeff, that helped a lot (as did a careful rereading of
http://www.postgresql.org/docs/9.1/static/monitoring-stats.html and
http://www.postgresql.org/docs/9.1/static/catalog-pg-class.html ;-)

However, to estimate whether autoanalyze should be triggered, I am still
missing something: the analyze threshold is compared to the "total
number of tuples inserted, updated, or deleted since the last ANALYZE."
(according to
http://www.postgresql.org/docs/9.1/static/routine-vacuuming.html).

pg_stat_user_tables.n_live tup - pg_class.reltuples should give
something like the sum of rows inserted minus rows deleted since the
last ANALYZE. But according to the documentation we would need the sum
of those values. And we are still missing a number for rows updated
since the last analyze. pg_stat_usert_tables. n_dead_tup, on the other
hand, is only set back by successful VACUUM. autoanalyzing a table with
more than 10% dead rows would therefore keep autoanalyze in a loop until
the ratio rises beyond 20% (default configuration) and autovacuum kicks
in. So that wouldn't make a lot of sense.

Regards,
Stefan

pgsql-general by date:

Previous
From: Tom Lane
Date:
Subject: Re: Dumb question involving to_tsvector and a view
Next
From: Jeff Janes
Date:
Subject: Re: autoanalyze criteria