Re: Diagnosing outdated auto analyze dates... - Mailing list pgsql-admin

From Keith Fiske
Subject Re: Diagnosing outdated auto analyze dates...
Date
Msg-id CAODZiv5Q70054wv-TzL9mAYtjXLHjCxAJAVqw_r_gD2BY_ianw@mail.gmail.com
Whole thread Raw
In response to Re: Diagnosing outdated auto analyze dates...  (Wells Oliver <wells.oliver@gmail.com>)
Responses Re: Diagnosing outdated auto analyze dates...  (Wells Oliver <wells.oliver@gmail.com>)
List pgsql-admin


On Wed, Apr 21, 2021 at 5:05 PM Wells Oliver <wells.oliver@gmail.com> wrote:
Thank you... so to make it explicit, this table has 15169372 rows, autovacuum_analyze_scale_factor is .05, and autovacuum_analyze_threshold is 10, so: 15169372 * .05 + 10, which is 758478.60 -- so that means auto-analyze will only fire when that number of rows has been deleted or updated?



Correct. One thing that can be troublesome with larger tables and the scale factor is that, as time goes on, analyze and vacuum run less and less often unless your write rate also increases along with the size. So that's why larger tables can sometimes benefit from individual tuning and setting scale factor to zero.

To figure out what to set the threshold to, I typically make a simple script to do a csv copy output with psql for the pg_stat_user_tables entry for the table(s) involved on like an hourly interval for 2-3 weeks. Then pull that into a spreadsheet or import it back into a table. Then figure out the hourly & daily differences for n_tup_upd + n_tup_del. Then try and set the thresholds to have autovac run at least once or twice a day and analyze slightly more often.

Note that prior to PG 13, insert only tables would never cause autovac to kick in. 13 now has autovacuum_vacuum_insert_scale_factor & autovacuum_vacuum_insert_threshold.

 
On Wed, Apr 21, 2021 at 1:48 PM Keith Fiske <keith.fiske@crunchydata.com> wrote:


On Wed, Apr 21, 2021 at 4:06 PM Wells Oliver <wells.oliver@gmail.com> wrote:
A last_autoanalyze date in pg_stat_all_tables for a table we update every day is 3/26, another is 3/13 -- I feel strongly somehow these tables should be auto-analyzed and auto-vacuumed more often, and am wondering where to look first.

My autovacuum_analyze_threshold is 10, my autovacuum_vacuum_threshold is 25. 

How can I better understand why these tables are not more frequently being auto-analyzed and vacuumed?

Thank you.

--

The threshold settings are only part of the calculation that determines when autovacuum kicks in. There's also autovacuum_vacuum_scale_factor and autovacuum_analyze_scale_factor. The scale factors say that when this fraction of the table has been changed (updates/deletes) then run the relevant operation (vacuum or analyze).

The threshold values are added on top of the scale factor values. So, what determines whether your analyze kicks in is actually

(autovacuum_analyze_scale_factor * total row count) + autovacuum_analyze_threshold

The reason it's like this is to avoid autovacuum kicking in excessively on very small tables when a small % changes. Ex. 10 rows on a 100 row table constantly kicking in autovacuum when scale factor is set to .10. So add on 100-500 for the threshold so autovacuum kicks in when a good number of rows actually changes.

If you want a specific number of row changes to cause autovacuum to kick in, you can set the scale factor to zero for that specific table. I only really find that necessary myself on rather large tables though (10s of millions of rows).

--
Keith Fiske
Senior Database Engineer
Crunchy Data - http://crunchydata.com


--


--
Keith Fiske
Senior Database Engineer
Crunchy Data - http://crunchydata.com

pgsql-admin by date:

Previous
From: Wells Oliver
Date:
Subject: Re: Diagnosing outdated auto analyze dates...
Next
From: Wells Oliver
Date:
Subject: Re: Diagnosing outdated auto analyze dates...