Re: Diagnosing outdated auto analyze dates... - Mailing list pgsql-admin

From Keith Fiske
Subject Re: Diagnosing outdated auto analyze dates...
Date
Msg-id CAODZiv7sdwbSbgoEt923+MkYG_wsorW1G2oN3Q9nTqLLL0XuoQ@mail.gmail.com
Whole thread Raw
In response to Re: Diagnosing outdated auto analyze dates...  (Wells Oliver <wells.oliver@gmail.com>)
Responses Re: Diagnosing outdated auto analyze dates...  (Keith Fiske <keith.fiske@crunchydata.com>)
List pgsql-admin


On Wed, Apr 21, 2021 at 5:38 PM Wells Oliver <wells.oliver@gmail.com> wrote:
Thanks, really useful detail. One last question: if I go the route of setting auto-vac scale factor to 0 and auto-vac threshold to 10,000 to ensure vacuuming every 10,000 dead tuples, is there still value in adjusting the auto-analyze threshold/scale separately? Does the analyze process do anything other than determine whether it should be vacuumed?

Just to clarify, I only recommend setting scale factor to zero on an individual table basis and actually figuring out appropriate values for the threshold to cause those individual tables to be scanned/vacuumed on a regular basis to keep up with your write rate. The cluster-wide values for scale factor and threshold that are set in postgresql.conf are generally ok for most databases, but I personally tune scale factor down a little bit from the default and threshold up a little bit from the default. But you definitely want a scale-factor set for the cluster-wide value to ensure any write tables get analyze/vacuumed at some point based on a fraction of them changing, not just a specific row number change.

Vacuum is not just for dead-tuple cleanup. It keeps many other important statistics updated (freespace map, visibility map, etc) and, most importantly, prevents transaction id exhaustion. 



On Wed, Apr 21, 2021 at 2:33 PM Keith Fiske <keith.fiske@crunchydata.com> wrote:


On Wed, Apr 21, 2021 at 5:05 PM Wells Oliver <wells.oliver@gmail.com> wrote:
Thank you... so to make it explicit, this table has 15169372 rows, autovacuum_analyze_scale_factor is .05, and autovacuum_analyze_threshold is 10, so: 15169372 * .05 + 10, which is 758478.60 -- so that means auto-analyze will only fire when that number of rows has been deleted or updated?



Correct. One thing that can be troublesome with larger tables and the scale factor is that, as time goes on, analyze and vacuum run less and less often unless your write rate also increases along with the size. So that's why larger tables can sometimes benefit from individual tuning and setting scale factor to zero.

To figure out what to set the threshold to, I typically make a simple script to do a csv copy output with psql for the pg_stat_user_tables entry for the table(s) involved on like an hourly interval for 2-3 weeks. Then pull that into a spreadsheet or import it back into a table. Then figure out the hourly & daily differences for n_tup_upd + n_tup_del. Then try and set the thresholds to have autovac run at least once or twice a day and analyze slightly more often.

Note that prior to PG 13, insert only tables would never cause autovac to kick in. 13 now has autovacuum_vacuum_insert_scale_factor & autovacuum_vacuum_insert_threshold.

 
On Wed, Apr 21, 2021 at 1:48 PM Keith Fiske <keith.fiske@crunchydata.com> wrote:


On Wed, Apr 21, 2021 at 4:06 PM Wells Oliver <wells.oliver@gmail.com> wrote:
A last_autoanalyze date in pg_stat_all_tables for a table we update every day is 3/26, another is 3/13 -- I feel strongly somehow these tables should be auto-analyzed and auto-vacuumed more often, and am wondering where to look first.

My autovacuum_analyze_threshold is 10, my autovacuum_vacuum_threshold is 25. 

How can I better understand why these tables are not more frequently being auto-analyzed and vacuumed?

Thank you.

--

The threshold settings are only part of the calculation that determines when autovacuum kicks in. There's also autovacuum_vacuum_scale_factor and autovacuum_analyze_scale_factor. The scale factors say that when this fraction of the table has been changed (updates/deletes) then run the relevant operation (vacuum or analyze).

The threshold values are added on top of the scale factor values. So, what determines whether your analyze kicks in is actually

(autovacuum_analyze_scale_factor * total row count) + autovacuum_analyze_threshold

The reason it's like this is to avoid autovacuum kicking in excessively on very small tables when a small % changes. Ex. 10 rows on a 100 row table constantly kicking in autovacuum when scale factor is set to .10. So add on 100-500 for the threshold so autovacuum kicks in when a good number of rows actually changes.

If you want a specific number of row changes to cause autovacuum to kick in, you can set the scale factor to zero for that specific table. I only really find that necessary myself on rather large tables though (10s of millions of rows).

--
Keith Fiske
Senior Database Engineer
Crunchy Data - http://crunchydata.com


--


--
Keith Fiske
Senior Database Engineer
Crunchy Data - http://crunchydata.com


--


--
Keith Fiske
Senior Database Engineer
Crunchy Data - http://crunchydata.com

pgsql-admin by date:

Previous
From: Wells Oliver
Date:
Subject: Re: Diagnosing outdated auto analyze dates...
Next
From: Keith Fiske
Date:
Subject: Re: Diagnosing outdated auto analyze dates...