Proposal: autovacuum_max_queue_depth - Mailing list pgsql-hackers

From Chris Travers
Subject Proposal: autovacuum_max_queue_depth
Msg-id CAN-RpxB6qAzjUXk_JMTvJJO8Rmvu7p8KxuMip_xYT=WX_3OMVw@mail.gmail.com
List pgsql-hackers
Hi everyone.  I would like to flesh this out and gather feedback before creating a patch.

The Problem

In large production systems, problems often arise when autovacuum is not tuned aggressively enough.  This leads to long autovacuum runs when they happen, and autovacuum eventually causes problems of its own.  A major difficulty is that you cannot simply make autovacuum more aggressive, because that produces very long autovacuum queues, which means very hot tables might not be vacuumed before they become close to unusable.

Example:

Imagine you have a 2TB database with 6000 tables, ranging from a few MB to 100GB each.  You want to tune autovacuum to be more aggressive and know you can handle 5 workers in parallel, so you set autovacuum_vacuum_scale_factor to a much lower value.

On the next autovacuum run, autovacuum detects that 3000 tables need to be vacuumed, and so creates 5 queues of 600 tables each.  Nothing is added to a queue until that queue empties completely.

In my experience, analyze has not posed the same problem, but my solution would fold it in as well.

Current Workarounds

1.  Periodically kill autovacuum sessions, forcing queue recalculation.
2.  Manually prevacuum everything that exceeds desired thresholds.


Proposed Solution

I would propose a new GUC variable, autovacuum_max_queue_depth, defaulting to 0 (no limit).

When autovacuum starts a run, it would sort the tables with n_dead_tup > 0 according to the following formula:

(n_dead_tup - autovac_threshold) / (n_dead_tup + n_live_tup) - autovacuum_scale_factor * n_dead_tup / (n_live_tup + n_dead_tup)
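As a rough sketch of the proposed ordering, here is how the score could be computed and used to sort tables.  All names here are illustrative (this is not actual PostgreSQL code, and `vacuum_priority` is a hypothetical helper):

```python
# Hypothetical sketch of the proposed priority score. Tables with the
# highest score would be vacuumed first; names are illustrative only.

def vacuum_priority(n_dead_tup, n_live_tup, autovac_threshold, scale_factor):
    """Score a table for the proposed autovacuum queue ordering."""
    total = n_dead_tup + n_live_tup
    if n_dead_tup <= 0 or total == 0:
        return None  # table is not considered for vacuuming
    return ((n_dead_tup - autovac_threshold) / total
            - scale_factor * n_dead_tup / total)

# (dead tuples, live tuples) for a few made-up tables
tables = {
    "hot_small":  (90_000, 10_000),        # 90% dead: should sort first
    "warm_mid":   (200_000, 1_800_000),    # 10% dead
    "cold_large": (500_000, 99_500_000),   # 0.5% dead: should sort last
}

scored = sorted(
    ((name, vacuum_priority(dead, live, autovac_threshold=50,
                            scale_factor=0.05))
     for name, (dead, live) in tables.items()),
    key=lambda t: t[1],
    reverse=True)
```

With these numbers the small, hot table sorts ahead of the large, mostly-clean one, which is exactly the behavior the proposal is after.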

For analyze runs, the number of inserts since the last analyze would be added to n_dead_tup.

Then the top autovacuum_max_queue_depth rows would be added to each autovacuum worker's queue.

In the scenario presented above, if autovacuum_max_queue_depth were set to, say, 10, each autovacuum worker would exit after vacuuming 10 tables and be started again the next time autovacuum wakes up.
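The queue-capping behavior could be sketched as follows, again with hypothetical names (`build_worker_queue` is not a real PostgreSQL function):

```python
# Illustrative sketch of capping one worker's queue. After processing at
# most max_queue_depth tables the worker exits, so the priority list is
# recomputed from fresh statistics on the next autovacuum wakeup.

def build_worker_queue(scored_tables, max_queue_depth):
    """Return the tables one worker should process this run.

    scored_tables:   list of (table_name, score), highest score first.
    max_queue_depth: the proposed GUC; 0 means no limit
                     (i.e. the current behavior).
    """
    if max_queue_depth <= 0:
        return [name for name, _ in scored_tables]
    return [name for name, _ in scored_tables[:max_queue_depth]]
```

The point of the cap is not to do less work overall, but to force frequent re-sorting so that hot tables re-enter the queue quickly instead of waiting behind hundreds of cold ones.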

The goal here is to ensure that very hot tables rise to the top of the queue and are vacuumed frequently, even after setting autovacuum to be far more aggressive on a large production database.

Thoughts?  Feedback?  Waiting for a patch?

--
Best Regards,
Chris Travers
Head of Database

Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com 
Saarbrücker Straße 37a, 10405 Berlin
