Re: scheduling autovacuum at lean hours only. - Mailing list pgsql-performance

From Ben Chobot
Subject Re: scheduling autovacuum at lean hours only.
Date
Msg-id Pine.LNX.4.64.0902110955110.10022@localhost.localdomain
Whole thread Raw
In response to Re: scheduling autovacuum at lean hours only.  (Scott Carey <scott@richrelevance.com>)
List pgsql-performance
On Wed, 11 Feb 2009, Scott Carey wrote:

> On a large partitioned database, ordinary vacuum is a very very difficult option.
>
> Most of the time on such a system, most tables are dormant with respect to writes and never need to be vacuumed.  A
'dailyvacuum' is likely to take a full day to run on larger systems. Since ordinary vacuum can't be run on subsets of
tableswithout explicitly naming them one at a time (can't just vacuum a schema, tablespace, or use a regex to match
tablenames), good luck using it effectively if you have a few thousand tables in partitions.  You'll have to have
applicationcode or a script with knowledge of all the partition names and which are in need of an analyze/vacuum. 
>
> Autovacuum is good enough in recent versions to be tuned to have very low impact though.  If you have to, rather than
stopand start it, just turn the delay or cost settings up and down during different parts of the day.  More than likely
however,it will be able to keep up with a single set of settings. 
> In particular, rather than making the delay longer, make the appropriate cost larger -- page miss or page dirty
affecthow much I/O it will do, and page hit will mostly affect how much CPU it uses. 
>
> Perhaps a feature request is to have a version of the manual vacuum command that doesn't bother running on tables
thatautovacuum wouldn't touch due to insufficient data churn.  This way, at lean hours one can manually vacuum to help
anautovacuum that was tuned for very low impact 'catch up'. 
> Also, if there was some way to make vacuum not stink so badly on tables that were just loaded with pg_load, where it
causeshuge write activity for tables that clearly have no bloat (I believe this is a hint bits thing?). 

Oh, I agree with everything you say. I'm just pointing out that if you
really do want control over when things get vacuumed (e.g. you have a
mostly-read database 20 hours a day and then 4 hours of heavy churn at
night) then you can still do that if you want.

pgsql-performance by date:

Previous
From: Rajesh Kumar Mallah
Date:
Subject: Re: scheduling autovacuum at lean hours only.
Next
From: Brad Nicholson
Date:
Subject: Re: scheduling autovacuum at lean hours only.