Re: scheduling autovacuum at lean hours only. - Mailing list pgsql-performance

From Brad Nicholson
Subject Re: scheduling autovacuum at lean hours only.
Date
Msg-id 1234375203.6334.196.camel@bnicholson-desktop
In response to Re: scheduling autovacuum at lean hours only.  (Rajesh Kumar Mallah <mallah.rajesh@gmail.com>)
Responses Re: scheduling autovacuum at lean hours only.  (Rajesh Kumar Mallah <mallah.rajesh@gmail.com>)
List pgsql-performance
On Wed, 2009-02-11 at 22:57 +0530, Rajesh Kumar Mallah wrote:
> On Wed, Feb 11, 2009 at 10:03 PM, Grzegorz Jaśkiewicz <gryzman@gmail.com> wrote:
> > On Wed, Feb 11, 2009 at 2:57 PM, Rajesh Kumar Mallah
> > <mallah.rajesh@gmail.com> wrote:
> >
> >>> vacuum_cost_delay = 150
> >>> vacuum_cost_page_hit = 1
> >>> vacuum_cost_page_miss = 10
> >>> vacuum_cost_page_dirty = 20
> >>> vacuum_cost_limit = 1000
> >>> autovacuum_vacuum_cost_delay = 300
> >>
> >> why is it not a good idea to give end users control over when they
> >> want to run it ?
> >
> > Effectively, you have control over autovacuum via these params.
> > You have to remember, that autovacuum doesn't cost much, and it makes
> > planner know more about data.
> > It's not there to clean up databases, as you might imagine - it is
> > there to update stats, and mark pages as free.
> >
> > So make sure you tweak that config first, because I have a funny
> > feeling that you just think that vacuuming bogs down your machine, and
> > _can_ be turned off without any bad consequences, which is simply not
> > true.
>
> our usage pattern is such that peak activity (indicated by load average)
> during day time is 10 times that of night hours. Autovacuum just puts
> more pressure on the system. If a less stressful version is used, then
> it will take longer to complete one cycle, which would mean less
> performance for a longer time. Less performance queues up queries
> and encourages people to resubmit their queries, which again
> adds to bogging down the system.

That's not exactly how it works in practice, if tuned properly.  It may
take longer, but it is less intensive while running.

We had one system that had spikes happening due to the exact case you
described - there were noticeably high IO wait times while certain
tables were being vacuumed.  We set the cost delay and the wait times
dropped to the point where it was a non-issue.  Vacuums take twice as
long, but there is no measurable impact on performance.
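
To make the trade-off concrete, here is a rough sketch of how the
cost-based delay bounds vacuum's page rate.  It is a simplified model,
not PostgreSQL's actual code: vacuum adds up a cost per page, and once
the total reaches vacuum_cost_limit it sleeps for vacuum_cost_delay
milliseconds.  The function name and the work_ms_per_page parameter are
made up for illustration, and the 8 kB page size is the default, assumed
here.

```python
def throttled_pages_per_second(cost_limit, cost_delay_ms, cost_per_page,
                               work_ms_per_page=0.0):
    """Approximate page rate of a vacuum under cost-based delay.

    Simplified model (an assumption, not PostgreSQL source): vacuum
    processes pages until the accumulated cost reaches cost_limit, then
    sleeps for cost_delay_ms.  work_ms_per_page is a hypothetical knob
    for the time spent on the page itself; with the default of 0 the
    result is an upper bound.
    """
    if cost_delay_ms <= 0:
        # A delay of 0 turns cost-based throttling off, so there is no
        # bound to compute in this model.
        raise ValueError("cost_delay_ms must be positive")
    pages_per_round = cost_limit / cost_per_page
    round_ms = cost_delay_ms + pages_per_round * work_ms_per_page
    return pages_per_round * 1000.0 / round_ms


# Settings quoted earlier in the thread, with every page a cache miss
# (vacuum_cost_page_miss = 10).
rate = throttled_pages_per_second(cost_limit=1000, cost_delay_ms=150,
                                  cost_per_page=10)
mb_per_second = rate * 8192 / 1_000_000  # assumes 8 kB pages
print(f"{rate:.0f} pages/s, about {mb_per_second:.1f} MB/s")

# Doubling the delay halves the bound, i.e. the vacuum runs about twice
# as long but asks for half the IO per second.
slower = throttled_pages_per_second(cost_limit=1000, cost_delay_ms=300,
                                    cost_per_page=10)
```

Under these assumptions the quoted settings cap vacuum at roughly 667
page reads per second, and doubling the delay halves that, which lines
up with vacuums taking about twice as long.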

> In our case I feel the hardware is a bit underscaled compared to the
> load; that's why I think running in lean hours is the best of both worlds:
> no performance sacrifices and intelligent vacuuming.

That is a different issue altogether.

Not vacuuming a running system at all during peak hours is not
considered intelligent vacuuming IMHO.  There are plenty of use cases
where small, frequent vacuums keep tables under control at a very low
cost.  Letting them go for extended periods of time without vacuuming
causes bloat and eventual slowdowns in table access, which show up as
higher IO usage across the board.

If you really are dead set on vacuuming only at night, you may want to
do a careful analysis of which tables need to be vacuumed and when, and
trigger manual vacuums from cron.
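
As a starting point for that analysis, something like the sketch below
could pick the tables for a nightly run.  It assumes you feed it rows of
(schemaname, relname, n_live_tup, n_dead_tup) taken from
pg_stat_user_tables; the function names and both thresholds are made up
for illustration and would need tuning against your own tables.

```python
def quote_ident(name):
    """Double-quote an SQL identifier, doubling any embedded quotes."""
    return '"' + name.replace('"', '""') + '"'


def nightly_vacuum_statements(stats, min_dead=1000, min_dead_fraction=0.2):
    """Build VACUUM ANALYZE statements for tables with many dead rows.

    stats: iterable of (schemaname, relname, n_live_tup, n_dead_tup),
    e.g. fetched with
        SELECT schemaname, relname, n_live_tup, n_dead_tup
        FROM pg_stat_user_tables;
    min_dead and min_dead_fraction are hypothetical thresholds.
    """
    picked = []
    for schema, table, live, dead in stats:
        total = live + dead
        if total == 0 or dead < min_dead:
            continue
        if dead / total >= min_dead_fraction:
            stmt = f"VACUUM ANALYZE {quote_ident(schema)}.{quote_ident(table)};"
            picked.append((dead, stmt))
    # Tables with the most dead rows go first.
    picked.sort(key=lambda item: item[0], reverse=True)
    return [stmt for _, stmt in picked]


# Made-up sample rows, only to show the shape of the input.
sample = [
    ("public", "orders", 9000, 3000),
    ("public", "lookup", 500, 10),
    ("public", "events", 100000, 5000),
]
for statement in nightly_vacuum_statements(sample):
    print(statement)
```

The printed statements could then be fed to psql from a cron job at
night.  Note that any table this skips gets no vacuum at all under a
night-only scheme, so the thresholds deserve some care.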

--
Brad Nicholson  416-673-4106
Database Administrator, Afilias Canada Corp.

