Re: Still recommending daily vacuum... - Mailing list pgsql-hackers

From Kevin Grittner
Subject Re: Still recommending daily vacuum...
Date
Msg-id 468E622F.EE98.0025.0@wicourts.gov
Whole thread Raw
In response to Re: Still recommending daily vacuum...  (Alvaro Herrera <alvherre@commandprompt.com>)
Responses Re: Still recommending daily vacuum...  (Alvaro Herrera <alvherre@commandprompt.com>)
List pgsql-hackers
>>> On Fri, Jul 6, 2007 at  2:19 PM, in message
<20070706191912.GC15358@alvh.no-ip.org>, Alvaro Herrera
<alvherre@commandprompt.com> wrote:
> Kevin Grittner wrote:

> 2. The point of autovacuum is to get rid of maintenance burden, not add
> to it.  If you know which tables are small and frequently updated, then
> configure those to specific settings that you've found to be optimal,
> and then you don't need to worry about vacuuming them any longer.
We have 72 counties using the same schema, which has over 300 tables.
(Each has their own server, located in their county, with their data.)
Rather than trying to fine-tune autovacuum for each table in all locations,
we find it more convenient to use general settings which are aggressive
enough for the small, high-update tables, but lax enough to let the big
ones go until a nightly database vacuum.  (That time is pretty slack
anyway, so why not off-load the overhead of the vacuum to those hours?)
> If you didn't tune it to match specific tables,
> most likely your biggest tables never met the formula's condition, which
> is why you were seeing it affecting only the small tables (which met the
> condition under the values you configured server-wide).
>
> The extra I/O I was talking about would come from vacuuming one of your
> biggest tables, which could cause the amount of I/O to swamp everything
> else the server was doing at the time.
Is there something better about having the autovacuum compete with load
during the week, rather than doing a database vacuum during otherwise
idle weekend hours, immediately after the weekly delete of almost 2% of
the rows?  At the time we run the database vacuum analyze, there is
nothing else running to be swamped.
>> Our tables tend to fall into one of four categories, small tables with high
>> update rates, medium tables (millions or tens of millions of rows) with
>> thousands or tens of thousands of updates per day, static tables of various
>> sizes that are only modified as part of a software release, and big honking
>> tables (100s of GB) which are either insert-only or are insert with
>> periodic purge of old rows.  Only the first group has a chance of being
>> autovacuumed in normal operations.  Event he purges don't cause it to kick
>> in.
>
> It could certainly vacuum all your tables.
Well, sure, if we weren't doing a nightly database vacuum.  (Weekly for the
database containing the largest tables, mentioned above.)
This all started with the question about whether the documentation should
say anything about vacuum schedules other than "enable autovacuum."
My point was that I have a use case where I think that a scheduled vacuum
will be better than leaving everything to autovacuum.  I may not be the only
one, so I'm thinking the documentation should discuss where an explicit
schedule might be useful.
The changes sound good, but I don't see the point of having any vacuum
activity during the work week on the big tables in the database I was
describing.  It seems to me that it would result in at least some
performance degradation for the interactive users, and bloat the table,
since we might start inserting before the post-delete vacuum.
-Kevin



pgsql-hackers by date:

Previous
From: "Matthew T. O'Connor"
Date:
Subject: Re: Still recommending daily vacuum...
Next
From: Alvaro Herrera
Date:
Subject: Re: Still recommending daily vacuum...