Re: Autovacuum Improvements - Mailing list pgsql-general

From Christopher Browne
Subject Re: Autovacuum Improvements
Msg-id 87wt4ijbj4.fsf@wolfe.cbbrowne.com
In response to Re: Second attempt, roll your own autovacuum  (Richard Huxton <dev@archonet.com>)
List pgsql-general
After takin a swig o' Arrakan spice grog, nagy@ecircle-ag.com (Csaba Nagy) belched out:
> On Thu, 2006-12-21 at 18:41, Alvaro Herrera wrote:
>> > From all the discussion here I think the most benefit would result from
>> > a means to assign tables to different categories, and set up separate
>> > autovacuum rules per category (be it time window when vacuuming is
>> > allowed, autovacuum processes assigned, cost settings, etc). I doubt you
>> > can really define upfront all the vacuum strategies you would need in
>> > real life, so why not let the user define it ? Define the categories by
>> > assigning tables to them, and the rules per category. Then you can
>> > decide what rules to implement, and what should be the defaults...
>>
>> Hmm, yeah, I think this is more or less what I have in mind.
>
> Cool :-)
>
> Can I suggest to also consider the idea of some kind of autovacuum
> process group, with settings like:
>
>  - number of processes running in parallel;
>  - time windows when they are allowed to run;
>
> Then have the table categories with all the rest of the
> threshold/cost/delay settings.
>
> Then have the possibility to assign tables to categories, and to assign
> categories to processing groups.
>
> I think this would allow the most flexibility with the minimum of
> repetition in settings (from the user perspective).

Seems to me that you could get ~80% of the way by having the simplest
"2 queue" implementation, where tables with size < some threshold get
thrown at the "little table" queue, and tables above that size go to
the "big table" queue.

That should keep any small tables from getting "vacuum-starved."
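A minimal sketch of that 2-queue split, in Python purely for illustration. The Table type, the 100 MB cutoff, and the function name are all assumptions made up for this example; the idea above only says "some threshold", and nothing here reflects how autovacuum is actually implemented.

```python
from collections import deque
from typing import Deque, Iterable, NamedTuple, Tuple


class Table(NamedTuple):
    """Hypothetical record: a table name and its on-disk size."""
    name: str
    size_bytes: int


# Assumed cutoff for illustration only; the right value is workload-specific.
SIZE_THRESHOLD_BYTES = 100 * 1024 * 1024


def split_into_queues(
    tables: Iterable[Table],
    threshold: int = SIZE_THRESHOLD_BYTES,
) -> Tuple[Deque[Table], Deque[Table]]:
    """Route each table to the "little table" or "big table" queue by size."""
    little: Deque[Table] = deque()
    big: Deque[Table] = deque()
    for table in tables:
        # Strictly below the threshold counts as little; everything else is big.
        if table.size_bytes < threshold:
            little.append(table)
        else:
            big.append(table)
    return little, big
```

Each queue would then be drained by its own worker, so a long vacuum of a big table never blocks the little ones.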

I'd think the next step would be to increase the number of queues,
perhaps in a time-based fashion.  There might be times when it's
acceptable to vacuum 5 tables at once, so you burn thru little tables
"like the blazes," and handle larger ones fairly promptly.  And other
times when you don't want to do *any* big tables, and limit a single
queue to just the itty bitty ones.
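One way to picture the time-based variant, again as a hedged Python sketch. The hours, the worker counts, and the names Policy, policy_for_hour and pick_batch are invented for the example; a real schedule would come from whatever the DBA considers quiet time.

```python
from typing import List, NamedTuple, Sequence


class Policy(NamedTuple):
    """Hypothetical per-time-window setting."""
    workers: int      # how many tables may be vacuumed at once
    allow_big: bool   # whether big tables may be vacuumed in this window


def policy_for_hour(hour: int) -> Policy:
    """Assumed schedule: wide open overnight, one little-table queue by day."""
    if not 0 <= hour <= 23:
        raise ValueError("hour must be in 0..23")
    if hour < 6:
        return Policy(workers=5, allow_big=True)
    return Policy(workers=1, allow_big=False)


def pick_batch(
    little: Sequence[str],
    big: Sequence[str],
    policy: Policy,
) -> List[str]:
    """Choose which tables to vacuum now, given the waiting queues."""
    slots = policy.workers
    batch: List[str] = []
    # Reserve one slot for a big table when the window permits it and
    # there is more than one slot, so little tables are never shut out.
    if policy.allow_big and big and slots > 1:
        batch.append(big[0])
        slots -= 1
    batch.extend(little[:slots])
    return batch
```

With the assumed schedule, a 3 a.m. run picks one big table plus up to four little ones, while a 2 p.m. run picks a single little table and leaves the big queue alone.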

This approach allows you to stay mostly heuristic-based, as opposed to
having to describe policies in gratuitous detail.

A mechanism that demands enormous DBA effort, and that carries a
considerable risk of simple configuration errors that are hard to
notice, may not be the best kind of "feature" :-).
--
let name="cbbrowne" and tld="gmail.com" in name ^ "@" ^ tld;;
http://linuxdatabases.info/info/slony.html
"You can measure a programmer's perspective by noting his attitude on
the continuing vitality of FORTRAN." -- Alan Perlis
