Re: Second attempt, roll your own autovacuum - Mailing list pgsql-general
From: Christopher Browne
Subject: Re: Second attempt, roll your own autovacuum
Date:
Msg-id: 87ac1kkmn5.fsf@wolfe.cbbrowne.com
In response to: Second attempt, roll your own autovacuum (Glen Parker <glenebob@nwlink.com>)
Responses: Re: Second attempt, roll your own autovacuum
List: pgsql-general
In an attempt to throw the authorities off his trail, tgl@sss.pgh.pa.us (Tom Lane) transmitted:
> Glen Parker <glenebob@nwlink.com> writes:
>> I am still trying to roll my own auto vacuum thingy.
>
> Um, is this purely for hack value?  What is it that you find inadequate
> about regular autovacuum?  It is configurable through the pg_autovacuum
> catalog --- which I'd be the first to agree is a sucky user interface,
> but we're not going to set the user interface in concrete until we are
> pretty confident it's feature-complete.  So: what do you see missing?

I think that about a year ago I proposed a more sophisticated approach to
autovacuum; one part of it was to set up a "request queue," a table where
vacuum requests would get added.

There's some "producer" side stuff:

- There could be tables you want to vacuum exceedingly frequently; those
  could get added periodically via something shaped like cron.

- One could ask for all the tables in a given database to be added to the
  queue, so as to mean that all tables would get vacuumed every so often.

- You might even inject requests "quasi-manually", asking for the queue to
  do work on particular tables.

There's some "policy side" stuff:

- Rules might be put in place to eliminate certain tables from the queue,
  providing some intelligence as to what oughtn't get vacuumed.

Then there's the "consumer":

- The obvious "dumb" approach is simply to have one connection that runs
  through the queue, pulling the eldest entry, vacuuming, and marking it
  done.

- The obvious extension is that if a table is listed multiple times in the
  queue, it only needs to be processed once.

- There might be time-based exclusions to the effect that large tables
  oughtn't be processed during certain periods (backup time?).

- One might have *two* consumers: one that will only process small tables,
  so that those little, frequently updated tables can get handled quickly,
  and another consumer that does larger tables.  Or perhaps one that knows
  it's fine, between 04:00 and 09:00 UTC, to have 6 consumers and blow
  through a lot of larger tables simultaneously.  After all, changes in
  8.2 mean that concurrent vacuums don't block one another from cleaning
  out dead content.

I went as far as scripting up the simplest form of this, with an
"injector", a queue, and the "dumb consumer."  I gave up because it wasn't
that much better than what we already had.
--
output = reverse("moc.liamg" "@" "enworbbc")
http://linuxfinances.info/info/
Minds, like parachutes, only function when they are open.
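The scripts described above are not attached to the message; as a rough
illustration only, the "request queue" idea might be laid out in SQL along
the following lines (the table and column names, such as vacuum_queue and
relname, are hypothetical and not taken from the original scripts):

    -- Illustrative queue table: one row per vacuum request.
    CREATE TABLE vacuum_queue (
        id        serial PRIMARY KEY,
        relname   text NOT NULL,                     -- table to be vacuumed
        requested timestamptz NOT NULL DEFAULT now()
    );

    -- "Injector": request a vacuum of a frequently-updated table,
    -- typically run from something like cron.
    INSERT INTO vacuum_queue (relname) VALUES ('public.busy_table');

    -- "Dumb consumer": the driving script repeatedly pulls the eldest
    -- entry, runs VACUUM on that table (outside any transaction block),
    -- and then clears every queued request for it, so a table listed
    -- multiple times is processed only once.
    SELECT relname FROM vacuum_queue ORDER BY requested LIMIT 1;
    -- ... the script then issues:  VACUUM ANALYZE public.busy_table;
    DELETE FROM vacuum_queue WHERE relname = 'public.busy_table';

A "policy" layer or a small/large-table split of consumers could be added
on top of the same queue, for instance by filtering candidate tables on
their size in pg_class before vacuuming.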