Re: pg_autovacuum next steps - Mailing list pgsql-hackers
From | Joe Conway |
---|---|
Subject | Re: pg_autovacuum next steps |
Date | |
Msg-id | 405F7671.7070705@joeconway.com Whole thread Raw |
In response to | Re: pg_autovacuum next steps ("Matthew T. O'Connor" <matthew@zeut.net>) |
Responses |
Re: pg_autovacuum next steps
|
List | pgsql-hackers |
Matthew T. O'Connor wrote: > * Inability to customize thresholds on a per table basis I ran headlong into this one. IMHO fixing this is critical. > * Inability to set default thresholds on a per database basis > * Inability to exclude specific databases / tables from pg_autovacuum > monitoring These would be nice to have, but less critical than #1 I think. > * Inability to schedule vacuums during off-peak times This would be *really* nice to have. In my recent case, if pg_autovacuum could work for say 3 minutes, and then back off for 2 minutes or so while the batch transactions hit, it would be ideal. > I'm not sure how to address all of these concerns, or that they all > should be addressed right now. One of my big questions is backend > integration. I am leaning towards leaving pg_autovacuum as a client > application in contrib for one more release. During this time, I can > continue to tweak and improve pg_autovacuum so that we will have a very > good idea what the final product should be before we make it a standard > backend process. I really think pg_autovacuum ought to get folded into the backend now, for 7.5. I haven't had time yet to read the entire thread, but I saw others making the same comment. It would make some of the listed problems go away, or at least become far easier to deal with. > For PostgreSQL 7.5, I plan to implement these new features: > > 1.Per database defaults and per table thresholds (including total > exclusion) Great! > 2.Persistent data > 3.Single-Pass Mode (external scheduling from cron etc...) > 4.Off peak scheduling Great again! > 1. Per Database defaults and Per table Thresholds: > > 1.Store config data inside a special pg_autovacuum table inside > existing databases that wants custom settings. A natural if folded into the backend. > 3.Single-Pass Mode (External Scheduling): > > I have received requests to be able to run pg_autovacuum only on request > (not as a daemon) making only one pass over all the tables (not looping > indefinately). The advantage being that it will operate more like the > current vacuum command except that it will only vacuum tables that need > to be vacuumed. This feature could be useful as long as pg_autovacuum > exists outside the backend. If pg_autovacuum gets integrated into the > backend and gets automatically started as a daemon during startup, then > this option will no longer make sense. It still might make sense. You could have a mode where the daemon essentially sleeps forever, until explicitly woken up by a signal. When woken, it makes one pass, and goes back to infinite sleep. Then provide a simple way to signal the autovacuum process -- maybe an extension of the current VACUUM syntax. > 4.Off-Peak Scheduling: > > A fundamental advantage of our vacuum system is that the work required > to reclaim table space is taken out of the critical path and can be > moved to and off-peak time when cycles are less precious. One of the > drawbacks of the current pg_autovacuum is that it doesn't have any way > to factor this in. > > In it's simplest form (which I will implement first) I would add the > ability to add a second set of thresholds that will be active only > during an “off-peak” time that can be specified in the pg_autovacuum > database, perhaps in a general_settings table. I don't know how this would work, but it is for sure important. In the recent testing I found that pg_autovacuum (well, lazy vacuum in general, but I was using pg_autovacuum to control it) made a huge difference in performance of batch transactions. They range from 4-5 seconds without vacuum running, to as high as 15 minutes with vacuum running. With the vacuum delay patch, delay = 1, pagecount = 8, I still saw times go as high as 10 minutes. Backing vacuum off any more than that caused it to fall behind the transaction rate unrecoverably. But as I said above, if the transactions could complete without vacuum running in 4-5 seconds, then vacuuming resumes for the 3-to-4 minutes between batches, all would be well. Joe
pgsql-hackers by date: