Thread: how to plan for vacuum?
Hi, For I can not find too much information about how to use vacuum, I want to ask some general information about the guideline of vacuum planning. 1. How do we know if autovacuum is enough for my application, or should I setup a vacuum manually from cron for my application? 2. How to set the GUC parameters for autovacuum? There are two sets of parameters for autovacuum: - vacuum threshold and scale factor (500/0.2) - analyze threshold and scale factor(250/0.1) Is there any guideline to set these parameters? When does it need to change the default values? 3. How to tune cost-based delay vacuum? I had searched in performance list; it seems that most of the practices are based on experience / trial-and-error approach to meet the requirement of disk utilization or CPU utilization. Is there any other guild line to set them? For when autovacuum is turned on by default, if the parameters for vacuum have not been set well, it will make the system rather unstable. So I just wonder if we should setup a section in the manual about the tips of vacuum, then many users can easily set the vacuum parameters for their system. Best Regards Galy Lee NTT OSS Center
Just have one example here: workload: run pgbench in 365x24x7 database size: 100GB the workload distribution: 06:00-24:00 100tps 00:00-06:00 20tps how should we plan vacuum for this situation to get the highest performance? Best regards Galy Galy Lee wrote: > Hi, > > For I can not find too much information about how to use vacuum, I want > to ask some general information about the guideline of vacuum planning. > > 1. How do we know if autovacuum is enough for my application, or should > I setup a vacuum manually from cron for my application? > > 2. How to set the GUC parameters for autovacuum? > There are two sets of parameters for autovacuum: > - vacuum threshold and scale factor (500/0.2) > - analyze threshold and scale factor(250/0.1) > Is there any guideline to set these parameters? When does it need to > change the default values? > > 3. How to tune cost-based delay vacuum? > I had searched in performance list; it seems that most of the practices > are based on experience / trial-and-error approach to meet the > requirement of disk utilization or CPU utilization. Is there any other > guild line to set them? > > For when autovacuum is turned on by default, if the parameters for > vacuum have not been set well, it will make the system rather unstable. > So I just wonder if we should setup a section in the manual about the > tips of vacuum, then many users can easily set the vacuum parameters for > their system.
On Wed, Jan 24, 2007 at 02:37:44PM +0900, Galy Lee wrote: > 1. How do we know if autovacuum is enough for my application, or should > I setup a vacuum manually from cron for my application? Generally I trust autovac unless there's some tables where it's critical that they be vacuumed frequently, such as a queue table or a web session table. > 2. How to set the GUC parameters for autovacuum? > There are two sets of parameters for autovacuum: > - vacuum threshold and scale factor (500/0.2) > ?$B!! - analyze threshold and scale factor(250/0.1) > Is there any guideline to set these parameters? When does it need to > change the default values? I find those are generally pretty good starting points; just bear in mind that it means 20% dead space. > 3. How to tune cost-based delay vacuum? > I had searched in performance list; it seems that most of the practices > are based on experience / trial-and-error approach to meet the > requirement of disk utilization or CPU utilization. Is there any other > guild line to set them? Unless you have a means for the database to monitor IO usage on it's own, I don't know that we have a choice... I'll generally start with a cost delay of 20ms and adjust based on IO utilization. -- Jim Nasby jim@nasby.net EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)
Jim C. Nasby wrote: > I'll generally start with a cost delay of 20ms and adjust based on IO > utilization. I've been considering set a default autovacuum cost delay to 10ms; does this sound reasonable? -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
On Thu, Jan 25, 2007 at 12:52:02AM -0300, Alvaro Herrera wrote: > Jim C. Nasby wrote: > > > I'll generally start with a cost delay of 20ms and adjust based on IO > > utilization. > > I've been considering set a default autovacuum cost delay to 10ms; does > this sound reasonable? For a lightly loaded system, sure. For a heavier load that might be too much, but of course that's very dependent on not only your hardware, but how much you want vacuum to interfere with normal operations. Though, I'd say as a default it's probably better to be more aggressive rather than less. Also, it might be better to only set autovac_cost_delay by default; presumably if someone's running vacuum by hand they want it done pronto. -- Jim Nasby jim@nasby.net EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)
Jim C. Nasby wrote: > On Wed, Jan 24, 2007 at 02:37:44PM +0900, Galy Lee wrote: >> 1. How do we know if autovacuum is enough for my application, or should >> I setup a vacuum manually from cron for my application? > > Generally I trust autovac unless there's some tables where it's critical > that they be vacuumed frequently, such as a queue table or a web session > table. So how much can we trust autovac? I think at least the following cases can not be covered by autovac now: - small but high update tables which are sensitive to garbage - very big tables which need a long time to be vacuumed. - when we need to adjust the the max_fsm_page >> 2. How to set the GUC parameters for autovacuum? >> There are two sets of parameters for autovacuum: >> - vacuum threshold and scale factor (500/0.2) >> ?$B!! - analyze threshold and scale factor(250/0.1) >> Is there any guideline to set these parameters? When does it need to >> change the default values? > > I find those are generally pretty good starting points; just bear in > mind that it means 20% dead space. so what is the principle to set them? - keep dead space lower than some disk limit - or keep the garbage rate lower than fillfactor or any other general principle?
On Thu, Jan 25, 2007 at 07:29:20PM +0900, Galy Lee wrote: > so what is the principle to set them? > - keep dead space lower than some disk limit > - or keep the garbage rate lower than fillfactor > or any other general principle? How do you measure "dead space" and "garbage rate?" I'm a newbe, I don't even know what these terms mean, but if I can measure them, perhaps it will gel, and really if you can't measure the effect of a setting change, what have you got? I would hope any discussion on autovac parms would include some metric evaluation techniques. Thanks.
Please cc the list so others can reply as well... On Thu, Jan 25, 2007 at 08:45:50AM +0100, Tomas Vondra wrote: > > On Wed, Jan 24, 2007 at 02:37:44PM +0900, Galy Lee wrote: > >> 1. How do we know if autovacuum is enough for my application, or should > >> I setup a vacuum manually from cron for my application? > > > > Generally I trust autovac unless there's some tables where it's critical > > that they be vacuumed frequently, such as a queue table or a web session > > table. > > You can tune thresholds and scale factors for that particular table > using pg_autovacuum. If you lower them appropriately, the vacuum will be > fired more often for that table - but don't lower them too much, just go > step by step until you reach values that are fine for you. That doesn't work well if autovac gets tied up vacuuming a very large table. Granted, when that happens there are considerations about the long-running vacuum transaction (prior to 8.2), but in many systems you'll still get some use out of other vacuums. -- Jim C. Nasby, Database Architect jim@nasby.net 512.569.9461 (cell) http://jim.nasby.net
Alvaro Herrera wrote: > Jim C. Nasby wrote: > >> I'll generally start with a cost delay of 20ms and adjust based on IO >> utilization. > > I've been considering set a default autovacuum cost delay to 10ms; does > this sound reasonable? It really depends on the system. Most of our systems run anywhere from 10-25ms. I find that any more than that, Vacuum takes too long. Joshua D. Drake -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate PostgreSQL Replication: http://www.commandprompt.com/products/
On Thu, Jan 25, 2007 at 08:04:49AM -0800, Joshua D. Drake wrote: > > It really depends on the system. Most of our systems run anywhere from > 10-25ms. I find that any more than that, Vacuum takes too long. How do you measure the impact of setting it to 12 as opposed to 15?
On Jan 25, 2007, at 10:33 AM, Ray Stell wrote: > On Thu, Jan 25, 2007 at 08:04:49AM -0800, Joshua D. Drake wrote: >> >> It really depends on the system. Most of our systems run anywhere >> from >> 10-25ms. I find that any more than that, Vacuum takes too long. > > > How do you measure the impact of setting it to 12 as opposed to 15? If you've got a tool that will report disk utilization as a percentage it's very easy; I'll decrease the setting until I'm at about 90% utilization with the system's normal workload (leaving some room for spikes, etc). Sometimes I'll also tune the costs if reads vs. writes are a concern. -- Jim Nasby jim@nasby.net EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)