Thread: RFC: changing autovacuum_naptime semantics
Hackers,

I want to propose some very simple changes to autovacuum in order to move forward (a bit):

1. autovacuum_naptime semantics
2. limiting the number of workers: global, per database, per tablespace?

I still haven't received the magic bullet to solve the hot table problem, but these at least mean we continue doing *something*.

Changing autovacuum_naptime semantics

Are we agreed on changing the autovacuum_naptime semantics?  The idea is to make it per-database instead of the current per-cluster, i.e., a "nap" would be the minimum time that passes between starting one worker in a database and starting another worker in the same database.

Currently, naptime is the time elapsed between two worker runs across all databases.  So if you have 15 databases, autovacuuming each one takes place only every 15*naptime.

Eventually, we could have a per-database naptime defined in pg_database, and do away with the autovacuum_naptime GUC param (or maybe keep it as a default value).  Say for database D1 you want a worker every 60 seconds, but for database D2 you want one every hour.

Question:
Is everybody OK with changing the autovacuum_naptime semantics?

Limiting the number of workers

I was originally proposing a GUC parameter which would limit the cluster-wide maximum number of workers.  Additionally, we could have a per-database limit (stored in a pg_database column), which would be simple to implement.

Josh Drake proposed getting rid of the GUC param, saying that it would confuse users to set the per-database limit to some value higher than the GUC setting and then find the lower limit enforced (presumably because they were unaware of it).  The problem is that we need to set up shared memory for the workers, so we really do need a hard limit, and it must be global.  Thus the GUC param is not optional.

Other people also proposed having a per-tablespace limit.  This would make a lot of sense, tablespaces being the natural I/O units.  However, I'm not sure it's easy to implement, because you can put half of database D1 and half of database D2 in tablespace T1, and the two other halves in tablespace T2.  Then enforcing the limit becomes rather complicated and will probably mean putting a worker to sleep.  I think it makes more sense to skip per-tablespace limits for now, and plan to add per-tablespace I/O throttles in the future.

Questions:
Is everybody OK with not putting a per-tablespace worker limit?
Is everybody OK with putting per-database worker limits in a pg_database column?

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
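To make the proposed semantics concrete, here is a minimal, self-contained C sketch of the kind of per-database check the launcher might make.  The names (DbEntry, may_start_worker) and the structure are illustrative assumptions, not actual launcher code.

#include <stdio.h>
#include <stdbool.h>
#include <time.h>

typedef struct
{
	const char *datname;
	time_t		last_worker_start;	/* when a worker last started in this database */
	int			naptime;			/* hypothetical per-database naptime, in seconds */
} DbEntry;

/*
 * Proposed semantics: a new worker may start in a database as soon as its
 * naptime has elapsed since the previous worker started there, regardless
 * of what is happening in other databases.
 */
static bool
may_start_worker(const DbEntry *db, time_t now)
{
	return difftime(now, db->last_worker_start) >= db->naptime;
}

int
main(void)
{
	time_t		now = time(NULL);
	DbEntry		dbs[] = {
		{"D1", now - 120, 60},		/* wants a worker every 60 seconds */
		{"D2", now - 120, 3600},	/* wants a worker every hour */
	};

	for (int i = 0; i < 2; i++)
		printf("%s: %s\n", dbs[i].datname,
			   may_start_worker(&dbs[i], now) ? "start a worker" : "keep napping");
	return 0;
}

Under the current per-cluster semantics, by contrast, each of those databases would only be visited once every (number of databases * naptime), which is the behaviour the proposal removes.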
Alvaro Herrera <alvherre@commandprompt.com> writes:
> Is everybody OK with changing the autovacuum_naptime semantics?

It seems already different from 8.2, so no objection to further change.

> Is everybody OK with not putting a per-tablespace worker limit?
> Is everybody OK with putting per-database worker limits in a pg_database
> column?

I don't think we need a new pg_database column.  If it's a GUC you can do ALTER DATABASE SET, no?  Or was that what you meant?

			regards, tom lane
Alvaro,

Alvaro Herrera wrote:
> I still haven't received the magic bullet to solve the hot table
> problem, but these at least mean we continue doing *something*.

May I ask what your plan or ideas for autovacuum improvement are for 8.3 now?  And also, what is the roadmap of autovacuum improvement for 8.4?

Thanks,
Galy Lee
lee.galy _at_ ntt.oss.co.jp
NTT Open Source Software Center
On Mar 7, 2007, at 4:00 PM, Alvaro Herrera wrote:
> Is everybody OK with putting per-database worker limits in a pg_database
> column?

I'm worried that we would live to regret such a limit.  I can't really see any reason to limit how many vacuums are occurring in a database, because there's no limiting factor there; you're either going to be I/O bound (per-tablespace), or *maybe* CPU-bound (perhaps the Greenplum folks could enlighten us as to whether they run into vacuum being CPU-bound on thumpers).

Changing the naptime behavior to be database-related makes perfect sense, because the minimum XID you have to worry about is a per-database thing; I just don't see limiting the number of vacuums as being per-database, though.

I'm also skeptical that we'll be able to come up with a good way to limit the number of backends until we get the hot table issue addressed.  Perhaps a decent compromise for now would be to limit how many 'small table' vacuums could run on each tablespace, and then limit how many 'unlimited table size' vacuums could run on each tablespace, where 'small table' would probably have to be configurable.  I don't think it's the best final solution, but it should at least solve the immediate need.

--
Jim Nasby                                            jim@nasby.net
EnterpriseDB      http://enterprisedb.com      512.569.9461 (cell)
Tom Lane wrote:
> Alvaro Herrera <alvherre@commandprompt.com> writes:
> > Is everybody OK with not putting a per-tablespace worker limit?
> > Is everybody OK with putting per-database worker limits in a pg_database
> > column?
>
> I don't think we need a new pg_database column.  If it's a GUC you can
> do ALTER DATABASE SET, no?  Or was that what you meant?

No, that doesn't work unless we save the datconfig column to the pg_database flatfile, because it's the launcher (which is not connected to any database) that needs to read it.  The same applies to a hypothetical per-database naptime.  The launcher would also need to parse it, which is not ideal (though not a dealbreaker either).

--
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
Galy Lee wrote:

Hi,

> Alvaro Herrera wrote:
> > I still haven't received the magic bullet to solve the hot table
> > problem, but these at least mean we continue doing *something*.
>
> May I ask what your plan or ideas for autovacuum improvement are for
> 8.3 now?  And also, what is the roadmap of autovacuum improvement for 8.4?

Things I want to do for 8.3:

- Make use of the launcher/worker stuff, that is, allow multiple autovacuum
  processes in parallel.  With luck we'll find out how to deal with hot tables.

Things I'm not sure we'll be able to have in 8.3, in which case I'll get to them early in 8.4:

- The maintenance window stuff, i.e., being able to throttle workers depending
  on a user-defined schedule.

8.4 material:

- Per-tablespace throttling, coordinating I/O from multiple workers.

I don't have anything else as detailed as a "plan".  If you have suggestions, I'm all ears.

Now regarding your restartable vacuum work.  I think that stopping a vacuum at some point and being able to restart it later is very cool and may get you some hot chicks, but I'm not sure it's really useful.  I think it makes more sense to do something like throttling an ongoing vacuum down to a reduced I/O rate if the maintenance window closes.  So if you're in the middle of a heap scan and the maintenance window closes, you immediately stop the scan and go to the index cleanup phase, *at a reduced I/O rate*.  This way, the user gets the benefits of the vacuuming in the not-too-distant future, without requiring the maintenance window to open again, but also without the heavy I/O impact that was allowed during the maintenance window.

Does this make sense?

--
Alvaro Herrera                         http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
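For illustration, a hedged, self-contained C sketch of the behaviour described above; the maintenance-window check, page counts, and delay values are made-up assumptions, not actual VACUUM code.  While the window is open the heap scan runs at full speed; once it closes, the rest of the scan is abandoned and the index cleanup for the dead tuples collected so far proceeds at a reduced I/O rate.

#include <stdio.h>
#include <stdbool.h>

static int	cost_delay_ms = 0;	/* stand-in for vacuum_cost_delay */
static int	pages_scanned = 0;

/* Pretend the maintenance window closes after 100 heap pages. */
static bool
maintenance_window_open(void)
{
	return pages_scanned < 100;
}

/* Pretend the table has 1000 pages; return false when the scan is done. */
static bool
scan_next_heap_page(void)
{
	return ++pages_scanned <= 1000;
}

int
main(void)
{
	while (scan_next_heap_page())
	{
		if (!maintenance_window_open())
		{
			/* window closed: abandon the rest of the heap scan */
			cost_delay_ms = 20;		/* reduced I/O rate from here on */
			break;
		}
	}

	/* index cleanup for the dead tuples collected so far would happen here */
	printf("scanned %d pages, cleaning indexes with %d ms cost delay\n",
		   pages_scanned, cost_delay_ms);
	return 0;
}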
Alvaro Herrera wrote:
> I don't have anything else as detailed as a "plan".  If you have
> suggestions, I'm all ears.

Cool, thanks for the update. :)  We also have some new ideas for improving autovacuum now; I will raise them later.

> Now regarding your restartable vacuum work.
> Does this make sense?

I have also reached a similar conclusion now.  Thank you.

Regards,
Galy
Alvaro Herrera <alvherre@commandprompt.com> writes:
> Now regarding your restartable vacuum work.  I think that stopping a
> vacuum at some point and being able to restart it later is very cool and
> may get you some hot chicks, but I'm not sure it's really useful.

Too true :-(

> I think it makes more sense to do something like throttling an ongoing
> vacuum down to a reduced I/O rate if the maintenance window closes.  So if
> you're in the middle of a heap scan and the maintenance window closes,
> you immediately stop the scan and go to the index cleanup phase, *at a
> reduced I/O rate*.

Er, why not just finish out the scan at the reduced I/O rate?  Any sort of "abort" behavior is going to create net inefficiency, e.g. doing an index scan to remove only a few tuples.  ISTM that the vacuum ought to just continue along its existing path at a slower I/O rate.

			regards, tom lane
Tom Lane wrote:
> Er, why not just finish out the scan at the reduced I/O rate?  Any sort

Sometimes you may need to vacuum a large table in the maintenance window and a hot table during service time.  If the vacuum of the hot table does not eat too much foreground resource, then you can vacuum the large table at a lower I/O rate outside the maintenance window; but if the vacuum of the hot table is eating too much of the system's resources, then the launcher had better stop the long-running vacuum outside the maintenance window.

But I am not insisting on the stop/restart feature at this moment.  Changing the cost delay dynamically sounds more reasonable.  We can use it to balance the total I/O of the workers during service time or maintenance time.  It is not so difficult to achieve this by leveraging the shared memory of autovacuum.

Best Regards,
Galy Lee
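As an illustration of dynamically adjusting the cost settings, here is a hedged, self-contained C sketch of splitting one global I/O budget among the active workers through a shared array; the slot structure, names, and numbers are assumptions for illustration, not actual autovacuum code.

#include <stdio.h>

#define MAX_WORKERS 8

typedef struct
{
	int			in_use;			/* slot occupied by a running worker */
	int			cost_limit;		/* per-worker share of the global vacuum cost limit */
} WorkerSlot;

/* Divide a global cost budget evenly among the active workers, so that
 * starting an extra worker automatically slows the others down. */
static void
balance_cost(WorkerSlot *slots, int total_cost_limit)
{
	int			active = 0;

	for (int i = 0; i < MAX_WORKERS; i++)
		if (slots[i].in_use)
			active++;

	if (active == 0)
		return;

	for (int i = 0; i < MAX_WORKERS; i++)
		if (slots[i].in_use)
			slots[i].cost_limit = total_cost_limit / active;
}

int
main(void)
{
	WorkerSlot	slots[MAX_WORKERS] = {{0}};

	slots[0].in_use = 1;	/* e.g. the long-running warehouse-table vacuum */
	slots[1].in_use = 1;	/* e.g. a hot-table vacuum that just started */

	balance_cost(slots, 200);	/* 200 = hypothetical global cost limit */

	for (int i = 0; i < MAX_WORKERS; i++)
		if (slots[i].in_use)
			printf("worker %d: cost_limit = %d\n", i, slots[i].cost_limit);
	return 0;
}

A real implementation would presumably also scale the cost delay and re-run the calculation whenever a worker starts or exits.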
On Mar 9, 2007, at 6:42 AM, Tom Lane wrote:
> Alvaro Herrera <alvherre@commandprompt.com> writes:
>> Now regarding your restartable vacuum work.  I think that stopping a
>> vacuum at some point and being able to restart it later is very cool and
>> may get you some hot chicks, but I'm not sure it's really useful.
>
> Too true :-(

Yeah.  Wouldn't a 'divide and conquer' kind of approach make it better?  I.e., let vacuum work on some part of a table/db, then stop, pick up another part later, vacuum it, and so on.

--
Grzegorz Jaskiewicz
gj@pointblue.com.pl
"Tom Lane" <tgl@sss.pgh.pa.us> writes: > Er, why not just finish out the scan at the reduced I/O rate? Any sort > of "abort" behavior is going to create net inefficiency, eg doing an > index scan to remove only a few tuples. ISTM that the vacuum ought to > just continue along its existing path at a slower I/O rate. I think the main motivation to abort a vacuum scan is so we can switch to some more urgent scan. So if in the middle of a 1-hour long vacuum of some big warehouse table we realize that a small hot table is long overdue for a vacuum we want to be able to remove the tuples we've found so far, switch to the hot table, and when we don't have more urgent tables to vacuum resume the large warehouse table vacuum. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com
Gregory Stark wrote:
> "Tom Lane" <tgl@sss.pgh.pa.us> writes:
>
> > Er, why not just finish out the scan at the reduced I/O rate?  Any sort
> > of "abort" behavior is going to create net inefficiency, e.g. doing an
> > index scan to remove only a few tuples.  ISTM that the vacuum ought to
> > just continue along its existing path at a slower I/O rate.
>
> I think the main motivation to abort a vacuum scan is so we can switch to some
> more urgent scan.  So if, in the middle of a one-hour vacuum of some big
> warehouse table, we realize that a small hot table is long overdue for a vacuum,
> we want to be able to remove the tuples we've found so far, switch to the hot
> table, and resume the large warehouse table vacuum once we have no more
> urgent tables to vacuum.

Why not just let another autovac worker do the hot table?

--
Alvaro Herrera                          http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support