Thread: Second attempt, roll your own autovacuum
Hi all, I am still trying to roll my own auto vacuum thingy. The goal is to vacuum on demand in one step just like the old days, but not hit the tables that never change (we have a lot). The idea now is to use a combination of SQL and shell scripts to duplicate some of what auto vacuum does. It actually doesn't seem that difficult. I have some SQL that produces a list of tables that need vacuuming based on statistics found in pg_stat_user_tables, and reltuples from pg_class, using the same basic rules as auto vacuum per the documentation. So far so good. The SQL actually produces an SQL script containing VACUUM commands, which I can then feed back into psql. The result is a HUGE savings in vacuum time at night. The trouble now is, I don't see how to reset the statistics. My assumption was that vacuum did it, but that appears to be false. How does autovacuum do it? Can I do it with SQL? -Glen
Glen Parker wrote: > The trouble now is, I don't see how to reset the statistics. My > assumption was that vacuum did it, but that appears to be false. How > does autovacuum do it? Can I do it with SQL? Huh, reset what statistics? Autovacuum does not reset anything. What statistics are you using? The number of dead tuples _should_ show as zero on the stat system after a vacuum, certainly. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Alvaro Herrera wrote: > Glen Parker wrote: > >> The trouble now is, I don't see how to reset the statistics. My >> assumption was that vacuum did it, but that appears to be false. How >> does autovacuum do it? Can I do it with SQL? > > Huh, reset what statistics? Autovacuum does not reset anything. What > statistics are you using? The number of dead tuples _should_ show as > zero on the stat system after a vacuum, certainly. pg_stat_user_tables.[n_tup_ins|n_tup_upd|n_tup_del]. Hmm maybe I'm doing this all wrong then. Is there a way to find the estimated dead tuples from SQL, the same number autovacuum looks at? -Glen
Glen Parker <glenebob@nwlink.com> writes: > I am still trying to roll my own auto vacuum thingy. Um, is this purely for hack value? What is it that you find inadequate about regular autovacuum? It is configurable through the pg_autovacuum catalog --- which I'd be the first to agree is a sucky user interface, but we're not going to set the user interface in concrete until we are pretty confident it's feature-complete. So: what do you see missing? regards, tom lane
On Tue, 2006-12-19 at 07:28, Tom Lane wrote: > Glen Parker <glenebob@nwlink.com> writes: > > I am still trying to roll my own auto vacuum thingy. > > Um, is this purely for hack value? What is it that you find inadequate > about regular autovacuum? It is configurable through the pg_autovacuum > catalog --- which I'd be the first to agree is a sucky user interface, > but we're not going to set the user interface in concrete until we are > pretty confident it's feature-complete. So: what do you see missing? I'm not sure what the OP had in mind, but the thing which is missing for us is a time window restriction sort of thing. What I mean is to make sure a vacuum will never kick in in the main business hours, but only at night at pre-specified hours, and only if the vacuum threshold was met for the delete/update counts. It would be nice if there could be a flexible time window specification, like specifying only some days, or only weekends, or each night some specific hours... but just one time window would be a big improvement already. Cheers, Csaba.
Glen Parker wrote: > Alvaro Herrera wrote: > >Glen Parker wrote: > > > >>The trouble now is, I don't see how to reset the statistics. My > >>assumption was that vacuum did it, but that appears to be false. How > >>does autovacuum do it? Can I do it with SQL? > > > >Huh, reset what statistics? Autovacuum does not reset anything. What > >statistics are you using? The number of dead tuples _should_ show as > >zero on the stat system after a vacuum, certainly. > > pg_stat_user_tables.[n_tup_ins|n_tup_upd|n_tup_del]. Hmm maybe I'm > doing this all wrong then. Is there a way to find the estimated dead > tuples from SQL, the same number autovacuum looks at? Hmm, I thought the number of dead tuples was being exposed in pg_stat_user_tables but evidently not. I think this is an oversight which we could "fix" in 8.3. (For a current release I guess you could install your own function, it shouldn't be too difficult to code it). -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
In an attempt to throw the authorities off his trail, tgl@sss.pgh.pa.us (Tom Lane) transmitted: > Glen Parker <glenebob@nwlink.com> writes: >> I am still trying to roll my own auto vacuum thingy. > > Um, is this purely for hack value? What is it that you find inadequate > about regular autovacuum? It is configurable through the pg_autovacuum > catalog --- which I'd be the first to agree is a sucky user interface, > but we're not going to set the user interface in concrete until we are > pretty confident it's feature-complete. So: what do you see missing? I think that about a year ago I proposed a more sophisticated approach to autovacuum; one part of it was to set up a "request queue," a table where vacuum requests would get added. There's some "producer" side stuff: - There could be tables you want to vacuum exceedingly frequently; those could get added periodically via something shaped like cron. - One could ask for all the tables in a given database to be added to the queue, so as to mean that all tables would get vacuumed every so often. - You might even inject requests 'quasi-manually', asking for the queue to do work on particular tables. There's some "policy side" stuff: - Rules might be put in place to eliminate certain tables from the queue, providing some intelligence as to what oughtn't get vacuumed Then there's the "consumer": - The obvious "dumb" approach is simply to have one connection that runs through the queue, pulling the eldest entry, vacuuming, and marking it done. - The obvious extension is that if a table is listed multiple times in the queue, it only need be processed once. - There might be time-based exclusions to the effect that large tables oughtn't be processed during certain periods (backup time?) - One might have *two* consumers, one that will only process small tables, so that those little, frequently updated tables can get handled quickly, and another consumer that does larger tables. Or perhaps that knows that it's fine, between 04:00 and 09:00 UTC, to have 6 consumers, and blow through a lot of larger tables simultaneously. After all, changes in 8.2 mean that concurrent vacuums don't block one another from cleaning out dead content. I went as far as scripting up the simplest form of this, with "injector" and queue and the "dumb consumer." Gave up because it wasn't that much better than what we already had. -- output = reverse("moc.liamg" "@" "enworbbc") http://linuxfinances.info/info/ Minds, like parachutes, only function when they are open.
From my POV, autovacuum is doing a very good job, with the exception of: > - There might be time-based exclusions to the effect that large tables > oughtn't be processed during certain periods (backup time?) Either (per table!) exception or permission based control of when a table can be vacuumed is needed to avoid vacuuming big tables during peek business periods. While this can be alleviated by setting lower vacuum cost settings, and it won't block anymore other vacuums, it will still need the multiple vacuum stuff to still process small tables: > - One might have *two* consumers, one that will only process small > tables, so that those little, frequently updated tables can get > handled quickly, and another consumer that does larger tables. > Or perhaps that knows that it's fine, between 04:00 and 09:00 UTC, > to have 6 consumers, and blow through a lot of larger tables > simultaneously. So one of the 2 might be enough. I guess time-based exclusion/permissions are not that easy to implement, and also not easy to set up properly... so what could work well is: - allow a "priority" setting per table in pg_autovacuum; - create a vacuum thread for each priority; - each thread checks it's own tables to be processed based on the priority setting from pg_autovacuum; - there have to be a default priority for tables not explicitly set up in pg_autovacuum; - possibly set a per priority default vacuum cost and delay; In 8.2 the different vacuum threads for the different priorities won't step on each other toes, and the default settings for the priorities can be used to create some easily manageable settings for vacuuming table categories with different update/delete patterns. There could be some preset priorities, but creating new ones would be useful so the user can create one per table update/delete pattern. Maybe priority is not the best word for this, but I can't think now on other better... Cheers, Csaba.
Csaba Nagy wrote: > >> - One might have *two* consumers, one that will only process small >> tables, so that those little, frequently updated tables can get >> handled quickly, and another consumer that does larger tables. >> Or perhaps that knows that it's fine, between 04:00 and 09:00 UTC, >> to have 6 consumers, and blow through a lot of larger tables >> simultaneously. > > So one of the 2 might be enough. I guess time-based > exclusion/permissions are not that easy to implement, and also not easy > to set up properly... so what could work well is: Alternatively, perhaps a threshold so that a table is only considered for vacuum if: (table-size * overall-activity-in-last-hour) < threshold Ideally you'd define your units appropriately so that you could just define threshold in postgresql.conf as 30% (of peak activity in last 100 hours say). -- Richard Huxton Archonet Ltd
> Alternatively, perhaps a threshold so that a table is only considered > for vacuum if: > (table-size * overall-activity-in-last-hour) < threshold > Ideally you'd define your units appropriately so that you could just > define threshold in postgresql.conf as 30% (of peak activity in last 100 > hours say). No, this is definitely not enough. The problem scenario is when autovacuum starts vacuuming a huge table and that keeps it busy 10 hours and in the meantime the small but frequently updated tables get awfully bloated... The only solution to that is to have multiple vacuums running in parallel, and it would be really nice if those multiple vacuums would be coordinated by autovacuum too... Cheers, Csaba.
Csaba Nagy wrote: > > Alternatively, perhaps a threshold so that a table is only considered > > for vacuum if: > > (table-size * overall-activity-in-last-hour) < threshold > > Ideally you'd define your units appropriately so that you could just > > define threshold in postgresql.conf as 30% (of peak activity in last 100 > > hours say). > > No, this is definitely not enough. The problem scenario is when > autovacuum starts vacuuming a huge table and that keeps it busy 10 hours > and in the meantime the small but frequently updated tables get awfully > bloated... > > The only solution to that is to have multiple vacuums running in > parallel, and it would be really nice if those multiple vacuums would be > coordinated by autovacuum too... Yes, I agree, having multiple "autovacuum workers" would be useful. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Csaba Nagy wrote: >> Alternatively, perhaps a threshold so that a table is only considered >> for vacuum if: >> (table-size * overall-activity-in-last-hour) < threshold >> Ideally you'd define your units appropriately so that you could just >> define threshold in postgresql.conf as 30% (of peak activity in last 100 >> hours say). > > No, this is definitely not enough. The problem scenario is when > autovacuum starts vacuuming a huge table and that keeps it busy 10 hours > and in the meantime the small but frequently updated tables get awfully > bloated... Ah (lightbulb goes on)! I see what you mean now. -- Richard Huxton Archonet Ltd
Alvaro Herrera wrote: > Csaba Nagy wrote: >>> Alternatively, perhaps a threshold so that a table is only considered >>> for vacuum if: >>> (table-size * overall-activity-in-last-hour) < threshold >>> Ideally you'd define your units appropriately so that you could just >>> define threshold in postgresql.conf as 30% (of peak activity in last 100 >>> hours say). >> No, this is definitely not enough. The problem scenario is when >> autovacuum starts vacuuming a huge table and that keeps it busy 10 hours >> and in the meantime the small but frequently updated tables get awfully >> bloated... >> >> The only solution to that is to have multiple vacuums running in >> parallel, and it would be really nice if those multiple vacuums would be >> coordinated by autovacuum too... > > Yes, I agree, having multiple "autovacuum workers" would be useful. Bruce, I think there are a couple of items here that might be worth adding to the TODO list. 1) Allow multiple "autovacuum workers": Currently Autovacuum is only capable of ordering one vacuum command at a time, for most work loads this is sufficient but falls down when a hot (very actively updated table) goes unvacuumed for a long period of time because a large table is currently being worked on. 2) Once we can have multiple autovacuum workers: Create the concept of hot tables that require more attention and should never be ignored for more that X minutes, perhaps have one "autovacuum worker" per hot table? (What do people think of this?) 3) Create "Maintenance Windows" for autovacuum: Currently autovacuum makes all of it's decisions based on a single per-table threshold value, maintenance windows would allow the setting of a per-window, per-table threshold. This makes it possible to, for example, forbid (or strongly discourage) autovacuum from doing maintenance work during normal business hours either for the entire system or for specific tables. None of those three items are on the todo list, however I think there is general consensus that they (at least 1 & 3) are good ideas.
Yes, I think there are these TODO items. I was waiting to see what additional replies there are before adding them. --------------------------------------------------------------------------- Matthew O'Connor wrote: > Alvaro Herrera wrote: > > Csaba Nagy wrote: > >>> Alternatively, perhaps a threshold so that a table is only considered > >>> for vacuum if: > >>> (table-size * overall-activity-in-last-hour) < threshold > >>> Ideally you'd define your units appropriately so that you could just > >>> define threshold in postgresql.conf as 30% (of peak activity in last 100 > >>> hours say). > >> No, this is definitely not enough. The problem scenario is when > >> autovacuum starts vacuuming a huge table and that keeps it busy 10 hours > >> and in the meantime the small but frequently updated tables get awfully > >> bloated... > >> > >> The only solution to that is to have multiple vacuums running in > >> parallel, and it would be really nice if those multiple vacuums would be > >> coordinated by autovacuum too... > > > > Yes, I agree, having multiple "autovacuum workers" would be useful. > > Bruce, I think there are a couple of items here that might be worth > adding to the TODO list. > > 1) Allow multiple "autovacuum workers": Currently Autovacuum is only > capable of ordering one vacuum command at a time, for most work loads > this is sufficient but falls down when a hot (very actively updated > table) goes unvacuumed for a long period of time because a large table > is currently being worked on. > > 2) Once we can have multiple autovacuum workers: Create the concept of > hot tables that require more attention and should never be ignored for > more that X minutes, perhaps have one "autovacuum worker" per hot table? > (What do people think of this?) > > 3) Create "Maintenance Windows" for autovacuum: Currently autovacuum > makes all of it's decisions based on a single per-table threshold value, > maintenance windows would allow the setting of a per-window, per-table > threshold. This makes it possible to, for example, forbid (or strongly > discourage) autovacuum from doing maintenance work during normal > business hours either for the entire system or for specific tables. > > None of those three items are on the todo list, however I think there is > general consensus that they (at least 1 & 3) are good ideas. > > > ---------------------------(end of broadcast)--------------------------- > TIP 4: Have you searched our list archives? > > http://archives.postgresql.org/ -- Bruce Momjian bruce@momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Bruce Momjian wrote: > Yes, I think there are these TODO items. I was waiting to see what > additional replies there are before adding them. Speaking of which, I was just looking at the TODO at: http://www.postgresql.org/docs/faqs.TODO.html and I think this item: * Improve xid wraparound detection by recording per-table rather than per-database is done and working in 8.2 no?
matthew@zeut.net ("Matthew O'Connor") writes: > 2) Once we can have multiple autovacuum workers: Create the concept of > hot tables that require more attention and should never be ignored for > more that X minutes, perhaps have one "autovacuum worker" per hot > table? (What do people think of this?) One worker per "hot table" seems like overkill to me; you could chew up a lot of connections that way, which could be a DOS. That you have a "foot gun" is guaranteed; I think I'd rather that it come in the form that choosing the "hot list" badly hurts the rate of vacuuming than that we have a potential to chew up numbers of connections (which is a relatively non-renewable resource). -- (format nil "~S@~S" "cbbrowne" "cbbrowne.com") http://linuxdatabases.info/info/ There are no "civil aviation for dummies" books out there and most of you would probably be scared and spend a lot of your time looking up if there was one. :-) -- Jordan Hubbard in c.u.b.f.m
Tom Lane wrote: > Glen Parker <glenebob@nwlink.com> writes: >> I am still trying to roll my own auto vacuum thingy. > > Um, is this purely for hack value? Don't be silly ;-) Honestly I sort of thought the problem was fairly obvious. > What is it that you find inadequate > about regular autovacuum? It is configurable through the pg_autovacuum > catalog --- which I'd be the first to agree is a sucky user interface, > but we're not going to set the user interface in concrete until we are > pretty confident it's feature-complete. So: what do you see missing? Traditional vacuum does every table in the DB, which is absolutely The Wrong Thing for us. Vacuum can be fired against individual tables, but then how do I know which tables need it? Autovacuum is smart about which tables it hits, but exceedingly stupid about *when* it hits them. What I want is a way to do all needed vacuuming, in as short a time span as possible, when I decide it should be done. For us, that's between ~2 AM and ~3 AM each morning. If a vacuum runs past 3 AM, so be it, but it's better to hit it hard and try to be done by 3 AM than it is to lolly gag around about it unil 5 AM. The obvious answer for me is to vacuum all the tables that autovacuum would hit, but only on demand. Something like "VACUUM CONDITIONAL WHERE autovacuum_says_so()" :-) -Glen
matthew@zeut.net ("Matthew O'Connor") writes: > Bruce Momjian wrote: >> Yes, I think there are these TODO items. I was waiting to see what >> additional replies there are before adding them. > > Speaking of which, I was just looking at the TODO at: > > http://www.postgresql.org/docs/faqs.TODO.html > > and I think this item: > > * Improve xid wraparound detection by recording per-table rather than > per-database > > is done and working in 8.2 no? That's in the 8.2 release notes: - Track maximum XID age within individual tables, instead of whole databases (Alvaro) This reduces the overhead involved in preventing transaction ID wraparound, by avoiding unnecessary VACUUMs. -- output = reverse("ofni.sesabatadxunil" "@" "enworbbc") http://www3.sympatico.ca/cbbrowne/linux.html The human race will decree from time to time: "There is something at which it is absolutely forbidden to laugh." -- Nietzche on Common Lisp
Glen Parker wrote: > Tom Lane wrote: >> What is it that you find inadequate >> about regular autovacuum? It is configurable through the pg_autovacuum >> catalog --- which I'd be the first to agree is a sucky user interface, >> but we're not going to set the user interface in concrete until we are >> pretty confident it's feature-complete. So: what do you see missing? > > Traditional vacuum does every table in the DB, which is absolutely The > Wrong Thing for us. Vacuum can be fired against individual tables, but > then how do I know which tables need it? Autovacuum is smart about > which tables it hits, but exceedingly stupid about *when* it hits them. > > What I want is a way to do all needed vacuuming, in as short a time span > as possible, when I decide it should be done. For us, that's between ~2 > AM and ~3 AM each morning. If a vacuum runs past 3 AM, so be it, but > it's better to hit it hard and try to be done by 3 AM than it is to > lolly gag around about it unil 5 AM. > > The obvious answer for me is to vacuum all the tables that autovacuum > would hit, but only on demand. Something like "VACUUM CONDITIONAL WHERE > autovacuum_says_so()" :-) I believe the correct answer to this problems is to write a cron script that enables autovacuum at 2AM and disables it at 3AM. I think there is some talk of this in the archives somewhere. If you need to hit specific tables more often than that, then you can have another cron script that vacuums a table ever hour or something or something along those lines.
Chris Browne wrote: > matthew@zeut.net ("Matthew O'Connor") writes: > > > Bruce Momjian wrote: > >> Yes, I think there are these TODO items. I was waiting to see what > >> additional replies there are before adding them. > > > > Speaking of which, I was just looking at the TODO at: > > > > http://www.postgresql.org/docs/faqs.TODO.html > > > > and I think this item: > > > > * Improve xid wraparound detection by recording per-table rather than > > per-database > > > > is done and working in 8.2 no? > > That's in the 8.2 release notes: > > - Track maximum XID age within individual tables, instead of whole > databases (Alvaro) > > This reduces the overhead involved in preventing transaction ID > wraparound, by avoiding unnecessary VACUUMs. Yeah, this is what the TODO item was about, so it certainly is done. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
> 1) Allow multiple "autovacuum workers": Currently Autovacuum is only > capable of ordering one vacuum command at a time, for most work loads > this is sufficient but falls down when a hot (very actively updated > table) goes unvacuumed for a long period of time because a large table > is currently being worked on. > > 2) Once we can have multiple autovacuum workers: Create the concept of > hot tables that require more attention and should never be ignored for > more that X minutes, perhaps have one "autovacuum worker" per hot table? > (What do people think of this?) > > 3) Create "Maintenance Windows" for autovacuum: Currently autovacuum > makes all of it's decisions based on a single per-table threshold value, > maintenance windows would allow the setting of a per-window, per-table > threshold. This makes it possible to, for example, forbid (or strongly > discourage) autovacuum from doing maintenance work during normal > business hours either for the entire system or for specific tables. > > None of those three items are on the todo list, however I think there is > general consensus that they (at least 1 & 3) are good ideas. If it isn't there somewhere already, I would ask to add: 4) Expose all information used by autovacuum to form its decisions. 5) Expose a very easy way to discover autovacuum's opinion about a particular table, for example "table_needs_vacuum(oid)", ignoring any time constraints that may be in place. -Glen
Glen Parker wrote: > If it isn't there somewhere already, I would ask to add: > > 4) Expose all information used by autovacuum to form its decisions. You could argue that this is already there, although not easy to get at I suppose. But all table threshold settings are available either in the pg_autovacuum relation or the defaults via GUC variables, that plus a little math will get the information autovacuum uses to form its decisions. > 5) Expose a very easy way to discover autovacuum's opinion about a > particular table, for example "table_needs_vacuum(oid)", ignoring any > time constraints that may be in place. This might be a nice feature however in the presence of the much talked about but not yet developed maintenance window concept, I'm not sure how this should work. That is, during business hours the table doesn't need vacuuming, but it will when the evening maintenance window opens up.
Matthew O'Connor wrote: > Glen Parker wrote: > >If it isn't there somewhere already, I would ask to add: > > > >4) Expose all information used by autovacuum to form its decisions. > > You could argue that this is already there, although not easy to get at > I suppose. But all table threshold settings are available either in the > pg_autovacuum relation or the defaults via GUC variables, that plus a > little math will get the information autovacuum uses to form its decisions. No, we currently don't expose the number of dead tuples which autovacuum uses. > >5) Expose a very easy way to discover autovacuum's opinion about a > >particular table, for example "table_needs_vacuum(oid)", ignoring any > >time constraints that may be in place. > > This might be a nice feature however in the presence of the much talked > about but not yet developed maintenance window concept, I'm not sure how > this should work. That is, during business hours the table doesn't > need vacuuming, but it will when the evening maintenance window opens up. I intend to work on the maintenance window idea for 8.3. I'm not sure if I'll be able to introduce the worker process stuff in there as well. I actually haven't done much design on the stuff so I can't say. Now, if you (Matthew, or Glen as well!) were to work on that it'll be appreciated ;-) and we could team up. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Alvaro Herrera wrote: > Chris Browne wrote: > > matthew@zeut.net ("Matthew O'Connor") writes: > > > > > Bruce Momjian wrote: > > >> Yes, I think there are these TODO items. I was waiting to see what > > >> additional replies there are before adding them. > > > > > > Speaking of which, I was just looking at the TODO at: > > > > > > http://www.postgresql.org/docs/faqs.TODO.html > > > > > > and I think this item: > > > > > > * Improve xid wraparound detection by recording per-table rather than > > > per-database > > > > > > is done and working in 8.2 no? > > > > That's in the 8.2 release notes: > > > > - Track maximum XID age within individual tables, instead of whole > > databases (Alvaro) > > > > This reduces the overhead involved in preventing transaction ID > > wraparound, by avoiding unnecessary VACUUMs. > > Yeah, this is what the TODO item was about, so it certainly is done. OK, item removed. Thanks. -- Bruce Momjian bruce@momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Matthew O'Connor wrote: > Glen Parker wrote: >> If it isn't there somewhere already, I would ask to add: >> Expose a very easy way to discover autovacuum's opinion about a >> particular table, for example "table_needs_vacuum(oid)", ignoring any >> time constraints that may be in place. > > This might be a nice feature however in the presence of the much > talked about but not yet developed maintenance window concept, I'm not > sure how this should work. That is, during business hours the table > doesn't need vacuuming, but it will when the evening maintenance > window opens up. Well, what he's saying is, "Not taking into account any time/maintenance windows, does this table need vacuuming?" -- erik jones <erik@myemma.com> software development emma(r)
Erik Jones wrote: > Matthew O'Connor wrote: >> Glen Parker wrote: >>> If it isn't there somewhere already, I would ask to add: >>> Expose a very easy way to discover autovacuum's opinion about a >>> particular table, for example "table_needs_vacuum(oid)", ignoring any >>> time constraints that may be in place. >> >> This might be a nice feature however in the presence of the much >> talked about but not yet developed maintenance window concept, I'm not >> sure how this should work. That is, during business hours the table >> doesn't need vacuuming, but it will when the evening maintenance >> window opens up. > > Well, what he's saying is, "Not taking into account any time/maintenance > windows, does this table need vacuuming?" Correct. IOW, "does it need it?", not "would you actually do it at this time?"... -Glen
Alvaro Herrera wrote: >>> 4) Expose all information used by autovacuum to form its decisions. >> You could argue that this is already there, although not easy to get at >> I suppose. But all table threshold settings are available either in the >> pg_autovacuum relation or the defaults via GUC variables, that plus a >> little math will get the information autovacuum uses to form its decisions. > > No, we currently don't expose the number of dead tuples which autovacuum > uses. I'd prefer to get this working somehow before 8.3. In the mean time, is this information available at all? I assume a c function could get it, right? Any easier way? -Glen
Glen Parker wrote: > Alvaro Herrera wrote: > >>>4) Expose all information used by autovacuum to form its decisions. > >>You could argue that this is already there, although not easy to get at > >>I suppose. But all table threshold settings are available either in the > >>pg_autovacuum relation or the defaults via GUC variables, that plus a > >>little math will get the information autovacuum uses to form its > >>decisions. > > > >No, we currently don't expose the number of dead tuples which autovacuum > >uses. > > I'd prefer to get this working somehow before 8.3. In the mean time, is > this information available at all? I assume a c function could get it, > right? Any easier way? A C function would do. I don't think anything else would because we don't expose it at the SQL level. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Alvaro Herrera wrote: > Matthew O'Connor wrote: >> Glen Parker wrote: >>> 5) Expose a very easy way to discover autovacuum's opinion about a >>> particular table, for example "table_needs_vacuum(oid)", ignoring any >>> time constraints that may be in place. >> This might be a nice feature however in the presence of the much talked >> about but not yet developed maintenance window concept, I'm not sure how >> this should work. That is, during business hours the table doesn't >> need vacuuming, but it will when the evening maintenance window opens up. > > I intend to work on the maintenance window idea for 8.3. I'm not sure > if I'll be able to introduce the worker process stuff in there as well. > I actually haven't done much design on the stuff so I can't say. > > Now, if you (Matthew, or Glen as well!) were to work on that it'll be > appreciated ;-) and we could team up. I would like to get back into working on autovacuum, outside of discussions on the lists I haven't done anything since you took it from contrib. Anyway I am interested in helping if I can find the time but there is no chance of that happening in the next few weeks, but maybe in January. I think another thing to consider are good default values for the autovacuum vacuum delay settings. We talked about this a while ago, but I don't think we ever settled on anything.
Glen Parker wrote: > Erik Jones wrote: >> Matthew O'Connor wrote: >>> Glen Parker wrote: >>>> If it isn't there somewhere already, I would ask to add: >>>> Expose a very easy way to discover autovacuum's opinion about a >>>> particular table, for example "table_needs_vacuum(oid)", ignoring >>>> any time constraints that may be in place. >>> >>> This might be a nice feature however in the presence of the much >>> talked about but not yet developed maintenance window concept, I'm >>> not sure how this should work. That is, during business hours the >>> table doesn't need vacuuming, but it will when the evening >>> maintenance window opens up. > > >> Well, what he's saying is, "Not taking into account any >> time/maintenance windows, does this table need vacuuming?" > > Correct. IOW, "does it need it?", not "would you actually do it at this > time?"... I understand that, but it's a subjective question. The only question autovacuum answers is "Am I going to vacuum this table now?", so in the current setup you probably could create a function that answers your question, I was just pointing out in the future when maintenance windows get implemented that this question becomes less clear.
You're saying that the dirtyness of a table is proportional to when you plan on vacuuming it next. I don't see that connection at all. The only correlation I might see is if it happens to be 5:59 AM when your DB decides your table is dirty, and your maintenance window closes at 6:00 AM. Then you have to program the maintenance window to gracefully unplug the vacuum. Currently, autovacuum runs every minute and checks to see if any tables meet the requirements for vacuuming. Are the requirements the amount of time a vacuum would take, or the raw number of dirty tuples? One might be a function of the other, for sure, but exactly what does the autovacuumer use to decide when to clean? -- Brandon Aiken CS/IT Systems Engineer -----Original Message----- From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Matthew O'Connor Sent: Tuesday, December 19, 2006 5:37 PM To: Glen Parker Cc: Postgres general mailing list Subject: Re: [GENERAL] Autovacuum Improvements Glen Parker wrote: > Erik Jones wrote: >> Matthew O'Connor wrote: >>> Glen Parker wrote: >>>> If it isn't there somewhere already, I would ask to add: >>>> Expose a very easy way to discover autovacuum's opinion about a >>>> particular table, for example "table_needs_vacuum(oid)", ignoring >>>> any time constraints that may be in place. >>> >>> This might be a nice feature however in the presence of the much >>> talked about but not yet developed maintenance window concept, I'm >>> not sure how this should work. That is, during business hours the >>> table doesn't need vacuuming, but it will when the evening >>> maintenance window opens up. > > >> Well, what he's saying is, "Not taking into account any >> time/maintenance windows, does this table need vacuuming?" > > Correct. IOW, "does it need it?", not "would you actually do it at this > time?"... I understand that, but it's a subjective question. The only question autovacuum answers is "Am I going to vacuum this table now?", so in the current setup you probably could create a function that answers your question, I was just pointing out in the future when maintenance windows get implemented that this question becomes less clear. ---------------------------(end of broadcast)--------------------------- TIP 5: don't forget to increase your free space map settings
No, how dirty a table isn't subjective, what is subjective is the question "Does it need to be vacuumed?". A that is 1% dirty (to use your term) probably doesn't *need* to be vacuumed, but you might choose to vacuum it anyway at least you might at night when the system isn't in use. See the docs for the current requirements for autovacuum to issue a vacuum command. Matt Brandon Aiken wrote: > You're saying that the dirtyness of a table is proportional to when you > plan on vacuuming it next. I don't see that connection at all. The > only correlation I might see is if it happens to be 5:59 AM when your DB > decides your table is dirty, and your maintenance window closes at 6:00 > AM. Then you have to program the maintenance window to gracefully > unplug the vacuum. > > Currently, autovacuum runs every minute and checks to see if any tables > meet the requirements for vacuuming. Are the requirements the amount of > time a vacuum would take, or the raw number of dirty tuples? One might > be a function of the other, for sure, but exactly what does the > autovacuumer use to decide when to clean? > > -- > Brandon Aiken > CS/IT Systems Engineer > -----Original Message----- > From: pgsql-general-owner@postgresql.org > [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Matthew > O'Connor > Sent: Tuesday, December 19, 2006 5:37 PM > To: Glen Parker > Cc: Postgres general mailing list > Subject: Re: [GENERAL] Autovacuum Improvements > > Glen Parker wrote: >> Erik Jones wrote: >>> Matthew O'Connor wrote: >>>> Glen Parker wrote: >>>>> If it isn't there somewhere already, I would ask to add: >>>>> Expose a very easy way to discover autovacuum's opinion about a >>>>> particular table, for example "table_needs_vacuum(oid)", ignoring >>>>> any time constraints that may be in place. >>>> This might be a nice feature however in the presence of the much >>>> talked about but not yet developed maintenance window concept, I'm >>>> not sure how this should work. That is, during business hours the >>>> table doesn't need vacuuming, but it will when the evening >>>> maintenance window opens up. >> > >>> Well, what he's saying is, "Not taking into account any >>> time/maintenance windows, does this table need vacuuming?" >> Correct. IOW, "does it need it?", not "would you actually do it at > this >> time?"... > > I understand that, but it's a subjective question. The only question > autovacuum answers is "Am I going to vacuum this table now?", so in the > current setup you probably could create a function that answers your > question, I was just pointing out in the future when maintenance windows > > get implemented that this question becomes less clear. > > ---------------------------(end of broadcast)--------------------------- > TIP 5: don't forget to increase your free space map settings > > ---------------------------(end of broadcast)--------------------------- > TIP 5: don't forget to increase your free space map settings >
Matthew O'Connor wrote: > No, how dirty a table isn't subjective, what is subjective is the > question "Does it need to be vacuumed?". A that is 1% dirty (to use > your term) probably doesn't *need* to be vacuumed, but you might choose > to vacuum it anyway at least you might at night when the system isn't in > use. This leads me further from wanting to see a simple time contraint added. I'd like to see something more dynamic. Perhaps define a "dirtiness" rating, and then allow a minimum "dirtiness" to be configured. When autovacuum wakes up, it could build a list of sufficiently dirty tables sorted in "dirtiness" order, and could call an optional user defined function for each one, passing it useful bits of information including each table's "dirtiness". The function could then decide whether to vacuum or not based on whatever constraints the admin dreamed up. It would then be a simple matter to expose a function that, given a table's OID, could report its "dirtiness" level. I think that explanation leaves room for refinement, but hopefully the idea makes sense :-) -Glen
Glen Parker wrote: > Matthew O'Connor wrote: >> No, how dirty a table isn't subjective, what is subjective is the >> question "Does it need to be vacuumed?". A that is 1% dirty (to use >> your term) probably doesn't *need* to be vacuumed, but you might >> choose to vacuum it anyway at least you might at night when the system >> isn't in use. > > This leads me further from wanting to see a simple time contraint added. > I'd like to see something more dynamic. > > Perhaps define a "dirtiness" rating, and then allow a minimum > "dirtiness" to be configured. When autovacuum wakes up, it could build > a list of sufficiently dirty tables sorted in "dirtiness" order, and > could call an optional user defined function for each one, passing it > useful bits of information including each table's "dirtiness". The > function could then decide whether to vacuum or not based on whatever > constraints the admin dreamed up. > > It would then be a simple matter to expose a function that, given a > table's OID, could report its "dirtiness" level. The idea that has been discussed in the past is the concept of maintenance windows, that is for any given period of time, you can set different vacuum thresholds. So at night you might make the thresholds very low so that nearly everything gets vacuumed but during the day you might only vacuum when something really needs it. This accomplishes what you are asking for in a more general way that can accommodate a wide variety of usage patterns.
Matthew O'Connor wrote: > Glen Parker wrote: > > Matthew O'Connor wrote: > >> No, how dirty a table isn't subjective, what is subjective is the > >> question "Does it need to be vacuumed?". A that is 1% dirty (to use > >> your term) probably doesn't *need* to be vacuumed, but you might > >> choose to vacuum it anyway at least you might at night when the system > >> isn't in use. > > > > This leads me further from wanting to see a simple time contraint added. > > I'd like to see something more dynamic. > > > > Perhaps define a "dirtiness" rating, and then allow a minimum > > "dirtiness" to be configured. When autovacuum wakes up, it could build > > a list of sufficiently dirty tables sorted in "dirtiness" order, and > > could call an optional user defined function for each one, passing it > > useful bits of information including each table's "dirtiness". The > > function could then decide whether to vacuum or not based on whatever > > constraints the admin dreamed up. > > > > It would then be a simple matter to expose a function that, given a > > table's OID, could report its "dirtiness" level. > > The idea that has been discussed in the past is the concept of > maintenance windows, that is for any given period of time, you can set > different vacuum thresholds. So at night you might make the thresholds > very low so that nearly everything gets vacuumed but during the day you > might only vacuum when something really needs it. This accomplishes > what you are asking for in a more general way that can accommodate a > wide variety of usage patterns. I wonder if the simple solution is to just have a cron script modify postgresql.conf and pg_ctl reload. That seems very flexible, or have two postgresql.conf files and move them into place via cron. -- Bruce Momjian bruce@momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
> I wonder if the simple solution is to just have a cron script modify > postgresql.conf and pg_ctl reload. That seems very flexible, or have > two postgresql.conf files and move them into place via cron. I'd still prefer to vacuum on demand actually. Rather than hope that autovacuum hit all the nastiest tables, I'd like to be able to record the fact that tables (x,y,z) were vacuumed and how long it took. I want the logic autovacuum uses to determine if a table needs vacuuming, but I'd rather do the actual vacuuming myself. I'd also like to use some of this information to issue reindex and cluster commands only when they're needed. In fact, on days when I cluster, there's no need whatsoever to also vacuum those tables. This is something that autovacuum won't do at all. If the best I got was access to the same information autovacuum uses to make its decisions, I'd be pretty happy. -Glen
> The idea that has been discussed in the past is the concept of > maintenance windows, that is for any given period of time, you can set > different vacuum thresholds. So at night you might make the thresholds > very low so that nearly everything gets vacuumed but during the day you > might only vacuum when something really needs it. This accomplishes > what you are asking for in a more general way that can accommodate a > wide variety of usage patterns. That really seems like something that, if it's powerful, would also be very complicated. If the autovacuum system could just call a user defined function, all the complexity could be dropped back into the admin's lap (which is fine with me :-). I am of course ASSuming that it would be simple for autovacuum to call a function if it exists, or take a default action if it does not... -Glen
Glen Parker wrote: > >The idea that has been discussed in the past is the concept of > >maintenance windows, that is for any given period of time, you can set > >different vacuum thresholds. So at night you might make the thresholds > >very low so that nearly everything gets vacuumed but during the day you > >might only vacuum when something really needs it. This accomplishes > >what you are asking for in a more general way that can accommodate a > >wide variety of usage patterns. > > That really seems like something that, if it's powerful, would also be > very complicated. If the autovacuum system could just call a user > defined function, all the complexity could be dropped back into the > admin's lap (which is fine with me :-). I have a quote by Larry Wall about something similar: "In fact, the basic problem with Perl 5's subroutines is that they're not crufty enough, so the cruft leaks out into user-defined code instead, by the Conservation of Cruft Principle." (Larry Wall, Apocalypse 6) With the system described above, you can have it very simple by just not configuring anything. Or you can have a very complex scenario involving holidays and weekends and off-hours and "the two hours of the month when you do all the nasty stuff" by doing a very elaborate and complicated setup. Or you could have a middle ground just defining "off hours" (weekends and nights) which would be just a couple of commands. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Glen Parker wrote: >> I wonder if the simple solution is to just have a cron script modify >> postgresql.conf and pg_ctl reload. That seems very flexible, or have >> two postgresql.conf files and move them into place via cron. > > I'd still prefer to vacuum on demand actually. Rather than hope that > autovacuum hit all the nastiest tables, I'd like to be able to record > the fact that tables (x,y,z) were vacuumed and how long it took. I want > the logic autovacuum uses to determine if a table needs vacuuming, but > I'd rather do the actual vacuuming myself. > > I'd also like to use some of this information to issue reindex and > cluster commands only when they're needed. In fact, on days when I > cluster, there's no need whatsoever to also vacuum those tables. This > is something that autovacuum won't do at all. > > If the best I got was access to the same information autovacuum uses to > make its decisions, I'd be pretty happy Well take a look at the autovacuum code, also, before autovacuum was integrated into core it existed as a libpq based contrib application so you can look in one of the older branches for that code.
Alvaro Herrera wrote: > Glen Parker wrote: >> That really seems like something that, if it's powerful, would also be >> very complicated. If the autovacuum system could just call a user >> defined function, all the complexity could be dropped back into the >> admin's lap (which is fine with me :-). > > I have a quote by Larry Wall about something similar: > > "In fact, the basic problem with Perl 5's subroutines is that they're > not crufty enough, so the cruft leaks out into user-defined code > instead, by the Conservation of Cruft Principle." > (Larry Wall, Apocalypse 6) > > With the system described above, you can have it very simple by just not > configuring anything. Or you can have a very complex scenario involving > holidays and weekends and off-hours and "the two hours of the month when > you do all the nasty stuff" by doing a very elaborate and complicated > setup. Or you could have a middle ground just defining "off hours" > (weekends and nights) which would be just a couple of commands. I would go one step further and suggest that when the maintenance window system gets completed we give it a default setup of midnight to 6AM or something like that.
Glen Parker wrote: > I'd also like to use some of this information to issue reindex and > cluster commands only when they're needed. In fact, on days when I > cluster, there's no need whatsoever to also vacuum those tables. This > is something that autovacuum won't do at all. Well, I'd rather fix CLUSTER so that it reports that there are currently no dead tuples in the table. This should be an easy patch that, hey, maybe you could contribute ;-) -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Alvaro Herrera wrote: > Glen Parker wrote: > >> I'd also like to use some of this information to issue reindex and >> cluster commands only when they're needed. In fact, on days when I >> cluster, there's no need whatsoever to also vacuum those tables. This >> is something that autovacuum won't do at all. > > Well, I'd rather fix CLUSTER so that it reports that there are currently > no dead tuples in the table. This should be an easy patch that, hey, > maybe you could contribute ;-) I knew that would come up eventually :D. Believe me, I'd love to dig in, but I haven't had a postgres dev tree on my system in years and I don't have any time for it right now. (But there's always time to banter in the mailing list, right? :-) ) Actually what I meant was that autovacuum will always just vacuum. In some cases I'd rather it cluster instead. But on that note, it would also be madly useful to give CLUSTER the ability to ANALYZE as well. -Glen
Brandon Aiken wrote: > You're saying that the dirtyness of a table is proportional to when you > plan on vacuuming it next. The dirtiness of a table should most certainly have an effect on when it gets vacuumed in relation to other tables. If dirtiness could be rated, then the list of vacuumable tables could be sorted, vacuuming really dirty tables before less dirty ones. Now, if I could get my hands on that rating for any given table, then I could write a night time script that would vacuum the dirtiest tables, in order, until either I run out of dirty tables, or I run out of time. In fact, if autovacuum did just that, then I might be inclined to attack the problem with the "update postgresql.conf, pgctl" approach. At least then I'd know that even though ALL the dirty tables might not get cleaned every night, at least the worst ones would. -Glen
Glen Parker wrote: > Alvaro Herrera wrote: > >Glen Parker wrote: > > > >>I'd also like to use some of this information to issue reindex and > >>cluster commands only when they're needed. In fact, on days when I > >>cluster, there's no need whatsoever to also vacuum those tables. This > >>is something that autovacuum won't do at all. > > > >Well, I'd rather fix CLUSTER so that it reports that there are currently > >no dead tuples in the table. This should be an easy patch that, hey, > >maybe you could contribute ;-) > > I knew that would come up eventually :D. Believe me, I'd love to dig > in, but I haven't had a postgres dev tree on my system in years and I > don't have any time for it right now. (But there's always time to > banter in the mailing list, right? :-) ) Heh :-) > Actually what I meant was that autovacuum will always just vacuum. In > some cases I'd rather it cluster instead. But on that note, it would > also be madly useful to give CLUSTER the ability to ANALYZE as well. Well, the problem is that CLUSTER takes an exclusive lock. That's a no-no for an automatic job I think. On doing an ANALYZE in conjunction with the CLUSTER, that's would be certainly interesting, but it'd require that we refactor ANALYZE so as to be able to "reuse" another scan. It would be nice to do it anyway and allow it to reuse VACUUM's scan as well, so that it doesn't have to do a separate one. (And it would also be nicer because it'd would be able to determine n_distinct more accurately, which is something people often wish for.) -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
On 19/12/06, Chris Browne <cbbrowne@acm.org> wrote: > matthew@zeut.net ("Matthew O'Connor") writes: > > 2) Once we can have multiple autovacuum workers: Create the concept of > > hot tables that require more attention and should never be ignored for > > more that X minutes, perhaps have one "autovacuum worker" per hot > > table? (What do people think of this?) > > One worker per "hot table" seems like overkill to me; you could chew > up a lot of connections that way, which could be a DOS. Sounds like a max workers config varible would work quite well here. Bit like the max connections varible. If we run out of workers we just have to wait for one to finish. I think we need one daemon to analyse what needs vacuuming and then lauch workers to do the actual work.. Peter Childs > > That you have a "foot gun" is guaranteed; I think I'd rather that it > come in the form that choosing the "hot list" badly hurts the rate of > vacuuming than that we have a potential to chew up numbers of > connections (which is a relatively non-renewable resource). > -- > (format nil "~S@~S" "cbbrowne" "cbbrowne.com") > http://linuxdatabases.info/info/ > There are no "civil aviation for dummies" books out there and most of > you would probably be scared and spend a lot of your time looking up > if there was one. :-) -- Jordan Hubbard in c.u.b.f.m > > ---------------------------(end of broadcast)--------------------------- > TIP 5: don't forget to increase your free space map settings >
Glen Parker wrote: > Brandon Aiken wrote: >> You're saying that the dirtyness of a table is proportional to when you >> plan on vacuuming it next. > > The dirtiness of a table should most certainly have an effect on when it > gets vacuumed in relation to other tables. If dirtiness could be rated, > then the list of vacuumable tables could be sorted, vacuuming really > dirty tables before less dirty ones. Wouldn't it be better to prefer small(er) less dirty tables over large to huge dirty tables? They'd be done quickly, and as it's inside the "maintenance window" there is apparently time to. It'd probably need some logic as to how many of those small tables get vacuumed first (the total amount should be significantly less than the big table(s)). My rationale: if the really dirty huge table doesn't get finished in time anyway, autovacuum might as well spend some time on other tables. The net performance after the maintenance window will probably be better. -- Alban Hertroys alban@magproductions.nl magproductions b.v. T: ++31(0)534346874 F: ++31(0)534346876 M: I: www.magproductions.nl A: Postbus 416 7500 AK Enschede // Integrate Your World //
Alvaro Herrera wrote: > Matthew O'Connor wrote: > >> Glen Parker wrote: >> >>> If it isn't there somewhere already, I would ask to add: >>> >>> 4) Expose all information used by autovacuum to form its decisions. >>> >> You could argue that this is already there, although not easy to get at >> I suppose. But all table threshold settings are available either in the >> pg_autovacuum relation or the defaults via GUC variables, that plus a >> little math will get the information autovacuum uses to form its decisions. >> > > No, we currently don't expose the number of dead tuples which autovacuum > uses. > > >>> 5) Expose a very easy way to discover autovacuum's opinion about a >>> particular table, for example "table_needs_vacuum(oid)", ignoring any >>> time constraints that may be in place. >>> >> This might be a nice feature however in the presence of the much talked >> about but not yet developed maintenance window concept, I'm not sure how >> this should work. That is, during business hours the table doesn't >> need vacuuming, but it will when the evening maintenance window opens up. >> > > I intend to work on the maintenance window idea for 8.3. I'm not sure > if I'll be able to introduce the worker process stuff in there as well. > I actually haven't done much design on the stuff so I can't say. > > What does a maintenance window mean? I am slightly fearful that it as other improvements to vacuum are made, it will change it's meaning. There has been discussion about a bitmap of dirty pages in a relation for vacuum to clean. Do that effect what maintenance means? eg. Does maintenance mean I can only scan the whole relation for XID wrap in maintenance mode and not during non-maintenance time. Does it mean we don't vacuum at all in non-maintenance mode. Or do we just have a different set of thresholds during maintenance. Further to this was a patch a long time ago for partial vacuum, which only vacuumed part of the relation. It was rejected on grounds of not helping as the index cleanup is the expensive part. My view on this is, if with a very large table you should be able to vacuum until you fill your maintenance work mem. You are then forced to process indexes. Then that is a good time to stop vacuuming. You would have to start the process again effectively. This may also change the meaning of maintenance window. Again, only full relation scans in maintenance times, possibly something else entirely. I am unsure of what the long term goal of the maintenance window is. I understand it's to produce a time when vacuum is able to be more aggressive on the system. But I do not know what that means in light of other improvements such as those listed above. Coming up with a method for maintenance window that just used a separate set of thresholds is one option. However is that the best thing to do. Some clarity here from others would probably help. But I also think we need to consider the big picture of where vacuum is going before inventing a mechanism that may not mean anything come 8.4 > Now, if you (Matthew, or Glen as well!) were to work on that it'll be > appreciated ;-) and we could team up. > I am happy to try and put in some design thought and discussion with others to come up with something that will work well. Regards Russell Smith
Alvaro Herrera wrote: > No, we currently don't expose the number of dead tuples which autovacuum > uses. Patch submitted :-)
Russell Smith wrote: > Alvaro Herrera wrote: >> I intend to work on the maintenance window idea for 8.3. I'm not sure >> if I'll be able to introduce the worker process stuff in there as well. >> I actually haven't done much design on the stuff so I can't say. >> >> > What does a maintenance window mean? I am slightly fearful that it as > other improvements to vacuum are made, it will change it's meaning. The maintenance window design as I understand it (Alvaro chime in if I get this wrong) is that we will be able to specify blocks of time that are assigned specific autovacuum settings. For example we might define a maintenance window of Sunday morning from 1AM - 8AM, during that time all autvacuum thresholds will be dropped to .01, that way everything will get vacuumed that needs it during that window. Outside of the window default autovacuum settings apply. > There has been discussion about a bitmap of dirty pages in a relation > for vacuum to clean. Do that effect what maintenance means? eg. > Does maintenance mean I can only scan the whole relation for XID wrap > in maintenance mode and not during non-maintenance time. Does it mean > we don't vacuum at all in non-maintenance mode. Or do we just have a > different set of thresholds during maintenance. Different thresholds during maintenance window. > Further to this was a patch a long time ago for partial vacuum, which > only vacuumed part of the relation. It was rejected on grounds of not > helping as the index cleanup is the expensive part. My view on this > is, if with a very large table you should be able to vacuum until you > fill your maintenance work mem. You are then forced to process > indexes. Then that is a good time to stop vacuuming. You would have > to start the process again effectively. This may also change the > meaning of maintenance window. Again, only full relation scans in > maintenance times, possibly something else entirely. I'm not sure how partial vacuums will effect autovacuum, but I'm not going to worry about it until it gets accepted which I don't think is going to happen anytime soon. BTW, when a vacuum starts during a maintenance window but doesn't finish before the window closes, I think it should continue running, however I believe the default vacuum delay setting will apply which could be setup to help reduce the impact that vacuum has outside the maintenance window. > I am unsure of what the long term goal of the maintenance window is. > I understand it's to produce a time when vacuum is able to be more > aggressive on the system. But I do not know what that means in light > of other improvements such as those listed above. Coming up with a > method for maintenance window that just used a separate set of > thresholds is one option. However is that the best thing to do. Some > clarity here from others would probably help. But I also think we > need to consider the big picture of where vacuum is going before > inventing a mechanism that may not mean anything come 8.4 I think for now all we are talking about are different thresholds. As newer vacuuming options become available we should consider how they apply, but we aren't there yet.
On Thu, 2006-12-21 at 18:03, Matthew T. O'Connor wrote: > The maintenance window design as I understand it (Alvaro chime in if I > get this wrong) is that we will be able to specify blocks of time that > are assigned specific autovacuum settings. For example we might define > a maintenance window of Sunday morning from 1AM - 8AM, during that time > all autvacuum thresholds will be dropped to .01, that way everything > will get vacuumed that needs it during that window. Outside of the > window default autovacuum settings apply. Changing thresholds is not a viable solution for all the cases. If I have a huge table with many indexes, I still don't want to vacuum it unless there are a significant amount of dead pages so that the sequential scan of it and it's indexes pays off. In this case dropping the autovacuum threshold would be totally counterproductive even at night. This solution would only rule out really static tables, which don't change almost at all. In real life there are many more possible data access scenarios... From all the discussion here I think the most benefit would result from a means to assign tables to different categories, and set up separate autovacuum rules per category (be it time window when vacuuming is allowed, autovacuum processes assigned, cost settings, etc). I doubt you can really define upfront all the vacuum strategies you would need in real life, so why not let the user define it ? Define the categories by assigning tables to them, and the rules per category. Then you can decide what rules to implement, and what should be the defaults... Cheers, Csaba.
> I intend to work on the maintenance window idea for 8.3. I'm not sure > if I'll be able to introduce the worker process stuff in there as well. > I actually haven't done much design on the stuff so I can't say. Something to consider, per-day maintenance windows, where Sat & Sun could have 24-hour window, and trying to vacuum largest tables during longest windows. This wouldn't work for every server, but for many... -- Scott Ribe scott_ribe@killerbytes.com http://www.killerbytes.com/ (303) 722-0567 voice
Csaba Nagy wrote: > On Thu, 2006-12-21 at 18:03, Matthew T. O'Connor wrote: > > The maintenance window design as I understand it (Alvaro chime in if I > > get this wrong) is that we will be able to specify blocks of time that > > are assigned specific autovacuum settings. For example we might define > > a maintenance window of Sunday morning from 1AM - 8AM, during that time > > all autvacuum thresholds will be dropped to .01, that way everything > > will get vacuumed that needs it during that window. Outside of the > > window default autovacuum settings apply. > > Changing thresholds is not a viable solution for all the cases. My idea was to change autovacuum parameters in general, for example the "enable" parameter. And the idea is to be able to do it per-table (or rather, per table group), so you can group all your mostly-static tables in a group and then say "I don't want this group to be vacuumed". > From all the discussion here I think the most benefit would result from > a means to assign tables to different categories, and set up separate > autovacuum rules per category (be it time window when vacuuming is > allowed, autovacuum processes assigned, cost settings, etc). I doubt you > can really define upfront all the vacuum strategies you would need in > real life, so why not let the user define it ? Define the categories by > assigning tables to them, and the rules per category. Then you can > decide what rules to implement, and what should be the defaults... Hmm, yeah, I think this is more or less what I have in mind. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
On Thu, 2006-12-21 at 18:41, Alvaro Herrera wrote: > > From all the discussion here I think the most benefit would result from > > a means to assign tables to different categories, and set up separate > > autovacuum rules per category (be it time window when vacuuming is > > allowed, autovacuum processes assigned, cost settings, etc). I doubt you > > can really define upfront all the vacuum strategies you would need in > > real life, so why not let the user define it ? Define the categories by > > assigning tables to them, and the rules per category. Then you can > > decide what rules to implement, and what should be the defaults... > > Hmm, yeah, I think this is more or less what I have in mind. Cool :-) Can I suggest to also consider the idea of some kind of autovacuum process group, with settings like: - number of processes running in parallel; - time windows when they are allowed to run; Then have the table categories with all the rest of the threshold/cost/delay settings. Then have the possibility to assign tables to categories, and to assign categories to processing groups. I think this would allow the most flexibility with the minimum of repetition in settings (from the user perspective). Cheers, Csaba.
After takin a swig o' Arrakan spice grog, nagy@ecircle-ag.com (Csaba Nagy) belched out: > On Thu, 2006-12-21 at 18:41, Alvaro Herrera wrote: >> > From all the discussion here I think the most benefit would result from >> > a means to assign tables to different categories, and set up separate >> > autovacuum rules per category (be it time window when vacuuming is >> > allowed, autovacuum processes assigned, cost settings, etc). I doubt you >> > can really define upfront all the vacuum strategies you would need in >> > real life, so why not let the user define it ? Define the categories by >> > assigning tables to them, and the rules per category. Then you can >> > decide what rules to implement, and what should be the defaults... >> >> Hmm, yeah, I think this is more or less what I have in mind. > > Cool :-) > > Can I suggest to also consider the idea of some kind of autovacuum > process group, with settings like: > > - number of processes running in parallel; > - time windows when they are allowed to run; > > Then have the table categories with all the rest of the > threshold/cost/delay settings. > > Then have the possibility to assign tables to categories, and to assign > categories to processing groups. > > I think this would allow the most flexibility with the minimum of > repetition in settings (from the user perspective). Seems to me that you could get ~80% of the way by having the simplest "2 queue" implementation, where tables with size < some threshold get thrown at the "little table" queue, and tables above that size go to the "big table" queue. That should keep any small tables from getting "vacuum-starved." I'd think the next step would be to increase the number of queues, perhaps in a time-based fashion. There might be times when it's acceptable to vacuum 5 tables at once, so you burn thru little tables "like the blazes," and handle larger ones fairly promptly. And other times when you don't want to do *any* big tables, and limit a single queue to just the itty bitty ones. This approach allows you to stay mostly heuristic-based, as opposed to having to describe policies in gratuitous detail. Having a mechanism that requires enormous DBA effort and where there is considerable risk of simple configuration errors that will be hard to notice may not be the best kind of "feature" :-). -- let name="cbbrowne" and tld="gmail.com" in name ^ "@" ^ tld;; http://linuxdatabases.info/info/slony.html "You can measure a programmer's perspective by noting his attitude on the continuing vitality of FORTRAN." -- Alan Perlis
> Having a mechanism that requires enormous DBA effort and where there > is considerable risk of simple configuration errors that will be hard > to notice may not be the best kind of "feature" :-). Why not? It seems to have worked remarkably well for the market leader ;-) -- Scott Ribe scott_ribe@killerbytes.com http://www.killerbytes.com/ (303) 722-0567 voice
Christopher Browne wrote: > Seems to me that you could get ~80% of the way by having the simplest > "2 queue" implementation, where tables with size < some threshold get > thrown at the "little table" queue, and tables above that size go to > the "big table" queue. > > That should keep any small tables from getting "vacuum-starved." Hmm, would it make sense to keep 2 queues, one that goes through the tables in smaller-to-larger order, and the other one in the reverse direction? I am currently writing a design on how to create "vacuum queues" but I'm thinking that maybe it's getting too complex to handle, and a simple idea like yours is enough (given sufficient polish). -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
A long time ago, in a galaxy far, far away, alvherre@commandprompt.com (Alvaro Herrera) wrote: > Christopher Browne wrote: > >> Seems to me that you could get ~80% of the way by having the >> simplest "2 queue" implementation, where tables with size < some >> threshold get thrown at the "little table" queue, and tables above >> that size go to the "big table" queue. >> >> That should keep any small tables from getting "vacuum-starved." > > Hmm, would it make sense to keep 2 queues, one that goes through the > tables in smaller-to-larger order, and the other one in the reverse > direction? Interesting approach; that would mean having just one priority queue for all the work. That seems to simplify things a bit, which is a good thing. Unifying policies further might have some merit, too. The worker processes (that do the vacuuming) could be set up to alternate between head and tail of the queue. That is, a worker process could vacuum the littlest table and then go after the biggest table. That way, they'd eat at both ends towards the middle. Adding more workers could easily add to the speed at which both ends of the queue get eaten (assuming you've got the I/O to support having 4 or 5 vacuums running concurrently). There is one thing potentially bad, with that; the thing we never want is for all the workers to get busy on the biggest tables so that little ones are no longer being serviced. So there needs to be a way to make sure that there's one worker devoted to "little tables." I suppose the rule may be that the 1st worker process *never* goes after the biggest tables. That ought to be enough to prevent starvation. > I am currently writing a design on how to create "vacuum queues" but > I'm thinking that maybe it's getting too complex to handle, and a > simple idea like yours is enough (given sufficient polish). There's plenty to like about coming up with a reasonable set of heuristics... -- output = ("cbbrowne" "@" "acm.org") http://linuxdatabases.info/info/slony.html Rules of the Evil Overlord #191. "I will not appoint a relative to my staff of advisors. Not only is nepotism the cause of most breakdowns in policy, but it also causes trouble with the EEOC." <http://www.eviloverlord.com/>
A long time ago, in a galaxy far, far away, alvherre@commandprompt.com (Alvaro Herrera) wrote: > Christopher Browne wrote: > >> Seems to me that you could get ~80% of the way by having the >> simplest "2 queue" implementation, where tables with size < some >> threshold get thrown at the "little table" queue, and tables above >> that size go to the "big table" queue. >> >> That should keep any small tables from getting "vacuum-starved." > > Hmm, would it make sense to keep 2 queues, one that goes through the > tables in smaller-to-larger order, and the other one in the reverse > direction? Interesting approach; that would mean having just one priority queue for all the work. That seems to simplify things a bit, which is a good thing. Unifying policies further might have some merit, too. The worker processes (that do the vacuuming) could be set up to alternate between head and tail of the queue. That is, a worker process could vacuum the littlest table and then go after the biggest table. That way, they'd eat at both ends towards the middle. Adding more workers could easily add to the speed at which both ends of the queue get eaten (assuming you've got the I/O to support having 4 or 5 vacuums running concurrently). There is one thing potentially bad, with that; the thing we never want is for all the workers to get busy on the biggest tables so that little ones are no longer being serviced. So there needs to be a way to make sure that there's one worker devoted to "little tables." I suppose the rule may be that the 1st worker process *never* goes after the biggest tables. That ought to be enough to prevent starvation. > I am currently writing a design on how to create "vacuum queues" but > I'm thinking that maybe it's getting too complex to handle, and a > simple idea like yours is enough (given sufficient polish). There's plenty to like about coming up with a reasonable set of heuristics... -- output = ("cbbrowne" "@" "acm.org") http://linuxdatabases.info/info/slony.html Rules of the Evil Overlord #191. "I will not appoint a relative to my staff of advisors. Not only is nepotism the cause of most breakdowns in policy, but it also causes trouble with the EEOC." <http://www.eviloverlord.com/>
On Sun, 2006-12-24 at 03:03, Christopher Browne wrote: [snip] > Seems to me that you could get ~80% of the way by having the simplest > "2 queue" implementation, where tables with size < some threshold get > thrown at the "little table" queue, and tables above that size go to > the "big table" queue. That would most definitely not cut it for me, I have more than 2 categories of tables: - a few small but very often updated/inserted/deleted table: these must be continuously vacuumed, your "little queue" is not good enough for that, as even the round trip between the small tables could lead to bloat on them; - a few small and moderately updated, that could live with the "little queue"; - a few big and frequently updated, but which only have a small percentage of rows actively updated at any time: those could live with the big queue; - the rest which are rarely updated, I would put those in a separate queue so they won't affect the rest, cause vacuuming them is really mostly not critical; The point is that I'm not sure there couldn't be even more reasons to split the tables in even more queues based on the importance of vacuuming them combined with update rate and their size. If I can set up my own queues I can experiment with what works best for me... for the base setup you could set up some default queues. I wonder though how would you handle dynamics of tables, I mean when will a small table which grows start to be considered a big table for the purpose of putting it in one queue or the other ? I guess it would be done on analyzing the table, which is also handled by autovacuum, so tables with no vacuum queue settings could go to one of the 2 default queues you mention. > That should keep any small tables from getting "vacuum-starved." > > I'd think the next step would be to increase the number of queues, > perhaps in a time-based fashion. There might be times when it's > acceptable to vacuum 5 tables at once, so you burn thru little tables > "like the blazes," and handle larger ones fairly promptly. And other > times when you don't want to do *any* big tables, and limit a single > queue to just the itty bitty ones. This is all nice and it would be cool if you could set it up per vacuum queue. I mean how much more effort would be to allow vacuum queues with generic settings like time windows with max number of threads for each window, and let the user explicitly assign tables to those queues, instead of hard coding the queues and their settings and assign tables to them based on size or any other heuristics ? For the average application which needs simple settings, there could be a default setup with the 2 queues you mention. If it would be possible to set up some rules to assign tables to queues based on their properties on analyze time, instead of explicitly assigning to one queue or other, that would be nice too, and then you can completely cover the default setup with those settings, and allow for more complex setups for those who need it. > This approach allows you to stay mostly heuristic-based, as opposed to > having to describe policies in gratuitous detail. I agree that for simple setups that would be OK, but like I said, if it would be easy enough to code that heuristics, and provide some sane setup as default, and then let the user optimize it, that would be a cool solution. Now it's true I don't really know how would you code 'assign all tables which are smaller than x rows to vacuum queue "little-tables"' ... maybe by providing a function to the queue which "matches" on the table ? And you can change that function ? No idea, but it probably can be done... > Having a mechanism that requires enormous DBA effort and where there > is considerable risk of simple configuration errors that will be hard > to notice may not be the best kind of "feature" :-). I think most people will not want to touch the default settings unless it will not work good enough for them. I definitely not like too much that I had to set up some cron jobs beside autovacuum, as they are most definitely not doing optimal job, but autovacuum was not doing that either, and I'm afraid a 2-queue system would also not do it at least for the queue-like tables I have, which must be vacuumed continuously, but only if they need it... that's what I expect from autovacuum, to vacuum all tables in the proper periodicity/time window for each of them, but only if they need it... and I can imagine way more such periodicity/time window settings than 2. Now if autovacuum could figure out on itself all those settings, that would be even cooler, but if I can set it up myself that would be good enough. Actually I think all vacuum patterns could be automatically figured out by looking at the statistics AND the dynamics of those statistics (i.e. it changes in bursts, or steadily increasing over time, etc.), and possibly also the read access statistics (there's no big reward in too frequently vacuuming a table which is only inserted and deleted and rarely read), and perhaps some hints from the user about speed requirements for specific tables. The problem with all this is that I doubt there is enough experience to write such a heuristics to optimally cover all situations, and even if there were, it could result in really complex code, so that's why I think it is more reasonable to let people set up their vacuum queues... Another point: it would be nice if autovacuum could also decide to do a full vacuum, or even better a CLUSTER under certain circumstances for tables which are badly bloated (you could argue that should never happen if autovacuum is set up properly, but think about a queue-like table heavily updated during a backup is running). Of course this can also backfire if set up by default, so I guess this would have to be set up explicitly... possibly with rules like table size, max bloat allowed, time window, etc. One thing to avoid badly locking the application is to acquire an exclusive lock with nowait and only do the full vacuum if the lock succeeds (to avoid situations like: a backup is running, and it will for the next 2 hours, we ask for an exclusive lock, will stay on hold, but in the same time we lock all new read requests for the next 2 hours till the backup is done... while the operation we wanted to do is guarantied to be finished in 10 seconds, as the table is heavily bloated but still small). Another thing autovacuum could figure out is not to do a vacuum at all if there is a long running transaction running and disabling anyway the work vacuum would do (although I'm not sure it does not do this one already, does it ?). Well, maybe not all what I rambled along makes sense, but I dumped my brain now anyway... hope I didn't bore you too much :-) Cheers, Csaba.
nagy@ecircle-ag.com (Csaba Nagy) writes: > On Sun, 2006-12-24 at 03:03, Christopher Browne wrote: > [snip] >> Seems to me that you could get ~80% of the way by having the simplest >> "2 queue" implementation, where tables with size < some threshold get >> thrown at the "little table" queue, and tables above that size go to >> the "big table" queue. > > That would most definitely not cut it for me, I have more than 2 > categories of tables: > > - a few small but very often updated/inserted/deleted table: these must > be continuously vacuumed, your "little queue" is not good enough for > that, as even the round trip between the small tables could lead to > bloat on them; I disagree; if we added more "work processes," that could eat quickly through the short end of the queue. > - a few small and moderately updated, that could live with the "little > queue"; > > - a few big and frequently updated, but which only have a small > percentage of rows actively updated at any time: those could live with > the big queue; > > - the rest which are rarely updated, I would put those in a separate > queue so they won't affect the rest, cause vacuuming them is really > mostly not critical; > > The point is that I'm not sure there couldn't be even more reasons > to split the tables in even more queues based on the importance of > vacuuming them combined with update rate and their size. If I can > set up my own queues I can experiment with what works best for > me... for the base setup you could set up some default queues. I > wonder though how would you handle dynamics of tables, I mean when > will a small table which grows start to be considered a big table > for the purpose of putting it in one queue or the other ? I guess it > would be done on analyzing the table, which is also handled by > autovacuum, so tables with no vacuum queue settings could go to one > of the 2 default queues you mention. The heuristic I was thinking of didn't involve having two queues, but rather just 1. By having some size information, work processes could eat at the queue from both ends. If you have cases where tables need to be vacuumed *really* frequently, then you make sure that they are being injected frequently, and that some of the workers are tied to Just Doing Small Tables. I think that *does* cover your scenario quite adequately, and without having to get into having a bunch of queues. The heuristic is incomplete in one other fashion, namely that it doesn't guarantee that tables in the middle will ever get "gotten to." That mandates having a third policy, namely to have a worker that goes through tables in the (singular) queue some form of chronological order. >> That should keep any small tables from getting "vacuum-starved." >> >> I'd think the next step would be to increase the number of queues, >> perhaps in a time-based fashion. There might be times when it's >> acceptable to vacuum 5 tables at once, so you burn thru little tables >> "like the blazes," and handle larger ones fairly promptly. And other >> times when you don't want to do *any* big tables, and limit a single >> queue to just the itty bitty ones. > > This is all nice and it would be cool if you could set it up per vacuum > queue. I mean how much more effort would be to allow vacuum queues with > generic settings like time windows with max number of threads for each > window, and let the user explicitly assign tables to those queues, > instead of hard coding the queues and their settings and assign tables > to them based on size or any other heuristics ? > > For the average application which needs simple settings, there could be > a default setup with the 2 queues you mention. If it would be possible > to set up some rules to assign tables to queues based on their > properties on analyze time, instead of explicitly assigning to one queue > or other, that would be nice too, and then you can completely cover the > default setup with those settings, and allow for more complex setups for > those who need it. My thinking has headed more towards simplifying this; two queues seems to be one too many :-). >> This approach allows you to stay mostly heuristic-based, as opposed to >> having to describe policies in gratuitous detail. > > I agree that for simple setups that would be OK, but like I said, if it > would be easy enough to code that heuristics, and provide some sane > setup as default, and then let the user optimize it, that would be a > cool solution. > > Now it's true I don't really know how would you code 'assign all tables > which are smaller than x rows to vacuum queue "little-tables"' ... maybe > by providing a function to the queue which "matches" on the table ? And > you can change that function ? No idea, but it probably can be done... Based on the three policies I've seen, it could make sense to assign worker policies: 1. You have a worker that moves its way through the queue in some sort of sequential order, based on when the table is added to the queue, to guarantee that all tables get processed, eventually. 2. You have workers that always pull the "cheapest" tables in the queue, perhaps with some sort of upper threshold that they won't go past. 3. You have workers that alternate between eating from the two ends of the queue. Only one queue is needed, and there's only one size parameter involved. Having multiple workers of type #2 seems to me to solve the problem you're concerned about. -- (format nil "~S@~S" "cbbrowne" "cbbrowne.com") http://cbbrowne.com/info/spiritual.html <a href="http://www.netizen.com.au/">thorfinn@netizen.com.au</a> Millihelen, adj: The amount of beauty required to launch one ship.
On Mon, 2007-01-08 at 22:29, Chris Browne wrote: [snip] > Based on the three policies I've seen, it could make sense to assign > worker policies: > > 1. You have a worker that moves its way through the queue in some sort of > sequential order, based on when the table is added to the queue, to > guarantee that all tables get processed, eventually. > > 2. You have workers that always pull the "cheapest" tables in the > queue, perhaps with some sort of upper threshold that they won't go > past. > > 3. You have workers that alternate between eating from the two ends of the > queue. > > Only one queue is needed, and there's only one size parameter > involved. > Having multiple workers of type #2 seems to me to solve the problem > you're concerned about. This sounds better, but define "cheapest" in #2... I actually want to continuously vacuum tables which are small, heavily recycled (insert/update/delete), and which would bloat quickly. So how do you define the cost function for having these tables the "cheapest" ? And how will you define the worker thread count policy ? Always 1 worker per category, or you can define the number of threads in the 3 categories ? Or you still have in mind time window policies with allowed number of threads per worker category ? (those numbers could be 0 to disable a a worker category). Other thing, how will the vacuum queue be populated ? Or the "queue" here means nothing, all workers will always go throughall tables to pick one based on their own criteria ? My concern here is that the current way of checking 1 DB perminute is not going to work with category #2 tables, they really have to be vacuumed continuously sometimes. Cheers, Csaba.
Csaba Nagy wrote: > Other thing, how will the vacuum queue be populated ? Or the "queue" here means nothing, all workers will always go throughall tables to pick one based on their own criteria ? My concern here is that the current way of checking 1 DB perminute is not going to work with category #2 tables, they really have to be vacuumed continuously sometimes. Without getting into all the details, the autovacuum naptime is a GUC variable right now, so it can be much more frequent than the current default which is 60 seconds.
On Tue, 2007-01-09 at 17:31, Matthew T. O'Connor wrote: > Without getting into all the details, the autovacuum naptime is a GUC > variable right now, so it can be much more frequent than the current > default which is 60 seconds. Hmm, for some reason I thought the granularity is minutes, but it is indeed in seconds... one more thing learned. Cheers, Csaba.
On Tue, 2007-01-09 at 17:36, Csaba Nagy wrote: > On Tue, 2007-01-09 at 17:31, Matthew T. O'Connor wrote: > > Without getting into all the details, the autovacuum naptime is a GUC > > variable right now, so it can be much more frequent than the current > > default which is 60 seconds. > > Hmm, for some reason I thought the granularity is minutes, but it is > indeed in seconds... one more thing learned. OK, so after checking my config, it is still not optimal because it refers to all the data bases in the cluster. I have setups where I have multiple data bases in the same cluster, with various degrees of activity... some of them should be checked continuously, some rarely... so now if I let the default 60 seconds, each data base will be checked in db_count * (60 + vacuum time) seconds. This is not optimal... some of the DBs have way much activity than others. Those I would like to be checked say each 10 seconds, the rest each 5 minutes... Cheers, Csaba.
nagy@ecircle-ag.com (Csaba Nagy) writes: > On Sun, 2006-12-24 at 03:03, Christopher Browne wrote: > [snip] >> Seems to me that you could get ~80% of the way by having the simplest >> "2 queue" implementation, where tables with size < some threshold get >> thrown at the "little table" queue, and tables above that size go to >> the "big table" queue. > > That would most definitely not cut it for me, I have more than 2 > categories of tables: > > - a few small but very often updated/inserted/deleted table: these must > be continuously vacuumed, your "little queue" is not good enough for > that, as even the round trip between the small tables could lead to > bloat on them; I disagree; if we added more "work processes," that could eat quickly through the short end of the queue. > - a few small and moderately updated, that could live with the "little > queue"; > > - a few big and frequently updated, but which only have a small > percentage of rows actively updated at any time: those could live with > the big queue; > > - the rest which are rarely updated, I would put those in a separate > queue so they won't affect the rest, cause vacuuming them is really > mostly not critical; > > The point is that I'm not sure there couldn't be even more reasons > to split the tables in even more queues based on the importance of > vacuuming them combined with update rate and their size. If I can > set up my own queues I can experiment with what works best for > me... for the base setup you could set up some default queues. I > wonder though how would you handle dynamics of tables, I mean when > will a small table which grows start to be considered a big table > for the purpose of putting it in one queue or the other ? I guess it > would be done on analyzing the table, which is also handled by > autovacuum, so tables with no vacuum queue settings could go to one > of the 2 default queues you mention. The heuristic I was thinking of didn't involve having two queues, but rather just 1. By having some size information, work processes could eat at the queue from both ends. If you have cases where tables need to be vacuumed *really* frequently, then you make sure that they are being injected frequently, and that some of the workers are tied to Just Doing Small Tables. I think that *does* cover your scenario quite adequately, and without having to get into having a bunch of queues. The heuristic is incomplete in one other fashion, namely that it doesn't guarantee that tables in the middle will ever get "gotten to." That mandates having a third policy, namely to have a worker that goes through tables in the (singular) queue some form of chronological order. >> That should keep any small tables from getting "vacuum-starved." >> >> I'd think the next step would be to increase the number of queues, >> perhaps in a time-based fashion. There might be times when it's >> acceptable to vacuum 5 tables at once, so you burn thru little tables >> "like the blazes," and handle larger ones fairly promptly. And other >> times when you don't want to do *any* big tables, and limit a single >> queue to just the itty bitty ones. > > This is all nice and it would be cool if you could set it up per vacuum > queue. I mean how much more effort would be to allow vacuum queues with > generic settings like time windows with max number of threads for each > window, and let the user explicitly assign tables to those queues, > instead of hard coding the queues and their settings and assign tables > to them based on size or any other heuristics ? > > For the average application which needs simple settings, there could be > a default setup with the 2 queues you mention. If it would be possible > to set up some rules to assign tables to queues based on their > properties on analyze time, instead of explicitly assigning to one queue > or other, that would be nice too, and then you can completely cover the > default setup with those settings, and allow for more complex setups for > those who need it. My thinking has headed more towards simplifying this; two queues seems to be one too many :-). >> This approach allows you to stay mostly heuristic-based, as opposed to >> having to describe policies in gratuitous detail. > > I agree that for simple setups that would be OK, but like I said, if it > would be easy enough to code that heuristics, and provide some sane > setup as default, and then let the user optimize it, that would be a > cool solution. > > Now it's true I don't really know how would you code 'assign all tables > which are smaller than x rows to vacuum queue "little-tables"' ... maybe > by providing a function to the queue which "matches" on the table ? And > you can change that function ? No idea, but it probably can be done... Based on the three policies I've seen, it could make sense to assign worker policies: 1. You have a worker that moves its way through the queue in some sort of sequential order, based on when the table is added to the queue, to guarantee that all tables get processed, eventually. 2. You have workers that always pull the "cheapest" tables in the queue, perhaps with some sort of upper threshold that they won't go past. 3. You have workers that alternate between eating from the two ends of the queue. Only one queue is needed, and there's only one size parameter involved. Having multiple workers of type #2 seems to me to solve the problem you're concerned about. -- (format nil "~S@~S" "cbbrowne" "cbbrowne.com") http://cbbrowne.com/info/spiritual.html <a href="http://www.netizen.com.au/">thorfinn@netizen.com.au</a> Millihelen, adj: The amount of beauty required to launch one ship.
Csaba Nagy wrote: > On Tue, 2007-01-09 at 17:36, Csaba Nagy wrote: > >> On Tue, 2007-01-09 at 17:31, Matthew T. O'Connor wrote: >> >>> Without getting into all the details, the autovacuum naptime is a GUC >>> variable right now, so it can be much more frequent than the current >>> default which is 60 seconds. >>> >> Hmm, for some reason I thought the granularity is minutes, but it is >> indeed in seconds... one more thing learned. >> > > OK, so after checking my config, it is still not optimal because it > refers to all the data bases in the cluster. I have setups where I have > multiple data bases in the same cluster, with various degrees of > activity... some of them should be checked continuously, some rarely... > so now if I let the default 60 seconds, each data base will be checked > in db_count * (60 + vacuum time) seconds. This is not optimal... some of > the DBs have way much activity than others. Those I would like to be > checked say each 10 seconds, the rest each 5 minutes... > Agreed, this is the point of this whole thread that there are lots of setups where autovacuum could do better. My point was only that as we move forward with these multiple queue / multiple worker process setups etc, that we already have some infrastructure to make things go faster.
A long time ago, in a galaxy far, far away, nagy@ecircle-ag.com (Csaba Nagy) wrote: > On Mon, 2007-01-08 at 22:29, Chris Browne wrote: > [snip] >> Based on the three policies I've seen, it could make sense to assign >> worker policies: >> >> 1. You have a worker that moves its way through the queue in some sort of >> sequential order, based on when the table is added to the queue, to >> guarantee that all tables get processed, eventually. >> >> 2. You have workers that always pull the "cheapest" tables in the >> queue, perhaps with some sort of upper threshold that they won't go >> past. >> >> 3. You have workers that alternate between eating from the two ends of the >> queue. >> >> Only one queue is needed, and there's only one size parameter >> involved. >> Having multiple workers of type #2 seems to me to solve the problem >> you're concerned about. > > This sounds better, but define "cheapest" in #2... I actually want to > continuously vacuum tables which are small, heavily recycled > (insert/update/delete), and which would bloat quickly. So how do you > define the cost function for having these tables the "cheapest" ? Cost would be based on the number of pages in the table. The smallest tables are obviously the cheapest to vacuum. That's separate from the policy for adding tables to the queue; THAT would sensibly be based on the number of dead tuples; the current policy of autovacuum seems not unreasonable... > And how will you define the worker thread count policy ? Always 1 > worker per category, or you can define the number of threads in the > 3 categories ? Or you still have in mind time window policies with > allowed number of threads per worker category ? (those numbers could > be 0 to disable a a worker category). It would make a lot of sense to have time ranges that would indicate when different values were wanted. Good question... > Other thing, how will the vacuum queue be populated ? Or the "queue" > here means nothing, all workers will always go through all tables to > pick one based on their own criteria ? My concern here is that the > current way of checking 1 DB per minute is not going to work with > category #2 tables, they really have to be vacuumed continuously > sometimes. I think it makes considerable sense to have a queue table for this. Having one of the threads look for new entries makes considerable sense. Offering the Gentle DBA the ability to add in entries based on their special knowledge would also seem sensible. -- (format nil "~S@~S" "cbbrowne" "gmail.com") http://linuxdatabases.info/info/slony.html Keeping instructions and operands in different memories saves .20 (.09) microseconds.
A long time ago, in a galaxy far, far away, nagy@ecircle-ag.com (Csaba Nagy) wrote: > On Mon, 2007-01-08 at 22:29, Chris Browne wrote: > [snip] >> Based on the three policies I've seen, it could make sense to assign >> worker policies: >> >> 1. You have a worker that moves its way through the queue in some sort of >> sequential order, based on when the table is added to the queue, to >> guarantee that all tables get processed, eventually. >> >> 2. You have workers that always pull the "cheapest" tables in the >> queue, perhaps with some sort of upper threshold that they won't go >> past. >> >> 3. You have workers that alternate between eating from the two ends of the >> queue. >> >> Only one queue is needed, and there's only one size parameter >> involved. >> Having multiple workers of type #2 seems to me to solve the problem >> you're concerned about. > > This sounds better, but define "cheapest" in #2... I actually want to > continuously vacuum tables which are small, heavily recycled > (insert/update/delete), and which would bloat quickly. So how do you > define the cost function for having these tables the "cheapest" ? Cost would be based on the number of pages in the table. The smallest tables are obviously the cheapest to vacuum. That's separate from the policy for adding tables to the queue; THAT would sensibly be based on the number of dead tuples; the current policy of autovacuum seems not unreasonable... > And how will you define the worker thread count policy ? Always 1 > worker per category, or you can define the number of threads in the > 3 categories ? Or you still have in mind time window policies with > allowed number of threads per worker category ? (those numbers could > be 0 to disable a a worker category). It would make a lot of sense to have time ranges that would indicate when different values were wanted. Good question... > Other thing, how will the vacuum queue be populated ? Or the "queue" > here means nothing, all workers will always go through all tables to > pick one based on their own criteria ? My concern here is that the > current way of checking 1 DB per minute is not going to work with > category #2 tables, they really have to be vacuumed continuously > sometimes. I think it makes considerable sense to have a queue table for this. Having one of the threads look for new entries makes considerable sense. Offering the Gentle DBA the ability to add in entries based on their special knowledge would also seem sensible. -- (format nil "~S@~S" "cbbrowne" "gmail.com") http://linuxdatabases.info/info/slony.html Keeping instructions and operands in different memories saves .20 (.09) microseconds.
On Fri, 2006-12-29 at 20:25 -0300, Alvaro Herrera wrote: > Christopher Browne wrote: > > > Seems to me that you could get ~80% of the way by having the simplest > > "2 queue" implementation, where tables with size < some threshold get > > thrown at the "little table" queue, and tables above that size go to > > the "big table" queue. > > > > That should keep any small tables from getting "vacuum-starved." > > Hmm, would it make sense to keep 2 queues, one that goes through the > tables in smaller-to-larger order, and the other one in the reverse > direction? > > I am currently writing a design on how to create "vacuum queues" but I'm > thinking that maybe it's getting too complex to handle, and a simple > idea like yours is enough (given sufficient polish). Sounds good to me. My colleague Pavan has just suggested multiple autovacuums and then prototyped something almost as a side issue while trying to solve other problems. I'll show him this entry, maybe he saw it already? I wasn't following this discussion until now. The 2 queue implementation seemed to me to be the most straightforward implementation, mirroring Chris' suggestion. A few aspects that haven't been mentioned are: - if you have more than one VACUUM running, we'll need to watch memory management. Having different queues based upon table size is a good way of doing that, since the smaller queues have a naturally limited memory consumption. - with different size-based queues, the larger VACUUMs can be delayed so they take much longer, while the small tables can go straight through Some feedback from initial testing is that 2 queues probably isn't enough. If you have tables with 100s of blocks and tables with millions of blocks, the tables in the mid-range still lose out. So I'm thinking that a design with 3 queues based upon size ranges, plus the idea that when a queue is empty it will scan for tables slightly above/below its normal range. That way we wouldn't need to specify the cut-offs with a difficult to understand new set of GUC parameters, define them exactly and then have them be wrong when databases grow. The largest queue would be the one reserved for Xid wraparound avoidance. No table would be eligible for more than one queue at a time, though it might change between queues as it grows. Alvaro, have you completed your design? Pavan, what are your thoughts? -- Simon Riggs EnterpriseDB http://www.enterprisedb.com
Simon Riggs wrote: > Some feedback from initial testing is that 2 queues probably isn't > enough. If you have tables with 100s of blocks and tables with millions > of blocks, the tables in the mid-range still lose out. So I'm thinking > that a design with 3 queues based upon size ranges, plus the idea that > when a queue is empty it will scan for tables slightly above/below its > normal range. Yeah, eventually it occurred to me the fact that as soon as you have 2 queues, you may as well want to have 3 or in fact any number. Which in my proposal is very easily achieved. > Alvaro, have you completed your design? No, I haven't, and the part that's missing is precisely the queues stuff. I think I've been delaying posting it for too long, and that is harmful because it makes other people waste time thinking on issues that I may already have resolved, and delays the bashing that yet others will surely inflict on my proposal, which is never a good thing ;-) So maybe I'll put in a stub about the "queues" stuff and see how people like the whole thing. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
On Fri, 2007-01-12 at 19:33 -0300, Alvaro Herrera wrote: > > Alvaro, have you completed your design? > > No, I haven't, and the part that's missing is precisely the queues > stuff. I think I've been delaying posting it for too long, and that is > harmful because it makes other people waste time thinking on issues that > I may already have resolved, and delays the bashing that yet others will > surely inflict on my proposal, which is never a good thing ;-) So maybe > I'll put in a stub about the "queues" stuff and see how people like the > whole thing. I've not read a word spoken against the general idea, so I think we should pursue this actively for 8.3. It should be straightforward to harvest the good ideas, though there will definitely be many. Perhaps we should focus on the issues that might result, so that we address those before we spend time on the details of the user interface. Can we deadlock or hang from running multiple autovacuums? -- Simon Riggs EnterpriseDB http://www.enterprisedb.com
On Fri, Jan 12, 2007 at 07:33:05PM -0300, Alvaro Herrera wrote: > Simon Riggs wrote: > > > Some feedback from initial testing is that 2 queues probably isn't > > enough. If you have tables with 100s of blocks and tables with millions > > of blocks, the tables in the mid-range still lose out. So I'm thinking > > that a design with 3 queues based upon size ranges, plus the idea that > > when a queue is empty it will scan for tables slightly above/below its > > normal range. > > Yeah, eventually it occurred to me the fact that as soon as you have 2 > queues, you may as well want to have 3 or in fact any number. Which in > my proposal is very easily achieved. > > > > Alvaro, have you completed your design? > > No, I haven't, and the part that's missing is precisely the queues > stuff. I think I've been delaying posting it for too long, and that is > harmful because it makes other people waste time thinking on issues that > I may already have resolved, and delays the bashing that yet others will > surely inflict on my proposal, which is never a good thing ;-) So maybe > I'll put in a stub about the "queues" stuff and see how people like the > whole thing. Have you made any consideration of providing feedback on autovacuum to users? Right now we don't even know what tables were vacuumed when and what was reaped. This might actually be another topic. ---elein elein@varlena.com
Simon Riggs wrote: > On Fri, 2006-12-29 at 20:25 -0300, Alvaro Herrera wrote: >> Christopher Browne wrote: >> >>> Seems to me that you could get ~80% of the way by having the simplest >>> "2 queue" implementation, where tables with size < some threshold get >>> thrown at the "little table" queue, and tables above that size go to >>> the "big table" queue. >>> >>> That should keep any small tables from getting "vacuum-starved." > This is exectly what I am trying, two process autovacuum and a GUC to seperate small tables. In this case, one process takes up vacuuming of the small tables and other process vacuuming of the remaining tables as well as Xid avoidance related vacuuming. The goal is to avoid starvation of small tables when a large table is being vacuumed (which may take several hours) without adding too much complexity to the code. > > Some feedback from initial testing is that 2 queues probably isn't > enough. If you have tables with 100s of blocks and tables with millions > of blocks, the tables in the mid-range still lose out. So I'm thinking > that a design with 3 queues based upon size ranges, plus the idea that > when a queue is empty it will scan for tables slightly above/below its > normal range. That way we wouldn't need to specify the cut-offs with a > difficult to understand new set of GUC parameters, define them exactly > and then have them be wrong when databases grow. > > The largest queue would be the one reserved for Xid wraparound > avoidance. No table would be eligible for more than one queue at a time, > though it might change between queues as it grows. > > Alvaro, have you completed your design? > > Pavan, what are your thoughts? > IMO 2-queue is a good step forward, but in long term we may need to go for a multiprocess autovacuum where the number and tasks of processes are either demand based and/or user configurable. Another idea is to vacuum the tables in round-robin fashion where the quantum could be either "time" or "number of block". The autovacuum process would vacuum 'x' blocks of one table and then schedule next table in the queue. This would avoid starvation of small tables, though cost of index cleanup might go up because of increased IO. Any thoughts of this approach ? Thanks, Pavan
Simon Riggs wrote: > On Fri, 2006-12-29 at 20:25 -0300, Alvaro Herrera wrote: >> Christopher Browne wrote: >> >>> Seems to me that you could get ~80% of the way by having the simplest >>> "2 queue" implementation, where tables with size < some threshold get >>> thrown at the "little table" queue, and tables above that size go to >>> the "big table" queue. >>> >>> That should keep any small tables from getting "vacuum-starved." > This is exectly what I am trying, two process autovacuum and a GUC to seperate small tables. In this case, one process takes up vacuuming of the small tables and other process vacuuming of the remaining tables as well as Xid avoidance related vacuuming. The goal is to avoid starvation of small tables when a large table is being vacuumed (which may take several hours) without adding too much complexity to the code. > > Some feedback from initial testing is that 2 queues probably isn't > enough. If you have tables with 100s of blocks and tables with millions > of blocks, the tables in the mid-range still lose out. So I'm thinking > that a design with 3 queues based upon size ranges, plus the idea that > when a queue is empty it will scan for tables slightly above/below its > normal range. That way we wouldn't need to specify the cut-offs with a > difficult to understand new set of GUC parameters, define them exactly > and then have them be wrong when databases grow. > > The largest queue would be the one reserved for Xid wraparound > avoidance. No table would be eligible for more than one queue at a time, > though it might change between queues as it grows. > > Alvaro, have you completed your design? > > Pavan, what are your thoughts? > IMO 2-queue is a good step forward, but in long term we may need to go for a multiprocess autovacuum where the number and tasks of processes are either demand based and/or user configurable. Another idea is to vacuum the tables in round-robin fashion where the quantum could be either "time" or "number of block". The autovacuum process would vacuum 'x' blocks of one table and then schedule next table in the queue. This would avoid starvation of small tables, though cost of index cleanup might go up because of increased IO. Any thoughts of this approach ? Thanks, Pavan
Pavan Deolasee wrote: > Simon Riggs wrote: >> On Fri, 2006-12-29 at 20:25 -0300, Alvaro Herrera wrote: >>> Christopher Browne wrote: >>> >>>> Seems to me that you could get ~80% of the way by having the simplest >>>> "2 queue" implementation, where tables with size < some threshold get >>>> thrown at the "little table" queue, and tables above that size go to >>>> the "big table" queue. >>>> >>>> That should keep any small tables from getting "vacuum-starved." >> > > This is exectly what I am trying, two process autovacuum and a GUC to > seperate small tables. > > In this case, one process takes up vacuuming of the small tables and > other process vacuuming of the remaining tables as well as Xid > avoidance related vacuuming. The goal is to avoid starvation of small > tables when a large table is being vacuumed (which may take > several hours) without adding too much complexity to the code. Would it work to make the queues push the treshold into the direction of the still running queue if the other queue finishes before the still running one? This would achieve some kind of auto-tuning, but that is usually tricky. For example, what if one of the queues got stuck on a lock? -- Alban Hertroys alban@magproductions.nl magproductions b.v. T: ++31(0)534346874 F: ++31(0)534346876 M: I: www.magproductions.nl A: Postbus 416 7500 AK Enschede // Integrate Your World //
Simon Riggs wrote: > Perhaps we should focus on the issues that might result, so that we > address those before we spend time on the details of the user interface. > Can we deadlock or hang from running multiple autovacuums? If you were to run multiple autovacuum processes the way they are today, maybe. But that's not my intention -- the launcher would be the only one to read the catalogs; the workers would be started only to do a single VACUUM job. This reduces the risk of this kind of problems. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
elein wrote: > Have you made any consideration of providing feedback on autovacuum to users? > Right now we don't even know what tables were vacuumed when and what was > reaped. This might actually be another topic. I'd like to hear other people's opinions on Darcy Buskermolen proposal to have a log table, on which we'd register what did we run, at what time, how long did it last, how many tuples did it clean, etc. I feel having it on the regular text log is useful but it's not good enough. Keep in mind that in the future we may want to peek at that collected information to be able to take better scheduling decisions (or at least inform the DBA that he sucks). Now, I'd like this to be a VACUUM thing, not autovacuum. That means that manually-run vacuums would be logged as well. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
On Tuesday 16 January 2007 06:29, Alvaro Herrera wrote: > elein wrote: > > Have you made any consideration of providing feedback on autovacuum to > > users? Right now we don't even know what tables were vacuumed when and > > what was reaped. This might actually be another topic. > > I'd like to hear other people's opinions on Darcy Buskermolen proposal > to have a log table, on which we'd register what did we run, at what > time, how long did it last, how many tuples did it clean, etc. I feel > having it on the regular text log is useful but it's not good enough. > Keep in mind that in the future we may want to peek at that collected > information to be able to take better scheduling decisions (or at least > inform the DBA that he sucks). > > Now, I'd like this to be a VACUUM thing, not autovacuum. That means > that manually-run vacuums would be logged as well. Yes I did intend this thought for vacuum, not strictly autovacuum.
Alvaro Herrera wrote: > I'd like to hear other people's opinions on Darcy Buskermolen proposal > to have a log table, on which we'd register what did we run, at what > time, how long did it last, how many tuples did it clean, etc. I feel > having it on the regular text log is useful but it's not good enough. > Keep in mind that in the future we may want to peek at that collected > information to be able to take better scheduling decisions (or at least > inform the DBA that he sucks). I'm not familiar with his proposal, but I'm not sure what I think of logging vacuum (and perhaps analyze) commands to a table. We have never logged anything to tables inside PG. I would be worried about this eating a lot of space in some situations. I think most people would just be happy if we could get autovacuum to log it's actions at a much higher log level. I think that "autovacuum vacuumed table x" is important and shouldn't be all the way down at the debug level. The other (more involved) solution to this problem was proposed which was create a separate set of logging control params for autovacuum so that you can turn it up or down independent of the general server logging. > Now, I'd like this to be a VACUUM thing, not autovacuum. That means > that manually-run vacuums would be logged as well. +1
On Tue, 2007-01-16 at 07:16 -0800, Darcy Buskermolen wrote: > On Tuesday 16 January 2007 06:29, Alvaro Herrera wrote: > > elein wrote: > > > Have you made any consideration of providing feedback on autovacuum to > > > users? Right now we don't even know what tables were vacuumed when and > > > what was reaped. This might actually be another topic. > > > > I'd like to hear other people's opinions on Darcy Buskermolen proposal > > to have a log table, on which we'd register what did we run, at what > > time, how long did it last, how many tuples did it clean, etc. I feel > > having it on the regular text log is useful but it's not good enough. > > Keep in mind that in the future we may want to peek at that collected > > information to be able to take better scheduling decisions (or at least > > inform the DBA that he sucks). > > > > Now, I'd like this to be a VACUUM thing, not autovacuum. That means > > that manually-run vacuums would be logged as well. > > Yes I did intend this thought for vacuum, not strictly autovacuum. I agree, for all VACUUMs: we need a log table. The only way we can get a feedback loop on what has come before is by remembering what happened. Simply logging it is interesting, but not enough. There is some complexity there, because with many applications a small table gets VACUUMed every few minutes, so the log table would become a frequently updated table itself. I'd also suggest that we might want to take account of the number of tuples removed by btree pre-split VACUUMs also. I also like the idea of a single scheduler and multiple child workers. The basic architecture is clear and obviously beneficial. What worries me is how the scheduler will work; there seems to be as many ideas as we have hackers. I'm wondering if we should provide the facility of a pluggable scheduler? That way you'd be able to fine tune the schedule to both the application and to the business requirements. That would allow integration with external workflow engines and job schedulers, for when VACUUMs need to not-conflict with external events. If no scheduler has been defined, just use a fairly simple default. The three main questions are - what is the maximum size of VACUUM that can start *now* - can *this* VACUUM start now? - which is the next VACUUM to run? If we have an API that allows those 3 questions to be asked, then a scheduler plug-in could supply the answers. That way any complex application rules (table A is available for VACUUM now for next 60 mins, table B is in constant use so we must use vacuum_delay), external events (long running reports have now finished, OK to VACUUM), time-based rules (e.g. first Sunday of the month 00:00 - 04:00 is scheduled downtime, first 3 days of the each month is financial accounting close) can be specified. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com
On Friday 19 January 2007 01:47, Simon Riggs wrote: > On Tue, 2007-01-16 at 07:16 -0800, Darcy Buskermolen wrote: > > On Tuesday 16 January 2007 06:29, Alvaro Herrera wrote: > > > elein wrote: > > > > Have you made any consideration of providing feedback on autovacuum > > > > to users? Right now we don't even know what tables were vacuumed when > > > > and what was reaped. This might actually be another topic. > > > > > > I'd like to hear other people's opinions on Darcy Buskermolen proposal > > > to have a log table, on which we'd register what did we run, at what > > > time, how long did it last, how many tuples did it clean, etc. I feel > > > having it on the regular text log is useful but it's not good enough. > > > Keep in mind that in the future we may want to peek at that collected > > > information to be able to take better scheduling decisions (or at least > > > inform the DBA that he sucks). > > > > > > Now, I'd like this to be a VACUUM thing, not autovacuum. That means > > > that manually-run vacuums would be logged as well. > > > > Yes I did intend this thought for vacuum, not strictly autovacuum. > > I agree, for all VACUUMs: we need a log table. > > The only way we can get a feedback loop on what has come before is by > remembering what happened. Simply logging it is interesting, but not > enough. Correct, I think we are all saying the same thing that is this log table is purely inserts so that we can see trends over time. > > There is some complexity there, because with many applications a small > table gets VACUUMed every few minutes, so the log table would become a > frequently updated table itself. I'd also suggest that we might want to > take account of the number of tuples removed by btree pre-split VACUUMs > also. Thinking on this a bit more, I suppose that this table really should allow for user defined triggers on it, so that a DBA can create partioning for it, not to mention being able to move it off into it's own tablespace. > > I also like the idea of a single scheduler and multiple child workers. > > The basic architecture is clear and obviously beneficial. What worries > me is how the scheduler will work; there seems to be as many ideas as we > have hackers. I'm wondering if we should provide the facility of a > pluggable scheduler? That way you'd be able to fine tune the schedule to > both the application and to the business requirements. That would allow > integration with external workflow engines and job schedulers, for when > VACUUMs need to not-conflict with external events. > > If no scheduler has been defined, just use a fairly simple default. > > The three main questions are > - what is the maximum size of VACUUM that can start *now* How can we determine this given we have no real knowledge of the upcoming adverse IO conditions ? > - can *this* VACUUM start now? > - which is the next VACUUM to run? > > If we have an API that allows those 3 questions to be asked, then a > scheduler plug-in could supply the answers. That way any complex > application rules (table A is available for VACUUM now for next 60 mins, > table B is in constant use so we must use vacuum_delay), external events > (long running reports have now finished, OK to VACUUM), time-based rules > (e.g. first Sunday of the month 00:00 - 04:00 is scheduled downtime, > first 3 days of the each month is financial accounting close) can be > specified. Another thought, is it at all possible to do a partial vacuum? ie spend the next 30 minutes vacuuming foo table, and update the fsm with what hew have learned over the 30 mins, even if we have not done a full table scan ? -- Darcy Buskermolen The PostgreSQL company, Command Prompt Inc.
Added to TODO: > o Allow multiple vacuums so large tables do not starve small > tables > > http://archives.postgresql.org/pgsql-general/2007-01/msg00031.php > > o Improve control of auto-vacuum > > http://archives.postgresql.org/pgsql-hackers/2006-12/msg00876.php --------------------------------------------------------------------------- Darcy Buskermolen wrote: > On Friday 19 January 2007 01:47, Simon Riggs wrote: > > On Tue, 2007-01-16 at 07:16 -0800, Darcy Buskermolen wrote: > > > On Tuesday 16 January 2007 06:29, Alvaro Herrera wrote: > > > > elein wrote: > > > > > Have you made any consideration of providing feedback on autovacuum > > > > > to users? Right now we don't even know what tables were vacuumed when > > > > > and what was reaped. This might actually be another topic. > > > > > > > > I'd like to hear other people's opinions on Darcy Buskermolen proposal > > > > to have a log table, on which we'd register what did we run, at what > > > > time, how long did it last, how many tuples did it clean, etc. I feel > > > > having it on the regular text log is useful but it's not good enough. > > > > Keep in mind that in the future we may want to peek at that collected > > > > information to be able to take better scheduling decisions (or at least > > > > inform the DBA that he sucks). > > > > > > > > Now, I'd like this to be a VACUUM thing, not autovacuum. That means > > > > that manually-run vacuums would be logged as well. > > > > > > Yes I did intend this thought for vacuum, not strictly autovacuum. > > > > I agree, for all VACUUMs: we need a log table. > > > > The only way we can get a feedback loop on what has come before is by > > remembering what happened. Simply logging it is interesting, but not > > enough. > > Correct, I think we are all saying the same thing that is this log table is > purely inserts so that we can see trends over time. > > > > > There is some complexity there, because with many applications a small > > table gets VACUUMed every few minutes, so the log table would become a > > frequently updated table itself. I'd also suggest that we might want to > > take account of the number of tuples removed by btree pre-split VACUUMs > > also. > > Thinking on this a bit more, I suppose that this table really should allow for > user defined triggers on it, so that a DBA can create partioning for it, not > to mention being able to move it off into it's own tablespace. > > > > > > I also like the idea of a single scheduler and multiple child workers. > > > > The basic architecture is clear and obviously beneficial. What worries > > me is how the scheduler will work; there seems to be as many ideas as we > > have hackers. I'm wondering if we should provide the facility of a > > pluggable scheduler? That way you'd be able to fine tune the schedule to > > both the application and to the business requirements. That would allow > > integration with external workflow engines and job schedulers, for when > > VACUUMs need to not-conflict with external events. > > > > If no scheduler has been defined, just use a fairly simple default. > > > > The three main questions are > > - what is the maximum size of VACUUM that can start *now* > > How can we determine this given we have no real knowledge of the upcoming > adverse IO conditions ? > > > - can *this* VACUUM start now? > > - which is the next VACUUM to run? > > > > If we have an API that allows those 3 questions to be asked, then a > > scheduler plug-in could supply the answers. That way any complex > > application rules (table A is available for VACUUM now for next 60 mins, > > table B is in constant use so we must use vacuum_delay), external events > > (long running reports have now finished, OK to VACUUM), time-based rules > > (e.g. first Sunday of the month 00:00 - 04:00 is scheduled downtime, > > first 3 days of the each month is financial accounting close) can be > > specified. > > Another thought, is it at all possible to do a partial vacuum? ie spend the > next 30 minutes vacuuming foo table, and update the fsm with what hew have > learned over the 30 mins, even if we have not done a full table scan ? > > > -- > > > Darcy Buskermolen > The PostgreSQL company, Command Prompt Inc. > > ---------------------------(end of broadcast)--------------------------- > TIP 9: In versions below 8.0, the planner will ignore your desire to > choose an index scan if your joining column's datatypes do not > match -- Bruce Momjian bruce@momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Darcy Buskermolen wrote:
1. partial vacuum would mean that parts of the table are missed, the user could never vacuum certain parts and transaction wraparound would get you. You may also have other performance issues as you forgot certian parts of the table
2. Index cleanup is the most expensive part of vacuum. So doing a partial vacuum actually means more I/O as you have to do index cleanup more often.
If we are talking about autovacuum, 1 doesn't become so much of an issue as you just make the autovacuum remember what parts of the table it's vacuumed. This really has great power when you have a dead space map.
Item 2 will still be an issue. But if you define "partial" as either fill maintenance_work_mem, or finish the table, you are not increasing I/O at all. As when maintenance work mem is full, you have to cleanup all the indexes anyway. This is really more like VACUUM SINGLE, but the same principal applies.
I believe all planning really needs to think about how a dead space map will effect what vacuum is going to be doing in the future.
Strange idea that I haven't researched, Given Vacuum can't be run in a transaction, it is possible at a certain point to quit the current transaction and start another one. There has been much chat and now a TODO item about allowing multiple vacuums to not starve small tables. But if a big table has a long running vacuum the vacuum of the small table won't be effective anyway will it? If vacuum of a big table was done in multiple transactions you could reduce the effect of long running vacuum. I'm not sure how this effects the rest of the system thought.
Russell Smith
[snip]There was a proposal for this, but it was dropped on 2 grounds.Another thought, is it at all possible to do a partial vacuum? ie spend the next 30 minutes vacuuming foo table, and update the fsm with what hew have learned over the 30 mins, even if we have not done a full table scan ?
1. partial vacuum would mean that parts of the table are missed, the user could never vacuum certain parts and transaction wraparound would get you. You may also have other performance issues as you forgot certian parts of the table
2. Index cleanup is the most expensive part of vacuum. So doing a partial vacuum actually means more I/O as you have to do index cleanup more often.
If we are talking about autovacuum, 1 doesn't become so much of an issue as you just make the autovacuum remember what parts of the table it's vacuumed. This really has great power when you have a dead space map.
Item 2 will still be an issue. But if you define "partial" as either fill maintenance_work_mem, or finish the table, you are not increasing I/O at all. As when maintenance work mem is full, you have to cleanup all the indexes anyway. This is really more like VACUUM SINGLE, but the same principal applies.
I believe all planning really needs to think about how a dead space map will effect what vacuum is going to be doing in the future.
Strange idea that I haven't researched, Given Vacuum can't be run in a transaction, it is possible at a certain point to quit the current transaction and start another one. There has been much chat and now a TODO item about allowing multiple vacuums to not starve small tables. But if a big table has a long running vacuum the vacuum of the small table won't be effective anyway will it? If vacuum of a big table was done in multiple transactions you could reduce the effect of long running vacuum. I'm not sure how this effects the rest of the system thought.
Russell Smith
Russell Smith wrote: > Strange idea that I haven't researched, Given Vacuum can't be run in a > transaction, it is possible at a certain point to quit the current > transaction and start another one. There has been much chat and now a > TODO item about allowing multiple vacuums to not starve small tables. > But if a big table has a long running vacuum the vacuum of the small > table won't be effective anyway will it? If vacuum of a big table was > done in multiple transactions you could reduce the effect of long > running vacuum. I'm not sure how this effects the rest of the system > thought. That was fixed by Hannu Krosing's patch in 8.2 that made vacuum to ignore other vacuums in the oldest xmin calculation. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
On Sat, 2007-01-20 at 09:41 +1100, Russell Smith wrote: > Darcy Buskermolen wrote: > > [snip] > > > > Another thought, is it at all possible to do a partial vacuum? ie spend the > > next 30 minutes vacuuming foo table, and update the fsm with what hew have > > learned over the 30 mins, even if we have not done a full table scan ? > > > There was a proposal for this, but it was dropped on 2 grounds. > 1. partial vacuum would mean that parts of the table are missed, the > user could never vacuum certain parts and transaction wraparound would > get you. You may also have other performance issues as you forgot > certian parts of the table Partial vacuum would still be possible if you remembered where you got to in the VACUUM and then started from that same point next time. It could then go to the end of the table and wrap back around. > 2. Index cleanup is the most expensive part of vacuum. So doing a > partial vacuum actually means more I/O as you have to do index cleanup > more often. Again, not necessarily. A large VACUUM can currently perform more than one set of index scans, so if you chose the right stopping place for a partial VACUUM you need never incur any additional work. It might even save effort in the long run. I'm not necessarily advocating partial VACUUM, just pointing out that the problems you raise need not be barriers to implementation, should that be considered worthwhile. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com
On Sun, Jan 21, 2007 at 12:24:38PM +0000, Simon Riggs wrote: > Partial vacuum would still be possible if you remembered where you got > to in the VACUUM and then started from that same point next time. It > could then go to the end of the table and wrap back around. ISTM the Dead Space Map would give you this automatically, since that's your memory... Once you have the DSM to track where the dead pages are, you can set it up to target clusters first, thus giving maximum bang for buck. Have a nice day, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > From each according to his ability. To each according to his ability to litigate.
Attachment
On Sun, Jan 21, 2007 at 11:39:45AM +0000, Heikki Linnakangas wrote: > Russell Smith wrote: > >Strange idea that I haven't researched, Given Vacuum can't be run in a > >transaction, it is possible at a certain point to quit the current > >transaction and start another one. There has been much chat and now a > >TODO item about allowing multiple vacuums to not starve small tables. > >But if a big table has a long running vacuum the vacuum of the small > >table won't be effective anyway will it? If vacuum of a big table was > >done in multiple transactions you could reduce the effect of long > >running vacuum. I'm not sure how this effects the rest of the system > >thought. > > That was fixed by Hannu Krosing's patch in 8.2 that made vacuum to > ignore other vacuums in the oldest xmin calculation. And IIRC in 8.1 every time vacuum finishes a pass over the indexes it will commit and start a new transaction. That's still useful even with Hannu's patch in case you start a vacuum with maintenance_work_mem too small; you can abort the vacuum some time later and at least some of the work it's done will get committed. -- Jim Nasby jim@nasby.net EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)
On Sun, 2007-01-21 at 14:26 -0600, Jim C. Nasby wrote: > On Sun, Jan 21, 2007 at 11:39:45AM +0000, Heikki Linnakangas wrote: > > Russell Smith wrote: > > >Strange idea that I haven't researched, Given Vacuum can't be run in a > > >transaction, it is possible at a certain point to quit the current > > >transaction and start another one. There has been much chat and now a > > >TODO item about allowing multiple vacuums to not starve small tables. > > >But if a big table has a long running vacuum the vacuum of the small > > >table won't be effective anyway will it? If vacuum of a big table was > > >done in multiple transactions you could reduce the effect of long > > >running vacuum. I'm not sure how this effects the rest of the system > > >thought. > > > > That was fixed by Hannu Krosing's patch in 8.2 that made vacuum to > > ignore other vacuums in the oldest xmin calculation. > > And IIRC in 8.1 every time vacuum finishes a pass over the indexes it > will commit and start a new transaction. err...It doesn't do this now and IIRC didn't do that in 8.1 either. > That's still useful even with > Hannu's patch in case you start a vacuum with maintenance_work_mem too > small; you can abort the vacuum some time later and at least some of the > work it's done will get committed. True, but not recommended, though for a variety of reasons. The reason is not intermediate commits, but just that the work of VACUUM is mostly non-transactional in nature, apart from the various catalog entries when it completes. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com
Russell Smith wrote: > 2. Index cleanup is the most expensive part of vacuum. So doing a > partial vacuum actually means more I/O as you have to do index cleanup > more often. I don't think that's usually the case. Index(es) are typically only a fraction of the size of the table, and since 8.2 we do index vacuums in a single scan in physical order. In fact, in many applications the index is be mostly cached and the index scan doesn't generate any I/O at all. I believe the heap scans are the biggest issue at the moment. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Heikki Linnakangas wrote: > Russell Smith wrote: > > 2. Index cleanup is the most expensive part of vacuum. So doing a > > partial vacuum actually means more I/O as you have to do index cleanup > > more often. > > I don't think that's usually the case. Index(es) are typically only a > fraction of the size of the table, and since 8.2 we do index vacuums in > a single scan in physical order. In fact, in many applications the index > is be mostly cached and the index scan doesn't generate any I/O at all. Are _all_ the indexes cached? I would doubt that. Also, for typical table, what percentage is the size of all indexes combined? -- Bruce Momjian bruce@momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Bruce Momjian wrote: > Heikki Linnakangas wrote: >> Russell Smith wrote: >>> 2. Index cleanup is the most expensive part of vacuum. So doing a >>> partial vacuum actually means more I/O as you have to do index cleanup >>> more often. >> I don't think that's usually the case. Index(es) are typically only a >> fraction of the size of the table, and since 8.2 we do index vacuums in >> a single scan in physical order. In fact, in many applications the index >> is be mostly cached and the index scan doesn't generate any I/O at all. > > Are _all_ the indexes cached? I would doubt that. Well, depends on your schema, of course. In many applications, yes. > Also, for typical > table, what percentage is the size of all indexes combined? Well, there's no such thing as a typical table. As an anecdote here's the ratios (total size of all indexes of a table)/(size of corresponding heap) for the bigger tables for a DBT-2 run I have at hand: Stock: 1190470/68550 = 6% Order_line: 950103/274372 = 29% Customer: 629011 /(5711+20567) = 8% In any case, for the statement "Index cleanup is the most expensive part of vacuum" to be true, you're indexes would have to take up 2x as much space as the heap, since the heap is scanned twice. I'm sure there's databases like that out there, but I don't think it's the common case. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Heikki Linnakangas wrote: > Bruce Momjian wrote: > > Heikki Linnakangas wrote: > >> Russell Smith wrote: > >>> 2. Index cleanup is the most expensive part of vacuum. So doing a > >>> partial vacuum actually means more I/O as you have to do index cleanup > >>> more often. > >> I don't think that's usually the case. Index(es) are typically only a > >> fraction of the size of the table, and since 8.2 we do index vacuums in > >> a single scan in physical order. In fact, in many applications the index > >> is be mostly cached and the index scan doesn't generate any I/O at all. > > > > Are _all_ the indexes cached? I would doubt that. > > Well, depends on your schema, of course. In many applications, yes. > > > Also, for typical > > table, what percentage is the size of all indexes combined? > > Well, there's no such thing as a typical table. As an anecdote here's > the ratios (total size of all indexes of a table)/(size of corresponding > heap) for the bigger tables for a DBT-2 run I have at hand: > > Stock: 1190470/68550 = 6% > Order_line: 950103/274372 = 29% > Customer: 629011 /(5711+20567) = 8% > > In any case, for the statement "Index cleanup is the most expensive part > of vacuum" to be true, you're indexes would have to take up 2x as much > space as the heap, since the heap is scanned twice. I'm sure there's > databases like that out there, but I don't think it's the common case. I agree it index cleanup isn't > 50% of vacuum. I was trying to figure out how small, and it seems about 15% of the total table, which means if we have bitmap vacuum, we can conceivably reduce vacuum load by perhaps 80%, assuming 5% of the table is scanned. -- Bruce Momjian bruce@momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
On Mon, 2007-01-22 at 12:18 -0500, Bruce Momjian wrote: > Heikki Linnakangas wrote: > > > > In any case, for the statement "Index cleanup is the most expensive part > > of vacuum" to be true, you're indexes would have to take up 2x as much > > space as the heap, since the heap is scanned twice. I'm sure there's > > databases like that out there, but I don't think it's the common case. > > I agree it index cleanup isn't > 50% of vacuum. I was trying to figure > out how small, and it seems about 15% of the total table, which means if > we have bitmap vacuum, we can conceivably reduce vacuum load by perhaps > 80%, assuming 5% of the table is scanned. Clearly keeping track of what needs vacuuming will lead to a more efficient VACUUM. Your math applies to *any* design that uses some form of book-keeping to focus in on the hot spots. On a separate thread, Heikki has raised a different idea for VACUUM. Heikki's idea asks an important question: where and how should DSM information be maintained? Up to now everybody has assumed that it would be maintained when DML took place and that the DSM would be a transactional data structure (i.e. on-disk). Heikki's idea requires similar bookkeeping requirements to the original DSM concept, but the interesting aspect is that the DSM information is collected off-line, rather than being an overhead on every statement's response time. That idea seems extremely valuable to me. One of the main challenges is how we cope with large tables that have a very fine spray of updates against them. A DSM bitmap won't help with that situation, regrettably. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com
"Bruce Momjian" <bruce@momjian.us> writes: > I agree it index cleanup isn't > 50% of vacuum. I was trying to figure > out how small, and it seems about 15% of the total table, which means if > we have bitmap vacuum, we can conceivably reduce vacuum load by perhaps > 80%, assuming 5% of the table is scanned. Actually no. A while back I did experiments to see how fast reading a file sequentially was compared to reading the same file sequentially but skipping x% of the blocks randomly. The results were surprising (to me) and depressing. The breakeven point was about 7%. That is, if you assum that only 5% of the table will be scanned and you arrange to do it sequentially then you should expect the i/o to be marginally faster than just reading the entire table. Vacuum does do some cpu work and wouldn't have to consult the clog as often, so it would still be somewhat faster. The theory online was that as long as you're reading one page from each disk track you're going to pay the same seek overhead as reading the entire track. I also had some theories involving linux being confused by the seeks and turning off read-ahead but I could never prove them. In short, to see big benefits you would have to have a much smaller percentage of the table being read. That shouldn't be taken to mean that the DSM is a loser. There are plenty of use cases where tables can be extremely large and have only very small percentages that are busy. The big advantage of the DSM is that it takes the size of the table out of the equation and replaces it with the size of the busy portion of the table. So updating a single record in a terabyte table has the same costs as updating a single record in a kilobyte table. Sadly that's not quite true due to indexes, and due to the size of the bitmap itself. But going back to your numbers it does mean that if you update a single row out of a terabyte table then we'll be removing about 85% of the i/o (minus the i/o needed to read the DSM, about .025%). If you update about 1% then you would be removing substantially less, and once you get to about 10% then you're back where you started. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com
Yep, agreed on the random I/O issue. The larger question is if you have a huge table, do you care to reclaim 3% of the table size, rather than just vacuum it when it gets to 10% dirty? I realize the vacuum is going to take a lot of time, but vacuuming to relaim 3% three times seems like it is going to be more expensive than just vacuuming the 10% once. And vacuuming to reclaim 1% ten times seems even more expensive. The partial vacuum idea is starting to look like a loser to me again. --------------------------------------------------------------------------- Gregory Stark wrote: > "Bruce Momjian" <bruce@momjian.us> writes: > > > I agree it index cleanup isn't > 50% of vacuum. I was trying to figure > > out how small, and it seems about 15% of the total table, which means if > > we have bitmap vacuum, we can conceivably reduce vacuum load by perhaps > > 80%, assuming 5% of the table is scanned. > > Actually no. A while back I did experiments to see how fast reading a file > sequentially was compared to reading the same file sequentially but skipping > x% of the blocks randomly. The results were surprising (to me) and depressing. > The breakeven point was about 7%. > > That is, if you assum that only 5% of the table will be scanned and you > arrange to do it sequentially then you should expect the i/o to be marginally > faster than just reading the entire table. Vacuum does do some cpu work and > wouldn't have to consult the clog as often, so it would still be somewhat > faster. > > The theory online was that as long as you're reading one page from each disk > track you're going to pay the same seek overhead as reading the entire track. > I also had some theories involving linux being confused by the seeks and > turning off read-ahead but I could never prove them. > > In short, to see big benefits you would have to have a much smaller percentage > of the table being read. That shouldn't be taken to mean that the DSM is a > loser. There are plenty of use cases where tables can be extremely large and > have only very small percentages that are busy. The big advantage of the DSM > is that it takes the size of the table out of the equation and replaces it > with the size of the busy portion of the table. So updating a single record in > a terabyte table has the same costs as updating a single record in a kilobyte > table. > > Sadly that's not quite true due to indexes, and due to the size of the bitmap > itself. But going back to your numbers it does mean that if you update a > single row out of a terabyte table then we'll be removing about 85% of the i/o > (minus the i/o needed to read the DSM, about .025%). If you update about 1% > then you would be removing substantially less, and once you get to about 10% > then you're back where you started. > > > -- > Gregory Stark > EnterpriseDB http://www.enterprisedb.com > > ---------------------------(end of broadcast)--------------------------- > TIP 3: Have you checked our extensive FAQ? > > http://www.postgresql.org/docs/faq -- Bruce Momjian bruce@momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
"Bruce Momjian" <bruce@momjian.us> writes: > Yep, agreed on the random I/O issue. The larger question is if you have > a huge table, do you care to reclaim 3% of the table size, rather than > just vacuum it when it gets to 10% dirty? I realize the vacuum is going > to take a lot of time, but vacuuming to relaim 3% three times seems like > it is going to be more expensive than just vacuuming the 10% once. And > vacuuming to reclaim 1% ten times seems even more expensive. The > partial vacuum idea is starting to look like a loser to me again. Well the answer is of course "that depends". If you maintain the dead space at a steady state averaging 1.5% instead of 5% your table is 3.33% smaller on average. If this is a DSS system that will translate into running your queries 3.33% faster. It will take a lot of vacuums before they hurt more than a 3%+ performance drop. If it's an OLTP system the it's harder to figure. a 3.33% increase in data density will translate to a higher cache hit rate but how much higher depends on a lot of factors. In our experiments we actually got bigger boost in these kinds of situations than the I expected (I expected comparable to the 3.33% improvement). So it could be even more than 3.33%. But like said it depends. If you already have the whole database cache you won't see any improvement. If you are right on the cusp you could see a huge benefit. It sounds like you're underestimating the performance drain 10% wasted space has. If we found out that one routine was unnecessarily taking 10% of the cpu time it would be an obvious focus of attention. 10% wasted space is going to work out to about 10% of the i/o time. It also sounds like we're still focused on the performance impact in absolute terms. I'm much more interested in changing the performance characteristics so they're predictable and scalable. It doesn't matter much if your 1kb table is 100% slower than necessary but it does matter if your 1TB table needs 1,000x as much vacuuming as your 1GB table even if it's getting the same update traffic. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com
On Mon, 2007-01-22 at 13:27 -0500, Bruce Momjian wrote: > Yep, agreed on the random I/O issue. The larger question is if you have > a huge table, do you care to reclaim 3% of the table size, rather than > just vacuum it when it gets to 10% dirty? I realize the vacuum is going > to take a lot of time, but vacuuming to relaim 3% three times seems like > it is going to be more expensive than just vacuuming the 10% once. And > vacuuming to reclaim 1% ten times seems even more expensive. The > partial vacuum idea is starting to look like a loser to me again. Hold that thought! Read Heikki's Piggyback VACUUM idea on new thread... -- Simon Riggs EnterpriseDB http://www.enterprisedb.com
Bruce Momjian wrote: > > Yep, agreed on the random I/O issue. The larger question is if you have > a huge table, do you care to reclaim 3% of the table size, rather than > just vacuum it when it gets to 10% dirty? I realize the vacuum is going > to take a lot of time, but vacuuming to relaim 3% three times seems like > it is going to be more expensive than just vacuuming the 10% once. And > vacuuming to reclaim 1% ten times seems even more expensive. The > partial vacuum idea is starting to look like a loser to me again. But if the partial vacuum is able to clean the busiest pages and reclaim useful space, currently-running transactions will be able to use that space and thus not have to extend the table. Not that extension is a problem on itself, but it'll keep your working set smaller. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Bruce Momjian wrote: > Yep, agreed on the random I/O issue. The larger question is if you have > a huge table, do you care to reclaim 3% of the table size, rather than > just vacuum it when it gets to 10% dirty? I realize the vacuum is going > to take a lot of time, but vacuuming to relaim 3% three times seems like > it is going to be more expensive than just vacuuming the 10% once. And > vacuuming to reclaim 1% ten times seems even more expensive. The > partial vacuum idea is starting to look like a loser to me again. Buying a house with a 25-year mortgage is much more expensive than just paying cash too, but you don't always have a choice. Surely the key benefit of the partial vacuuming thing is that you can at least do something useful with a large table if a full vacuum takes 24 hours and you only have 4 hours of idle I/O. It's also occurred to me that all the discussion of scheduling way back when isn't directly addressing the issue. What most people want (I'm guessing) is to vacuum *when the user-workload allows* and the time-tabling is just a sysadmin first-approximation at that. With partial vacuuming possible, we can arrange things with just three thresholds and two measurements: Measurement 1 = system workload Measurement 2 = a per-table "requires vacuuming" value Threshold 1 = workload at which we do more vacuuming Threshold 2 = workload at which we do less vacuuming Threshold 3 = point at which a table is considered worth vacuuming. Once every 10 seconds, the manager compares the current workload to the thresholds and starts a new vacuum, kills one or does nothing. New vacuum processes keep getting started as long as there is workload spare and tables that need vacuuming. Now the trick of course is how you measure system workload in a meaningful manner. -- Richard Huxton Archonet Ltd
Gregory Stark wrote: > "Bruce Momjian" <bruce@momjian.us> writes: > >> I agree it index cleanup isn't > 50% of vacuum. I was trying to figure >> out how small, and it seems about 15% of the total table, which means if >> we have bitmap vacuum, we can conceivably reduce vacuum load by perhaps >> 80%, assuming 5% of the table is scanned. > > Actually no. A while back I did experiments to see how fast reading a file > sequentially was compared to reading the same file sequentially but skipping > x% of the blocks randomly. The results were surprising (to me) and depressing. > The breakeven point was about 7%. Note that with uniformly random updates, you have dirtied every page of the table until you get anywhere near 5% of dead space. So we have to assume non-uniform distribution of update for the DSM to be of any help. And if we assume non-uniform distribution, it's a good bet that the blocks that need vacuuming are also not randomly distributed. In fact, they might very well all be in one cluster, so that scanning that cluster is indeed sequential I/O. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Kenneth Marshall wrote: > On Mon, Jan 22, 2007 at 06:42:09PM +0000, Simon Riggs wrote: >> Hold that thought! Read Heikki's Piggyback VACUUM idea on new thread... > > There may be other functions that could leverage a similar sort of > infrastructure. For example, a long DB mining query could be registered > with the system. Then as the pieces of the table/database are brought in > to shared memory during the normal daily DB activity they can be acquired > without forcing the DB to run a very I/O expensive query when waiting a > bit for the results would be acceptable. As long as we are thinking > piggyback. Yeah, I had the same idea when we discussed synchronizing sequential scans. The biggest difference is that with queries, there's often a user waiting for the query to finish, but with vacuum we don't care so much how long it takes. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Alvaro Herrera wrote: > Bruce Momjian wrote: > > > > Yep, agreed on the random I/O issue. The larger question is if you have > > a huge table, do you care to reclaim 3% of the table size, rather than > > just vacuum it when it gets to 10% dirty? I realize the vacuum is going > > to take a lot of time, but vacuuming to relaim 3% three times seems like > > it is going to be more expensive than just vacuuming the 10% once. And > > vacuuming to reclaim 1% ten times seems even more expensive. The > > partial vacuum idea is starting to look like a loser to me again. > > But if the partial vacuum is able to clean the busiest pages and reclaim > useful space, currently-running transactions will be able to use that > space and thus not have to extend the table. Not that extension is a > problem on itself, but it'll keep your working set smaller. Yes, but my point is that if you are trying to avoid vacuuming the table, I am afraid the full index scan is going to be painful too. I can see corner cases where partial vacuum is a win (I only have 4 hours of idle I/O), but for the general case I am still worried that partial vacuum will not be that useful as long as we have to scan the indexes. -- Bruce Momjian bruce@momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
>-----Original Message----- >From: pgsql-general-owner@postgresql.org >[mailto:pgsql-general-owner@postgresql.org] On Behalf Of Gregory Stark >Sent: maandag 22 januari 2007 19:41 >To: Bruce Momjian >Cc: Heikki Linnakangas; Russell Smith; Darcy Buskermolen; >Simon Riggs; Alvaro Herrera; Matthew T. O'Connor; Pavan >Deolasee; Christopher Browne; pgsql-general@postgresql.org; >pgsql-hackers@postgresql.org >Subject: Re: [HACKERS] [GENERAL] Autovacuum Improvements > >"Bruce Momjian" <bruce@momjian.us> writes: > >> Yep, agreed on the random I/O issue. The larger question is if you >> have a huge table, do you care to reclaim 3% of the table >size, rather >> than just vacuum it when it gets to 10% dirty? I realize the vacuum >> is going to take a lot of time, but vacuuming to relaim 3% >three times >> seems like it is going to be more expensive than just vacuuming the >> 10% once. And vacuuming to reclaim 1% ten times seems even more >> expensive. The partial vacuum idea is starting to look like >a loser to me again. > >Well the answer is of course "that depends". > >If you maintain the dead space at a steady state averaging >1.5% instead of 5% your table is 3.33% smaller on average. If >this is a DSS system that will translate into running your >queries 3.33% faster. It will take a lot of vacuums before >they hurt more than a 3%+ performance drop. Good, this means a DSS system will mostly do table scans (right?). So probably you should witness the 'table scan' statistic and rows fetched aproaching the end of the universe (at least compared to inserts/updates/deletes)? >If it's an OLTP system the it's harder to figure. a 3.33% >increase in data density will translate to a higher cache hit >rate but how much higher depends on a lot of factors. In our >experiments we actually got bigger boost in these kinds of >situations than the I expected (I expected comparable to the >3.33% improvement). So it could be even more than 3.33%. But >like said it depends. >If you already have the whole database cache you won't see any >improvement. If you are right on the cusp you could see a huge benefit. These tables have high insert, update and delete rates, probably a lot of index scans? I believe the workload on table scans should be (close to) none. Are you willing to share some of this measured data? I'm quite interested in such figures. >It sounds like you're underestimating the performance drain >10% wasted space has. If we found out that one routine was >unnecessarily taking 10% of the cpu time it would be an >obvious focus of attention. 10% wasted space is going to work >out to about 10% of the i/o time. > >It also sounds like we're still focused on the performance >impact in absolute terms. I'm much more interested in changing >the performance characteristics so they're predictable and >scalable. It doesn't matter much if your 1kb table is 100% >slower than necessary but it does matter if your 1TB table >needs 1,000x as much vacuuming as your 1GB table even if it's >getting the same update traffic. Or rather, the vacuuming should pay back. A nice metric might be: cost_of_not_vacuuming / cost_of_vacuuming. Obviously, the higher the better. - Joris Dobbelsteen
On Mon, Jan 22, 2007 at 06:42:09PM +0000, Simon Riggs wrote: > On Mon, 2007-01-22 at 13:27 -0500, Bruce Momjian wrote: > > Yep, agreed on the random I/O issue. The larger question is if you have > > a huge table, do you care to reclaim 3% of the table size, rather than > > just vacuum it when it gets to 10% dirty? I realize the vacuum is going > > to take a lot of time, but vacuuming to relaim 3% three times seems like > > it is going to be more expensive than just vacuuming the 10% once. And > > vacuuming to reclaim 1% ten times seems even more expensive. The > > partial vacuum idea is starting to look like a loser to me again. > > Hold that thought! Read Heikki's Piggyback VACUUM idea on new thread... > > -- > Simon Riggs > EnterpriseDB http://www.enterprisedb.com > There may be other functions that could leverage a similar sort of infrastructure. For example, a long DB mining query could be registered with the system. Then as the pieces of the table/database are brought in to shared memory during the normal daily DB activity they can be acquired without forcing the DB to run a very I/O expensive query when waiting a bit for the results would be acceptable. As long as we are thinking piggyback. Ken
On Mon, Jan 22, 2007 at 07:24:20PM +0000, Heikki Linnakangas wrote: > Kenneth Marshall wrote: > >On Mon, Jan 22, 2007 at 06:42:09PM +0000, Simon Riggs wrote: > >>Hold that thought! Read Heikki's Piggyback VACUUM idea on new thread... > > > >There may be other functions that could leverage a similar sort of > >infrastructure. For example, a long DB mining query could be registered > >with the system. Then as the pieces of the table/database are brought in > >to shared memory during the normal daily DB activity they can be acquired > >without forcing the DB to run a very I/O expensive query when waiting a > >bit for the results would be acceptable. As long as we are thinking > >piggyback. > > Yeah, I had the same idea when we discussed synchronizing sequential > scans. The biggest difference is that with queries, there's often a user > waiting for the query to finish, but with vacuum we don't care so much > how long it takes. > Yes, but with trending and statistical analysis you may not need the exact answer ASAP. An approximate answer based on a fraction of the information would be useful. Also, "what if" queries could be run without impacting the production uses of a database. One might imagine having a query with results that "converge" as the table is processed during normal use. Ken
On Jan 22, 2007, at 11:16 AM, Richard Huxton wrote: > Bruce Momjian wrote: >> Yep, agreed on the random I/O issue. The larger question is if >> you have >> a huge table, do you care to reclaim 3% of the table size, rather >> than >> just vacuum it when it gets to 10% dirty? I realize the vacuum is >> going >> to take a lot of time, but vacuuming to relaim 3% three times >> seems like >> it is going to be more expensive than just vacuuming the 10% >> once. And >> vacuuming to reclaim 1% ten times seems even more expensive. The >> partial vacuum idea is starting to look like a loser to me again. > > Buying a house with a 25-year mortgage is much more expensive than > just paying cash too, but you don't always have a choice. > > Surely the key benefit of the partial vacuuming thing is that you > can at least do something useful with a large table if a full > vacuum takes 24 hours and you only have 4 hours of idle I/O. > > It's also occurred to me that all the discussion of scheduling way > back when isn't directly addressing the issue. What most people > want (I'm guessing) is to vacuum *when the user-workload allows* > and the time-tabling is just a sysadmin first-approximation at that. Yup. I'd really like for my app to be able to say "Hmm. No interactive users at the moment, no critical background tasks. Now would be a really good time for the DB to do some maintenance." but also to be able to interrupt the maintenance process if some new users or other system load show up. > With partial vacuuming possible, we can arrange things with just > three thresholds and two measurements: > Measurement 1 = system workload > Measurement 2 = a per-table "requires vacuuming" value > Threshold 1 = workload at which we do more vacuuming > Threshold 2 = workload at which we do less vacuuming > Threshold 3 = point at which a table is considered worth vacuuming. > Once every 10 seconds, the manager compares the current workload to > the thresholds and starts a new vacuum, kills one or does nothing. > New vacuum processes keep getting started as long as there is > workload spare and tables that need vacuuming. > > Now the trick of course is how you measure system workload in a > meaningful manner. I'd settle for a "start maintenance", "stop maintenance" API. Anything else (for instance the heuristics you suggest above) would definitely be gravy. It's not going to be simple to do, though, I don't think. Cheers, Steve
On Mon, Jan 22, 2007 at 05:51:53PM +0000, Gregory Stark wrote: > Actually no. A while back I did experiments to see how fast reading a file > sequentially was compared to reading the same file sequentially but skipping > x% of the blocks randomly. The results were surprising (to me) and depressing. > The breakeven point was about 7%. I asusume this means you were reading 7% of the blocks, not skipping 7% of the blocks when you broke even? I presume by break-even you mean it took just as long, time-wise. But did it have the same effect on system load? If reading only 7% of the blocks allows the drive to complete other requests more quickly then it's beneficial, even if the vacuum takes longer. This may be a silly thought, I'm not sure how drives handle multiple requests... Have a nice day, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > From each according to his ability. To each according to his ability to litigate.