Thread: new GUC var: autovacuum_process_all_tables
Hi, Right now, when autovacuum is turned on we always assume it's supposed to process all tables except those that have autovacuum_enabled=false. Now, sometimes it might make more sense to keep it enabled but have it only check for certain tables, and leave the majority of them disabled. For this we'd have a separate GUC parameter, as in $SUBJECT (I'm not wedded to the name), and have the user set autovacuum_enabled=true via reloptions to enable it. Opinions? -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
On Thu, 2009-02-05 at 17:45 -0300, Alvaro Herrera wrote: > Hi, > > Right now, when autovacuum is turned on we always assume it's supposed > to process all tables except those that have autovacuum_enabled=false. > > Now, sometimes it might make more sense to keep it enabled but have it > only check for certain tables, and leave the majority of them disabled. > For this we'd have a separate GUC parameter, as in $SUBJECT (I'm not > wedded to the name), and have the user set autovacuum_enabled=true via > reloptions to enable it. > > Opinions? > So you are inverting the option? What I mean is you are giving the option of either: A. Process everything unless false B. Process nothing unless true If I am understanding what you wrote correctly I am not sure I like the idea as a whole. I think we should just always have it on and not have it be optional. The rule of thumb should be, we autovacuum everything, unless there is a extremely good reason not to and I think you should have to explicitly turn off autovacuum for a relation. Sincerely, Joshua D. Drake > -- > Alvaro Herrera http://www.CommandPrompt.com/ > PostgreSQL Replication, Consulting, Custom Development, 24x7 support > -- PostgreSQL - XMPP: jdrake@jabber.postgresql.org Consulting, Development, Support, Training 503-667-4564 - http://www.commandprompt.com/ The PostgreSQL Company, serving since 1997
On Thu, 2009-02-05 at 17:45 -0300, Alvaro Herrera wrote: > Right now, when autovacuum is turned on we always assume it's supposed > to process all tables except those that have autovacuum_enabled=false. > > Now, sometimes it might make more sense to keep it enabled but have it > only check for certain tables, and leave the majority of them disabled. > For this we'd have a separate GUC parameter, as in $SUBJECT (I'm not > wedded to the name), and have the user set autovacuum_enabled=true via > reloptions to enable it. I would prefer it if that behaviour was enabled by putting a special entry into pg_autovacuum, e.g. "ALL TABLES", autovacuum_enabled=false I don't really want more GUCs for every nuance of AV behaviour. If you do this we'd want it to be selectable by database and schema as well. Perhaps by inserting the oid of the relevant database or schema? e.g. if we want to turn off AV for database X, which has oid x then we insert into pg_autovacuum(x, false, ....) or to make the default no-autovacuum for all tables in all databases insert into pg_autovacuum(0, false, ....) It would be useful if all but the first two columns were nullable also, to avoid having to specify -1. -- Simon Riggs www.2ndQuadrant.comPostgreSQL Training, Services and Support
Alvaro Herrera <alvherre@commandprompt.com> writes: > Right now, when autovacuum is turned on we always assume it's supposed > to process all tables except those that have autovacuum_enabled=false. > Now, sometimes it might make more sense to keep it enabled but have it > only check for certain tables, and leave the majority of them disabled. When would that be? I can follow the use-case for vacuuming a selected set of tables via cron-driven commands or whatever, and then excluding those tables from autovacuum's purview. But there isn't a command to "vacuum all tables except these". Without such a command available to the cron-job, a switch such as you suggest is merely a foot-gun, because it's dead certain some tables are going to get left out of both manual and autovacuum processing. And before anyone suggests it, I don't want to invent "vacuum all tables except these". It'd make more sense to put the effort into developing better scheduling control over autovacuum, such as a concept of maintenance windows. (BTW, autovac does vacuum tables to prevent wraparound even if you try to tell it to skip them, right?) regards, tom lane
On Thu, Feb 5, 2009 at 3:45 PM, Alvaro Herrera <alvherre@commandprompt.com> wrote: > Right now, when autovacuum is turned on we always assume it's supposed > to process all tables except those that have autovacuum_enabled=false. > > Now, sometimes it might make more sense to keep it enabled but have it > only check for certain tables, and leave the majority of them disabled. > For this we'd have a separate GUC parameter, as in $SUBJECT (I'm not > wedded to the name), and have the user set autovacuum_enabled=true via > reloptions to enable it. > > Opinions? Sounds horribly confusing. It's not very difficult to write a script to set this value for every table in the database, if that's what you want to do. Having autovacuum_enabled potentially mean two different things depending on the value of some GUC sounds like a recipe for confusion (and no I don't like it any better if we put the global switch somewhere other than a GUC). ...Robert
Simon Riggs wrote: > > On Thu, 2009-02-05 at 17:45 -0300, Alvaro Herrera wrote: > > > For this we'd have a separate GUC parameter, as in $SUBJECT (I'm not > > wedded to the name), and have the user set autovacuum_enabled=true via > > reloptions to enable it. > > I would prefer it if that behaviour was enabled by putting a special > entry into pg_autovacuum, So you're not aware that we're doing away with pg_autovacuum for good? It's going to be replaced by reloptions, i.e. ALTER TABLE foo SET (autovacuum_enabled = false); Obviously there's no way to add a "catchall" setting. > e.g. > "ALL TABLES", autovacuum_enabled=false > > I don't really want more GUCs for every nuance of AV behaviour. In any case I fail to see how is this much different from a new GUC var. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
On Thu, 2009-02-05 at 16:29 -0500, Tom Lane wrote: > It'd make more sense to put the effort into developing > better scheduling control over autovacuum, such as a concept of > maintenance windows. We need that as well, not instead of. People want to be able to specify (as an example) * autovac these problem tables anytime required * for all other tables disable AV, except on sundays [non-busy-times] If we're going to build in scheduling featured for AV, I'd like to make those scheduling features available to user defined tasks also. No need to limit ourselves to just AV. -- Simon Riggs www.2ndQuadrant.comPostgreSQL Training, Services and Support
On Thu, 2009-02-05 at 18:25 -0300, Alvaro Herrera wrote: > So you're not aware that we're doing away with pg_autovacuum for good? > It's going to be replaced by reloptions, i.e. > ALTER TABLE foo SET (autovacuum_enabled = false); > > Obviously there's no way to add a "catchall" setting. Seems like a bad plan then. How do you reconcile those conflicting requirements? > > e.g. > > "ALL TABLES", autovacuum_enabled=false > > > > I don't really want more GUCs for every nuance of AV behaviour. > > In any case I fail to see how is this much different from a new GUC var. Rows in a table v. new parameters. We can allow endless table driven complexity. Adding my_little_nuance=on|off strains most people's patience. How would I specify that database A wants AV turned off, but database B wants it on? -- Simon Riggs www.2ndQuadrant.comPostgreSQL Training, Services and Support
Alvaro Herrera escreveu: > Hi, > > Right now, when autovacuum is turned on we always assume it's supposed > to process all tables except those that have autovacuum_enabled=false. > > Now, sometimes it might make more sense to keep it enabled but have it > only check for certain tables, and leave the majority of them disabled. > For this we'd have a separate GUC parameter, as in $SUBJECT (I'm not > wedded to the name), and have the user set autovacuum_enabled=true via > reloptions to enable it. > > Opinions? > What about 'autovacuum_mode'? Values could be 'all' and 'reloption'. If we don't want to add another GUC, I'll go changing the 'autovacuum' GUC. Values would be: 'on' means enable autovacuum and process all tables, 'off' means disable autovacuum and 'reloption' (?) means only process those tables that have reloption autovacuum_enabled=true. The con is that we couldn't implement a per-{schema,database} switch for autovacuum. So I prefer the first one. -- Euler Taveira de Oliveira http://www.timbira.com/
Tom Lane wrote: > (BTW, autovac does vacuum tables to prevent wraparound even if you try > to tell it to skip them, right?) Yes. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Simon Riggs wrote: > > On Thu, 2009-02-05 at 18:25 -0300, Alvaro Herrera wrote: > > > So you're not aware that we're doing away with pg_autovacuum for good? > > It's going to be replaced by reloptions, i.e. > > ALTER TABLE foo SET (autovacuum_enabled = false); > > > > Obviously there's no way to add a "catchall" setting. > > Seems like a bad plan then. How do you reconcile those conflicting > requirements? I don't see them as conflicting; I see yours as a missing feature, namely the ability to add tables to an autovacuum "group", which could have settings attached. Being able to do that is the whole point of moving settings to reloptions. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Simon Riggs <simon@2ndQuadrant.com> writes: > On Thu, 2009-02-05 at 16:29 -0500, Tom Lane wrote: >> It'd make more sense to put the effort into developing >> better scheduling control over autovacuum, such as a concept of >> maintenance windows. > We need that as well, not instead of. I disagree; adding every frammish anyone could ever think of is not an overall improvement to the system. My feeling is that we should be trying to eliminate use-cases for cron-driven vacuuming, not trying to make sure that cron-driven scripts can do anything autovacuum can. The main remaining use-case seems to me to make vacuuming work adhere to some business-determined schedule, hence maintenance windows seem like the next thing to do. regards, tom lane
On Thu, 2009-02-05 at 18:54 -0300, Alvaro Herrera wrote: > Simon Riggs wrote: > > > > On Thu, 2009-02-05 at 18:25 -0300, Alvaro Herrera wrote: > > > > > So you're not aware that we're doing away with pg_autovacuum for good? > > > It's going to be replaced by reloptions, i.e. > > > ALTER TABLE foo SET (autovacuum_enabled = false); > > > > > > Obviously there's no way to add a "catchall" setting. > > > > Seems like a bad plan then. How do you reconcile those conflicting > > requirements? > > I don't see them as conflicting; I see yours as a missing feature, > namely the ability to add tables to an autovacuum "group", which could > have settings attached. Being able to do that is the whole point of > moving settings to reloptions. So your changes will allow these? ALTER DATABASE foo SET (autovacuum_enabled = false); ALTER SCHEMA foo SET (autovacuum_enabled = false); CREATE TABLE GROUP foo_group; ALTER TABLE foo SET TABLE GROUP foo_group; ALTER TABLE foo2 SET TABLE GROUP foo_group; ALTER TABLE GROUP SET (autovacuum_enabled = false); Hopefully the grouping of tables is not purely related to AV? -- Simon Riggs www.2ndQuadrant.comPostgreSQL Training, Services and Support
On Thu, 2009-02-05 at 17:08 -0500, Tom Lane wrote: > I disagree; adding every frammish anyone could ever think of is not > an overall improvement to the system. > :) > My feeling is that we should be trying to eliminate use-cases for > cron-driven vacuuming, not trying to make sure that cron-driven > scripts can do anything autovacuum can. Agreed. IMO, the user should only have to think about "vacuum" in an abstract sense. With the exception being those "few" tables that need the customized configuration (thus reloptions). > > The main remaining use-case seems to me to make vacuuming work adhere > to some business-determined schedule, hence maintenance windows seem > like the next thing to do. > Also agreed. Sincerely, Joshua D. Drake > regards, tom lane > -- PostgreSQL - XMPP: jdrake@jabber.postgresql.org Consulting, Development, Support, Training 503-667-4564 - http://www.commandprompt.com/ The PostgreSQL Company, serving since 1997
Tom Lane wrote: > Simon Riggs <simon@2ndQuadrant.com> writes: > > On Thu, 2009-02-05 at 16:29 -0500, Tom Lane wrote: > >> It'd make more sense to put the effort into developing > >> better scheduling control over autovacuum, such as a concept of > >> maintenance windows. > > > We need that as well, not instead of. > > I disagree; adding every frammish anyone could ever think of is not > an overall improvement to the system. Agreed, let's get this capability out in 8.4 and we can always adust it based on user demand. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
On Thu, 2009-02-05 at 17:57 -0500, Bruce Momjian wrote: > Tom Lane wrote: > > Simon Riggs <simon@2ndQuadrant.com> writes: > > > On Thu, 2009-02-05 at 16:29 -0500, Tom Lane wrote: > > >> It'd make more sense to put the effort into developing > > >> better scheduling control over autovacuum, such as a concept of > > >> maintenance windows. > > > > > We need that as well, not instead of. > > > > I disagree; adding every frammish anyone could ever think of is not > > an overall improvement to the system. > > Agreed, let's get this capability out in 8.4 and we can always adust it > based on user demand. Oh, I agree to limiting what we do for 8,4, but we need more later. -- Simon Riggs www.2ndQuadrant.comPostgreSQL Training, Services and Support
On Thu, 2009-02-05 at 17:08 -0500, Tom Lane wrote: > Simon Riggs <simon@2ndQuadrant.com> writes: > > On Thu, 2009-02-05 at 16:29 -0500, Tom Lane wrote: > >> It'd make more sense to put the effort into developing > >> better scheduling control over autovacuum, such as a concept of > >> maintenance windows. > > > We need that as well, not instead of. > > I disagree; adding every frammish anyone could ever think of is not > an overall improvement to the system. I like your word frammish and am watchful of such things myself. > My feeling is that we should be trying to eliminate use-cases for > cron-driven vacuuming, Agreed. > not trying to make sure that cron-driven > scripts can do anything autovacuum can. I'm not in favour of limiting our capability to internal actions only. If we add a capability for scheduling work, we can easily make it capable of scheduling many kinds of work. Writing an application maintenance utility in PL/pgSQL is much better than having to write it for all the different servers an application may need to run on. We can't ignore that many people use Windows. -- Simon Riggs www.2ndQuadrant.comPostgreSQL Training, Services and Support
On Thu, Feb 5, 2009 at 11:57 PM, Simon Riggs <simon@2ndquadrant.com> wrote: > > Writing an application maintenance utility in PL/pgSQL is much better > than having to write it for all the different servers an application may > need to run on. Welcome to the suction effect. If your scheduler is in the database then you're stuck with the interfaces the database provides. When you use those interfaces you're going to be stuck with whatever tools work with them. Imagine trying to compose MIME email in plpgsql or do dns lookups or interface with your C application code. Plpgsql is singularly unsuited for anything other than database work. Yes we have other languages but there are still relatively few and having them running within a PL interface makes integrating with the rest of their systems more awkward. And more dangerous -- consider what a simple memory management bug can do if it's in a database backend instead of a network client. > We can't ignore that many people use Windows. I think that logic is backwards. People choose their development and server environment based on what works best for them. It makes no sense to engineer the system around the assumption that they don't like developing using the best native tools. Our reimplementation of the OS is always going to be second-rate by comparison and it's doing nothing for them but imposing the disadvantages of the restrictions being stuck in a database backend brings. -- greg
>> Agreed, let's get this capability out in 8.4 and we can always adust it >> based on user demand. > > Oh, I agree to limiting what we do for 8,4, but we need more later. Thinking about this a little more, the biggest problem I have with this feature is that it makes autovacuum_enabled mean two different things depending on context. But maybe we should change the name of the reloption to "autovacuum" and have three values for it: default|enabled|disabled. Then we could add a GUC called autovacuum_by_default = on|off, and we could later insert per-schema or per-database or per-table-group defaults without introducing any backwards-incompatibility (default would still mean default, though the default might come from a different source). ...Robert
Simon Riggs wrote: >> not trying to make sure that cron-driven >> scripts can do anything autovacuum can. >> > > I'm not in favour of limiting our capability to internal actions only. > If we add a capability for scheduling work, we can easily make it > capable of scheduling many kinds of work. > > Writing an application maintenance utility in PL/pgSQL is much better > than having to write it for all the different servers an application may > need to run on. We can't ignore that many people use Windows. > > I'm not sure what you're saying here. Windows has a scheduler (in my setup, that's how my buildfarm members run). And there are third party cron utilities as well. cheers andrew
Alvaro, First off, with over 200 GUC variables currently active, in general we should be looking to *eliminate* variables rather than adding them. So my personal bar for endorsing a new GUC is set pretty high. > Now, sometimes it might make more sense to keep it enabled but have it > only check for certain tables, and leave the majority of them disabled. > For this we'd have a separate GUC parameter, as in $SUBJECT (I'm not > wedded to the name), and have the user set autovacuum_enabled=true via > reloptions to enable it. I can't imagine, nor have I encountered in the 3 years of consulting I did since Autovaccum became available, such a use case. Unless there's a real, critical use case for this which is common, I'm opposed to this GUC. On the other hand, I'd been keen on a runtime suset autovaccum=on/off which we could call from a cron job or the pgadmin scheduler in order to have maintenance windows. Unless that's already becoming possible? --Josh
On Thu, Feb 5, 2009 at 7:13 PM, Robert Haas <robertmhaas@gmail.com> wrote: > Thinking about this a little more, the biggest problem I have with > this feature is that it makes autovacuum_enabled mean two different > things depending on context. But maybe we should change the name of > the reloption to "autovacuum" and have three values for it: > default|enabled|disabled. > > Then we could add a GUC called autovacuum_by_default = on|off, and we > could later insert per-schema or per-database or per-table-group > defaults without introducing any backwards-incompatibility (default > would still mean default, though the default might come from a > different source). In fact (he said to himself), we could take this a step further and call both the reloption and GUC "autovacuum_policy". Then we could have two policies for this release ("always" and "never") plus allow "default" for the reloption. Then future releases could allow users to define additional policies, like "off-hours". Just thinking out loud here folks... ...Robert
On Thu, 2009-02-05 at 19:23 -0500, Andrew Dunstan wrote: > > Simon Riggs wrote: > >> not trying to make sure that cron-driven > >> scripts can do anything autovacuum can. > >> > > > > I'm not in favour of limiting our capability to internal actions only. > > If we add a capability for scheduling work, we can easily make it > > capable of scheduling many kinds of work. > > > > Writing an application maintenance utility in PL/pgSQL is much better > > than having to write it for all the different servers an application may > > need to run on. We can't ignore that many people use Windows. > > > > > > I'm not sure what you're saying here. Windows has a scheduler (in my > setup, that's how my buildfarm members run). And there are third party > cron utilities as well. All I'm saying is *if* we put scheduling inside Postgres for autovacuum *then* we should make it general purpose scheduling. If anybody uses the argument that "we have external schedulers, so don't put them in the database" then that argument applies equally to scheduling autovacuum. It's easy to turn autovacuum on/off via an external scheduler, yet look upthread and see how many people think it should be in the database. Whichever way you think the decision should go, the same arguments apply to scheduling autovacuum and scheduling other database maintenance tasks. -- Simon Riggs www.2ndQuadrant.comPostgreSQL Training, Services and Support
On Fri, 2009-02-06 at 00:07 +0000, Greg Stark wrote: > On Thu, Feb 5, 2009 at 11:57 PM, Simon Riggs <simon@2ndquadrant.com> wrote: > > > > Writing an application maintenance utility in PL/pgSQL is much better > > than having to write it for all the different servers an application may > > need to run on. > > Welcome to the suction effect. If your scheduler is in the database > then you're stuck with the interfaces the database provides. When you > use those interfaces you're going to be stuck with whatever tools work > with them. Imagine trying to compose MIME email in plpgsql or do dns > lookups or interface with your C application code. Plpgsql is > singularly unsuited for anything other than database work. Yes we have > other languages but there are still relatively few and having them > running within a PL interface makes integrating with the rest of their > systems more awkward. And more dangerous -- consider what a simple > memory management bug can do if it's in a database backend instead of > a network client. You're saying that because it would be wrong for some things, we're not going to allow it at all. Re-read what you've written and you'll see that same argument would ban all PLs, on the basis that we have external language environments. I don't suppose you believe that. There are many in-database-only actions that people want to schedule, not just autovacuum. End of month data deletion, partition setup, pre-joined or summary table creation, FY rollover calculations etc etc. None of those have anything to do with the world outside the DB. If writing database functions is possible and sometimes desirable in Postgres, then scheduling them is also sometimes desirable also. I would hope that we can give our users the choice. -- Simon Riggs www.2ndQuadrant.comPostgreSQL Training, Services and Support
Simon Riggs wrote: > All I'm saying is *if* we put scheduling inside Postgres for autovacuum > *then* we should make it general purpose scheduling. > > If anybody uses the argument that "we have external schedulers, so don't > put them in the database" then that argument applies equally to > scheduling autovacuum. It's easy to turn autovacuum on/off via an > external scheduler, yet look upthread and see how many people think it > should be in the database. > > Whichever way you think the decision should go, the same arguments apply > to scheduling autovacuum and scheduling other database maintenance > tasks. > > OK, I agree with that. cheers andrew
Josh Berkus wrote: > I can't imagine, nor have I encountered in the 3 years of consulting I > did since Autovaccum became available, such a use case. Ok, so due to popular demand, I'm not implementing this new GUC. > On the other hand, I'd been keen on a runtime suset autovaccum=on/off > which we could call from a cron job or the pgadmin scheduler in order to > have maintenance windows. Unless that's already becoming possible? autovacuum=on/off is already SIGHUP as of 8.3 (not SUSET, since it makes no sense to change it in a single connection). -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Joshua D. Drake wrote: > On Thu, 2009-02-05 at 17:08 -0500, Tom Lane wrote: >> My feeling is that we should be trying to eliminate use-cases for >> cron-driven vacuuming, not trying to make sure that cron-driven >> scripts can do anything autovacuum can. > Agreed. IMO, the user should only have to think about "vacuum" in an > abstract sense. +1 >> The main remaining use-case seems to me to make vacuuming work adhere >> to some business-determined schedule, hence maintenance windows seem >> like the next thing to do. > Also agreed. Somewhat agreed - since in many cases the business-determined schedule is just a rough estimate of measurable attributes of the machine. When we say "vacuum between midnight and 5am" we often actually mean "vacuum when the I/O subsystem has bandwidth to spare and the machine's otherwise lightly loaded, and we guess that means late at night".
Ron Mayer wrote: > Joshua D. Drake wrote: > >> The main remaining use-case seems to me to make vacuuming work adhere > >> to some business-determined schedule, hence maintenance windows seem > >> like the next thing to do. > > Also agreed. > > Somewhat agreed - since in many cases the business-determined schedule > is just a rough estimate of measurable attributes of the machine. When > we say "vacuum between midnight and 5am" we often actually mean "vacuum > when the I/O subsystem has bandwidth to spare and the machine's otherwise > lightly loaded, and we guess that means late at night". The current state of the system is not necessarily a good indicator of the immediately future state. If we were to collect history (I/O load versus time of day and day of week) that would be another matter, but I'm not sure that's a productive use of our development time. If we could guess what's going to be the load of the server in the near future, that would help scheduling considerable. Since we can't, we have to trust that the user can provide an educated guess, which is what "maintenance windows" are all about. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Alvaro, >> On the other hand, I'd been keen on a runtime suset autovaccum=on/off >> which we could call from a cron job or the pgadmin scheduler in order to >> have maintenance windows. Unless that's already becoming possible? > > autovacuum=on/off is already SIGHUP as of 8.3 (not SUSET, since it makes > no sense to change it in a single connection). Right. What I'm saying is that if it *didn't* require a sighup, then users could cronjob starting and stopping Autovac themselves. --Josh
Simon Riggs wrote: > > On Thu, 2009-02-05 at 18:54 -0300, Alvaro Herrera wrote: > > I don't see them as conflicting; I see yours as a missing feature, > > namely the ability to add tables to an autovacuum "group", which could > > have settings attached. Being able to do that is the whole point of > > moving settings to reloptions. > > So your changes will allow these? > > ALTER DATABASE foo SET (autovacuum_enabled = false); > ALTER SCHEMA foo SET (autovacuum_enabled = false); Hmm. Perhaps being able to turn autovacuum on/off per-database is desirable and possible, but I can't see doing it per schema. > CREATE TABLE GROUP foo_group; > ALTER TABLE foo SET TABLE GROUP foo_group; > ALTER TABLE foo2 SET TABLE GROUP foo_group; > ALTER TABLE GROUP SET (autovacuum_enabled = false); Yes, something like that. > Hopefully the grouping of tables is not purely related to AV? Hmm, good question. I was envisioning it only for autovacuum, but it hasn't been vetted on pgsql-hackers. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Josh Berkus wrote: >>> On the other hand, I'd been keen on a runtime suset autovaccum=on/off >>> which we could call from a cron job or the pgadmin scheduler in >>> order to have maintenance windows. Unless that's already becoming >>> possible? >> >> autovacuum=on/off is already SIGHUP as of 8.3 (not SUSET, since it makes >> no sense to change it in a single connection). > > Right. What I'm saying is that if it *didn't* require a sighup, then > users could cronjob starting and stopping Autovac themselves. Hmm, I'm not sure I understand what you're suggesting. Maybe what you want is that we have a SQL-accesible function that sends SIGHUP to the postmaster?
>> Hopefully the grouping of tables is not purely related to AV? > > Hmm, good question. I was envisioning it only for autovacuum, but it > hasn't been vetted on pgsql-hackers. I think we're in danger of inventing a solution in search of a problem here. AIUI, the main reason for table groups would be to define different autovacuum policies for different groups of tables. Right now, that would be pretty stupid, because there are only two possible policies: "yes" and "no". But if the policy is something very complex, then you're not going to want to redefine it for each individual table. Instead, you're going to want to define it once and then point individual tables at it. But you could do that just as well by assigning each policy a name or number and then setting a reloption on the table to refer to that name or number, which would completely avoid the need to invent all-new, non-standard syntax. But if we do decide to invent such a syntax, it's not good enough to say that we should make it general because it might be useful for a purpose other than autovacuum. We should have a pretty specific idea of what sort of purpose that might be. Otherwise, we'll likely find (when the purpose finally arises) that the supposedly-general model we introduced doesn't fit it as well as we thought. But right now, we don't even have ONE use case for the general syntax, let alone two, because the future autovacuum enhancements that would make use of that syntax haven't been designed yet (or at least haven't been discussed here yet). ...Robert
Alvaro Herrera <alvherre@commandprompt.com> writes: > Josh Berkus wrote: >> Right. What I'm saying is that if it *didn't* require a sighup, then >> users could cronjob starting and stopping Autovac themselves. > Hmm, I'm not sure I understand what you're suggesting. Maybe what you > want is that we have a SQL-accesible function that sends SIGHUP to the > postmaster? Like, say, pg_reload_conf()? But actually a cron job would almost certainly find it more pleasant to use pg_ctl reload and never bother with a database connection at all. regards, tom lane
On Mon, Feb 9, 2009 at 12:31 PM, Robert Haas <robertmhaas@gmail.com> wrote: >>> Hopefully the grouping of tables is not purely related to AV? >> >> Hmm, good question. I was envisioning it only for autovacuum, but it >> hasn't been vetted on pgsql-hackers. > > I think we're in danger of inventing a solution in search of a problem here. > > AIUI, the main reason for table groups would be to define different > autovacuum policies for different groups of tables. Right now, that > would be pretty stupid, because there are only two possible policies: > "yes" and "no". not really... the idea is to let one group to have autovacuum on in certain periods of time and let them of the rest of the time... or maybe a group of tables should be autovacuumed every 50 updates (vac_base_thresh) and some tables every 100, in some hours maybe we need to have different vac_cost_delay and vac_cost_limit... actually there are different parameters that could be set... > Instead, you're going to want to define it once and then point > individual tables at it. But you could do that just as well by > assigning each policy a name or number and then setting a reloption on > the table to refer to that name or number, which would completely > avoid the need to invent all-new, non-standard syntax. > well the reloptions *is* invented and non-standard syntax > But if we do decide to invent such a syntax, it's not good enough to > say that we should make it general because it might be useful for a > purpose other than autovacuum. We should have a pretty specific idea > of what sort of purpose that might be. Otherwise, we'll likely find > (when the purpose finally arises) that the supposedly-general model we > introduced doesn't fit it as well as we thought. > > But right now, we don't even have ONE use case for the general syntax, > let alone two, because the future autovacuum enhancements that would > make use of that syntax haven't been designed yet (or at least haven't > been discussed here yet). --- devil's advocate mode on --- a general purpose scheduler could be used for: - REINDEX - moving data around for OLAP - periodically execute SP that has to change the status of a process in a time driven way... - autovacuum, and programming manual vacuums --- devil's advocate mode off --- now, we actually can do that work with external schedulers (cron in linux, the windows task scheduler, etc)... the only two reasons i can think to prefer our own sintax for this is: pg_dump support to keep pilicies alive even in a fresh installed machine and marketing (two good reasons if you ask me) -- Atentamente, Jaime Casanova Soporte y capacitación de PostgreSQL Asesoría y desarrollo de sistemas Guayaquil - Ecuador Cel. +59387171157
>> AIUI, the main reason for table groups would be to define different >> autovacuum policies for different groups of tables. Right now, that >> would be pretty stupid, because there are only two possible policies: >> "yes" and "no". > > not really... the idea is to let one group to have autovacuum on in > certain periods of time and let them of the rest of the time... Yes, but that's a future enhancement, we don't have that now. > or maybe a group of tables should be autovacuumed every 50 updates > (vac_base_thresh) and some tables every 100, in some hours maybe we > need to have different vac_cost_delay and vac_cost_limit... > > actually there are different parameters that could be set... > >> Instead, you're going to want to define it once and then point >> individual tables at it. But you could do that just as well by >> assigning each policy a name or number and then setting a reloption on >> the table to refer to that name or number, which would completely >> avoid the need to invent all-new, non-standard syntax. > > well the reloptions *is* invented and non-standard syntax Yes, but we already have that one. IMO we should try to reuse it and only invent new stuff if there is a compelling reason - which is so far absent from this discussion. >> But if we do decide to invent such a syntax, it's not good enough to >> say that we should make it general because it might be useful for a >> purpose other than autovacuum. We should have a pretty specific idea >> of what sort of purpose that might be. Otherwise, we'll likely find >> (when the purpose finally arises) that the supposedly-general model we >> introduced doesn't fit it as well as we thought. >> >> But right now, we don't even have ONE use case for the general syntax, >> let alone two, because the future autovacuum enhancements that would >> make use of that syntax haven't been designed yet (or at least haven't >> been discussed here yet). > > > --- devil's advocate mode on --- > > a general purpose scheduler could be used for: > - REINDEX > - moving data around for OLAP > - periodically execute SP that has to change the status of a process > in a time driven way... > - autovacuum, and programming manual vacuums > > --- devil's advocate mode off --- AFAICS, table groups wouldn't help with any of that stuff. I think you're proving my point that we have no idea what we're implementing, so it's a little premature to talk about what else the same infrastructure can be used for. > now, we actually can do that work with external schedulers (cron in > linux, the windows task scheduler, etc)... the only two reasons i can > think to prefer our own sintax for this is: pg_dump support to keep > pilicies alive even in a fresh installed machine and marketing (two > good reasons if you ask me) Which are all great points, but not what I was talking about. I am talking about the table group stuff. ...Robert
On Mon, Feb 9, 2009 at 1:44 PM, Robert Haas <robertmhaas@gmail.com> wrote: >>> AIUI, the main reason for table groups would be to define different >>> autovacuum policies for different groups of tables. Right now, that >>> would be pretty stupid, because there are only two possible policies: >>> "yes" and "no". >> >> not really... the idea is to let one group to have autovacuum on in >> certain periods of time and let them of the rest of the time... > > Yes, but that's a future enhancement, we don't have that now. > that was what simon was talking about, IIRC... he was speculating about a possible future syntax for grouping tables for use with a possible future postgres scheduler... >> >> well the reloptions *is* invented and non-standard syntax > > Yes, but we already have that one. IMO we should try to reuse it and > only invent new stuff if there is a compelling reason - which is so > far absent from this discussion. > reloptions is what we will use for autovacumm (actually Alvaro already applied that patch)... no one is touching that... the group syntax is for a future feature... >>> But if we do decide to invent such a syntax, it's not good enough to >>> say that we should make it general because it might be useful for a >>> purpose other than autovacuum. We should have a pretty specific idea >>> of what sort of purpose that might be. Otherwise, we'll likely find >>> (when the purpose finally arises) that the supposedly-general model we >>> introduced doesn't fit it as well as we thought. >>> >>> But right now, we don't even have ONE use case for the general syntax, >>> let alone two, because the future autovacuum enhancements that would >>> make use of that syntax haven't been designed yet (or at least haven't >>> been discussed here yet). >> >> >> --- devil's advocate mode on --- >> >> a general purpose scheduler could be used for: >> - REINDEX >> - moving data around for OLAP >> - periodically execute SP that has to change the status of a process >> in a time driven way... >> - autovacuum, and programming manual vacuums >> >> --- devil's advocate mode off --- > > AFAICS, table groups wouldn't help with any of that stuff. I think table groups are not being implemented now... it was a mere speculation about a way to apply a policy in a set of tables... actually, Alvaro's response was: "something like that" so we have to actually wait for his proposal before start a war on that and before we think it could be general enough to include other policies (like the ones for an scheduler) > you're proving my point that we have no idea what we're implementing, > so it's a little premature to talk about what else the same > infrastructure can be used for. > that's because we are not implementing that now... it's for the future... >> now, we actually can do that work with external schedulers (cron in >> linux, the windows task scheduler, etc)... the only two reasons i can >> think to prefer our own sintax for this is: pg_dump support to keep >> pilicies alive even in a fresh installed machine and marketing (two >> good reasons if you ask me) > > Which are all great points, but not what I was talking about. I am > talking about the table group stuff. > me too -- Atentamente, Jaime Casanova Soporte y capacitación de PostgreSQL Asesoría y desarrollo de sistemas Guayaquil - Ecuador Cel. +59387171157