Thread: scheduler in core
Hi,

I'm trying to figure out how difficult this is.

What we need:
- a shared catalog
- an API for filling the catalog
- a scheduler daemon
- pg_dump support

A shared catalog
-------------------------
Why shared? Obviously because we don't want to scan every database's pg_job each time the daemon wakes up.
Maybe something like:

pg_job (
    oid            -- use the oid as pk
    jobname
    jobdatoid      -- job database oid
    jobowner       -- for permission checking
    jobstarttime   -- year to minute
    jobfrequency   -- an interval?
    jobnexttime or joblasttime
    jobtype        -- if we are going to allow plain sql or executable/shell job types
    jobexecute or jobscript
)

Comments about the catalog?

An API for filling the catalog
-----------------------------------------
Do we want a CREATE JOB SQL syntax? FWIW, Oracle uses functions to create/remove jobs.

A scheduler daemon
--------------------------------
I think we can use 8.3's autovacuum daemon as a reference for this... AFAIK, it's a child of postmaster that sleeps for $naptime and then looks for something to do (it also looks in a catalog) and then sends a worker to do it. That's what we need to do, but...

For the $naptime I think we can autoconfigure it: when we execute a job, look for the next job in the queue and sleep until it is time to execute it.

I don't think we need a max_worker parameter; it should launch as many workers as it needs.

pg_dump support
--------------------------
Dump every entry of the pg_job catalog as a CREATE JOB SQL statement or a create_job() function call, depending on what we decide.

Ideas? Comments?

--
Sincerely,
Jaime Casanova
PostgreSQL support and training
Systems consulting and development
Guayaquil - Ecuador
Cel. +59387171157
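The self-tuning $naptime idea in the proposal above (sleep until the next job is due rather than for a fixed interval) could be sketched roughly as follows. This is only an illustration in Python, not anything from the thread: the `Job` class is a hypothetical in-memory stand-in for a pg_job row, and `next_run` and `naptime` are made-up helper names. It assumes `jobfrequency` is a positive interval and that a job repeats at `jobstarttime + n * jobfrequency`.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable, Optional


@dataclass
class Job:
    """Hypothetical mirror of one pg_job row; field names follow the proposal."""
    jobname: str
    jobstarttime: datetime    # first scheduled run
    jobfrequency: timedelta   # repeat interval, assumed to be positive


def next_run(job: Job, now: datetime) -> datetime:
    """Return the first scheduled run of `job` that is not before `now`."""
    if job.jobfrequency <= timedelta(0):
        raise ValueError("jobfrequency must be a positive interval")
    if now <= job.jobstarttime:
        return job.jobstarttime
    elapsed = now - job.jobstarttime
    # Ceiling division: number of whole periods needed to reach or pass `now`.
    periods = -(-elapsed // job.jobfrequency)
    return job.jobstarttime + periods * job.jobfrequency


def naptime(jobs: Iterable[Job], now: datetime) -> Optional[timedelta]:
    """How long the daemon could sleep before the earliest job is due.

    Returns None when there are no jobs, in which case a real daemon
    would have to wait for some other wake-up signal.
    """
    runs = [next_run(job, now) for job in jobs]
    if not runs:
        return None
    return max(min(runs) - now, timedelta(0))
```

A real daemon would also need to be woken when a job is added or changed while it sleeps; that part is not covered by this sketch.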
On Sat, Feb 20, 2010 at 9:33 PM, Jaime Casanova <jcasanov@systemguards.com.ec> wrote: > Hi, > > I'm trying to figure out how difficult is this Why not just use pgAgent? It's far more flexible than the design you've suggested, and already exists. -- Dave Page EnterpriseDB UK: http://www.enterprisedb.com
On Sat, Feb 20, 2010 at 4:33 PM, Jaime Casanova <jcasanov@systemguards.com.ec> wrote: > Hi, > > I'm trying to figure out how difficult is this > > What we need: > - a shared catalog > - an API for filling the catalog > - a scheduler daemon > - pg_dump support > > > A shared catalog > ------------------------- > Why shared? obviously because we don't want to scan all database's > pg_job every time the daemon wake up. > Maybe something like: > > pg_job ( > oid -- use the oid as pk > jobname > jobdatoid -- job database oid > jobowner -- for permission's checking > jobstarttime -- year to minute > jobfrequency -- an interval? > jobnexttime or joblasttime > jobtype -- if we are going to allow plain sql or > executable/shell job types > jobexecute or jobscript > ) > > comments about the catalog? > > > An API for filling the catalog > ----------------------------------------- > do we want a CREATE JOB SQL synatx? FWIW, Oracle uses functions to > create/remove jobs. > > > An scheduler daemon > -------------------------------- > I think we can use 8.3's autovacuum daemon as a reference for this... > AFAIK, it's a child of postmaster that sleep for $naptime and then > looks for something to do (it also looks in a > catalog) and the send a worker to do it > that's what we need to do but... > > for the $naptime i think we can autoconfigure it, when we execute a > job look for the next job in queue and sleep > until we are going to reach the time to execute it > > i don't think we need a max_worker parameter, it should launch as many > workers as it needs > > > pg_dump support > -------------------------- > dump every entry of the pg_job catalog as a CREATE JOB SQL statement > or a create_job() function depending > on what we decided > > ideas? comments? IMNSHO, an 'in core' scheduler would be useful. However, I think before you tackle a scheduler, we need proper stored procedures. Our existing functions don't cut it because you can't manage the transaction state yourself. merlin
Dave Page <dpage@pgadmin.org> writes: > Why not just use pgAgent? It's far more flexible than the design > you've suggested, and already exists. What would it take to have it included in core, so that it's not a separate install to do? I'd love to have some support for running my maintenance pl functions directly from the database. I mean without installing, running and monitoring another (set of) process. Main advantage over cron or another scheduler being that it'd be part of my transactional backups, of course. Use cases, in case it's needed already, include creating new partitions, materializing views at known intervals, more general maintenance like vacuum and clusters operations, some reporting that could be done in the database itself, etc. Regards, -- dim
> > pg_job ( > oid -- use the oid as pk > jobname > jobdatoid -- job database oid > jobowner -- for permission's checking > jobstarttime -- year to minute > jobfrequency -- an interval? > jobnexttime or joblasttime > jobtype -- if we are going to allow plain sql or > executable/shell job types > jobexecute or jobscript > ) > > comments about the catalog? > + success_action +failure_action > > An API for filling the catalog > ----------------------------------------- > do we want a CREATE JOB SQL synatx? FWIW, Oracle uses functions to > create/remove jobs. > > > An scheduler daemon > -------------------------------- > I think we can use 8.3's autovacuum daemon as a reference for this... > AFAIK, it's a child of postmaster that sleep for $naptime and then > looks for something to do (it also looks in a > catalog) and the send a worker to do it > that's what we need to do but... > > for the $naptime i think we can autoconfigure it, when we execute a > job look for the next job in queue and sleep > until we are going to reach the time to execute it > > i don't think we need a max_worker parameter, it should launch as many > workers as it needs > > > pg_dump support > -------------------------- > dump every entry of the pg_job catalog as a CREATE JOB SQL statement > or a create_job() function depending > on what we decided > > ideas? comments? > > -- > Atentamente, > Jaime Casanova > Soporte y capacitación de PostgreSQL > Asesoría y desarrollo de sistemas > Guayaquil - Ecuador > Cel. +59387171157 > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers >
On Sat, Feb 20, 2010 at 10:03 PM, Dimitri Fontaine <dfontaine@hi-media.com> wrote: > What would it take to have it included in core, so that it's not a > separate install to do? I'd love to have some support for running my > maintenance pl functions directly from the database. I mean without > installing, running and monitoring another (set of) process. It'll always be another (set of) processes even if it's "in core". All it means to be "in core" is that it will be harder to make modifications and you'll be tied to the Postgres release cycle. > Main advantage over cron or another scheduler being that it'd be part of > my transactional backups, of course. All you need for that is to store the schedule in a database table. This has nothing to do with where the scheduler code lives. -- greg
Dimitri Fontaine <dfontaine@hi-media.com> writes: > Dave Page <dpage@pgadmin.org> writes: >> Why not just use pgAgent? It's far more flexible than the design >> you've suggested, and already exists. > What would it take to have it included in core, I don't think this really makes sense. There's basically no argument for having it in core other than "I'm too lazy to install a separate package". Unlike the case for autovacuum, there isn't anything an in-core implementation could do that an external one doesn't do as well or better. So I'm not eager to take on additional maintenance burden for such a thing. regards, tom lane
Tom,

I believe that "in core" may be "installed by default" in case of the pgAgent or similar solution...

Many big companies do not allow developers to configure and install components... we need to request everything in 10 copies of forms...

By making it "in core" or "installed by default" we have more chance that the db scheduler would be widely accepted...

And more important... we would not have to check its availability on setup and provide an alternate scheduler if the database scheduler is off...

I believe that a database scheduler would allow me to drop 20 thousand lines of Java code in my server...

--
Lucas

2010/2/20 Tom Lane <tgl@sss.pgh.pa.us>
> Dimitri Fontaine <dfontaine@hi-media.com> writes:
> > Dave Page <dpage@pgadmin.org> writes:
> >> Why not just use pgAgent? It's far more flexible than the design
> >> you've suggested, and already exists.
> > What would it take to have it included in core,
>
> I don't think this really makes sense. There's basically no argument
> for having it in core other than "I'm too lazy to install a separate
> package". Unlike the case for autovacuum, there isn't anything an
> in-core implementation could do that an external one doesn't do as well
> or better. So I'm not eager to take on additional maintenance burden
> for such a thing.
>
> regards, tom lane
On Sat, Feb 20, 2010 at 4:37 PM, Dave Page <dpage@pgadmin.org> wrote: > On Sat, Feb 20, 2010 at 9:33 PM, Jaime Casanova > <jcasanov@systemguards.com.ec> wrote: >> Hi, >> >> I'm trying to figure out how difficult is this > > Why not just use pgAgent? It's far more flexible than the design > you've suggested, and already exists. > - it's not that easy if you don't have pgAdmin - I need to back up the postgres database to back up the schedules - pgAgent is not widely used here, but the few people I know who tried it gave up because, they said, it "did not always execute the jobs"... I don't have any real evidence of that, and probably what happened was that the pgAgent daemon wasn't running (error prone), but having it started by the postmaster would get rid of that problem... The first one could be solved with a set of functions in pgAgent and clear docs... I can live with the other two to some degree... but getting rid of the third one would be nice :) -- Sincerely, Jaime Casanova PostgreSQL support and training Systems consulting and development Guayaquil - Ecuador Cel. +59387171157
On Sun, Feb 21, 2010 at 12:03 AM, Jaime Casanova <jcasanov@systemguards.com.ec> wrote: > On Sat, Feb 20, 2010 at 4:37 PM, Dave Page <dpage@pgadmin.org> wrote: >> On Sat, Feb 20, 2010 at 9:33 PM, Jaime Casanova >> <jcasanov@systemguards.com.ec> wrote: >>> Hi, >>> >>> I'm trying to figure out how difficult is this >> >> Why not just use pgAgent? It's far more flexible than the design >> you've suggested, and already exists. >> > > - it's not that easy if you don't have pgadmin That's easily changed. EDB's Advanced Server emulates Oracle's DBMS_JOB interface with it, for example. > - i need to backup postgres database to backup the schedules Only if you put the control schema in that database. If you don't want to do that, stick it somewhere else. With your proposed scheme, you'd probably have to use pg_dumpall --globals-only. > - the use pgagent here is not very extended but the few a know have > tried desisted because they > said: "not always executed the jobs"... i don't have any real evidence > of that and probably what happens > was that the pgagent daemon wasn't working (error prone), but being it > started by the postmaster get rid of that > problem... No one has ever reported such a bug that I'm aware of. -- Dave Page EnterpriseDB UK: http://www.enterprisedb.com
On Sat, Feb 20, 2010 at 11:55 PM, Lucas <lucas75@gmail.com> wrote: > I believe that a database scheduler would allow me to drop 20 thousand lines > of java code in my server... How does that work? If you don't have a scheduler in the database, or pgAgent, why aren't you using cron or Windows task scheduler, neither of which would require 20K lines of Java code. -- Dave Page EnterpriseDB UK: http://www.enterprisedb.com
On Sat, Feb 20, 2010 at 7:32 PM, Dave Page <dpage@pgadmin.org> wrote: > On Sun, Feb 21, 2010 at 12:03 AM, Jaime Casanova > <jcasanov@systemguards.com.ec> wrote: >> On Sat, Feb 20, 2010 at 4:37 PM, Dave Page <dpage@pgadmin.org> wrote: >>> On Sat, Feb 20, 2010 at 9:33 PM, Jaime Casanova >>> <jcasanov@systemguards.com.ec> wrote: >>>> Hi, >>>> >>>> I'm trying to figure out how difficult is this >>> >>> Why not just use pgAgent? It's far more flexible than the design >>> you've suggested, and already exists. >>> >> >> - it's not that easy if you don't have pgadmin > > That's easily changed. EDB's Advanced Server emulates Oracles DBMS_JOB > interface with it for example. > Maybe I can work on that, then. -- Sincerely, Jaime Casanova PostgreSQL support and training Systems consulting and development Guayaquil - Ecuador Cel. +59387171157
On Sat, Feb 20, 2010 at 10:03 PM, Dimitri Fontaine <dfontaine@hi-media.com> wrote: > Dave Page <dpage@pgadmin.org> writes: >> Why not just use pgAgent? It's far more flexible than the design >> you've suggested, and already exists. > > What would it take to have it included in core, so that it's not a > separate install to do? I'd love to have some support for running my > maintenance pl functions directly from the database. I mean without > installing, running and monitoring another (set of) process. It's currently written in C++ and PL/pgSQL and uses wxWidgets, all of which could be changed with a little work. Having it in core will almost certainly result in reduced functionality though: there are use cases in which you may have multiple agents running against one control database, or executing jobs on remote databases, for example. We originally wrote the code such that it might be easily included in core in the future, but every time this topic comes up in -hackers, there are a significant number of people who don't think a scheduler should be tied to the core code, so we stopped assuming it ever would be. -- Dave Page EnterpriseDB UK: http://www.enterprisedb.com
Lucas wrote: > Tom, > > I believe that "in core" may be "installed by default" in case of > the pgAgent or similar solution... > > Many big companies does not allow the developers to configure and > install components.... we need to request everthing in 10 copies > of forms... > > By making it "in core" or "installed by default" means that we > have more chance that the db scheduler would be widely accepted... > > This reasoning just doesn't fly in the PostgreSQL world. PostgreSQL is designed to be extensible, not a monolithic product. We're not going to change that because some companies have insane corporate policies. The answer, as Jefferson said in another context, is to "inform their ignorance." That isn't to say that there isn't a case for an in core scheduler, but this at least isn't a good reason for it. cheers andrew
On Sun, Feb 21, 2010 at 12:38 AM, Jaime Casanova <jcasanov@systemguards.com.ec> wrote: > On Sat, Feb 20, 2010 at 7:32 PM, Dave Page <dpage@pgadmin.org> wrote: >> On Sun, Feb 21, 2010 at 12:03 AM, Jaime Casanova >> <jcasanov@systemguards.com.ec> wrote: >>> On Sat, Feb 20, 2010 at 4:37 PM, Dave Page <dpage@pgadmin.org> wrote: >>>> On Sat, Feb 20, 2010 at 9:33 PM, Jaime Casanova >>>> <jcasanov@systemguards.com.ec> wrote: >>>>> Hi, >>>>> >>>>> I'm trying to figure out how difficult is this >>>> >>>> Why not just use pgAgent? It's far more flexible than the design >>>> you've suggested, and already exists. >>>> >>> >>> - it's not that easy if you don't have pgadmin >> >> That's easily changed. EDB's Advanced Server emulates Oracles DBMS_JOB >> interface with it for example. >> > > maybe i can work on that, then I'd love to add a management API to pgAgent if you'd like to work on it. -- Dave Page EnterpriseDB UK: http://www.enterprisedb.com
On Sat, 2010-02-20 at 18:19 -0500, Tom Lane wrote: > Dimitri Fontaine <dfontaine@hi-media.com> writes: > > Dave Page <dpage@pgadmin.org> writes: > >> Why not just use pgAgent? It's far more flexible than the design > >> you've suggested, and already exists. > > > What would it take to have it included in core, > > I don't think this really makes sense. There's basically no argument > for having it in core other than "I'm too lazy to install a separate > package". Unlike the case for autovacuum, there isn't anything an > in-core implementation could do that an external one doesn't do as well > or better. So I'm not eager to take on additional maintenance burden > for such a thing. There is zero technical reason for this to be in core. That doesn't mean it isn't a really good idea. It would be nice to have a comprehensive job scheduling solution that allows me to continue to abstract away from external solutions and operating system dependencies. Joshua D. Drake -- PostgreSQL.org Major Contributor Command Prompt, Inc: http://www.commandprompt.com/ - 503.667.4564 Consulting, Training, Support, Custom Development, Engineering Respect is earned, not gained through arbitrary and repetitive use of Mr. or Sir.
On Feb 20, 2010, at 8:06 PM, "Joshua D. Drake" <jd@commandprompt.com> wrote: > There is zero technical reason for this to be in core. > > That doesn't mean it isn't a really good idea. It would be nice to > have > a comprehensive job scheduling solution that allows me to continue > abstract away from external solutions and operating system > dependencies. Well put. That pretty much sums up my feelings on this perfectly. ...Robert
Ah! wxWidgets... Yes, I knew there was something I didn't like about pgAgent. So it is not as simple as just installing it. 2010/2/20, Dave Page <dpage@pgadmin.org>: > On Sat, Feb 20, 2010 at 10:03 PM, Dimitri Fontaine > <dfontaine@hi-media.com> wrote: >> Dave Page <dpage@pgadmin.org> writes: >>> Why not just use pgAgent? It's far more flexible than the design >>> you've suggested, and already exists. >> >> What would it take to have it included in core, so that it's not a >> separate install to do? I'd love to have some support for running my >> maintenance pl functions directly from the database. I mean without >> installing, running and monitoring another (set of) process. > > It's currently written in C++/pl/pgsql and uses wxWidgets, none of > which couldn't be changed with a little work. Having it in core will > almost certainly result in reduced functionality though - there are > use cases in which you may have multiple agents running against one > control database, or executing jobs on remote databases for example. > > We originally wrote the code such that it might be easily included in > core in the future, but every time this topic comes up in -hackers, > there are a significant number of people who don't think a scheduler > should be tied to the core code so we stopped assuming it ever would > be. > > -- > Dave Page > EnterpriseDB UK: http://www.enterprisedb.com > -- Sent from my mobile device. Sincerely, Jaime Casanova PostgreSQL support and training Systems consulting and development Guayaquil - Ecuador Cel. +59387171157
2010/2/21 Andrew Dunstan <andrew@dunslane.net>: > > > Lucas wrote: >> >> Tom, >> >> I believe that "in core" may be "installed by default" in case of >> the pgAgent or similar solution... >> >> Many big companies does not allow the developers to configure and >> install components.... we need to request everthing in 10 copies >> of forms... >> >> By making it "in core" or "installed by default" means that we >> have more chance that the db scheduler would be widely accepted... >> > > This reasoning just doesn't fly in the PostgreSQL world. PostgreSQL is > designed to be extensible, not a monolithic product. We're not going to > change that because some companies have insane corporate policies. The > answer, as Jefferson said in another context, is to "inform their > ignorance." > > That isn't to say that there isn't a case for an in core scheduler, but this > at least isn't a good reason for it. As I remember it, this is exactly the same discussion we had about replication three years ago. First the strategy was "we don't need it in core"; then we ended up being the last to get replication. Regards Pavel Stehule > > cheers > > andrew
"Joshua D. Drake" <jd@commandprompt.com> writes: > On Sat, 2010-02-20 at 18:19 -0500, Tom Lane wrote: >> Dimitri Fontaine <dfontaine@hi-media.com> writes: >> > What would it take to have it included in core, >> >> I don't think this really makes sense. There's basically no argument >> for having it in core other than "I'm too lazy to install a separate >> package". Unlike the case for autovacuum, there isn't anything an >> in-core implementation could do that an external one doesn't do as well >> or better. So I'm not eager to take on additional maintenance burden >> for such a thing. > > There is zero technical reason for this to be in core. > > That doesn't mean it isn't a really good idea. It would be nice to have > a comprehensive job scheduling solution that allows me to continue > abstract away from external solutions and operating system dependencies. Maybe what we need, on the technical level, is a way to distribute this code with the main product but without draining too much effort from core members there. Like we do with contribs I guess, but on a larger scale. I guess git submodules, PGAN, extensions and all that jazz are going to help. Meanwhile I'll have to learn enough of pgAgent to figure out how much it's tied to pgadmin, and we'll have to make those other facilities something real. Regards, -- dim
Greg Stark <gsstark@mit.edu> writes: > It'll always be another (set of) processes even if it's "in core". All > it means to be "in core" is that it will be harder to make > modifications and you'll be tied to the Postgres release cycle. Another set of processes all right, but one that the postmaster is responsible for, so that it starts and ends at the right time. >> Main advantage over cron or another scheduler being that it'd be part of >> my transactional backups, of course. > > All you need for that is to store the schedule in a database table. > This has nothing to do with where the scheduler code lives. Not true. You need custom scripts that will read what's in this database table and run it at the right time, take care of running more than one job at the same time when necessary, report the outcome somewhere, etc. The simplest would be a query that writes out in cron format the setup you've made in the database, I suppose. When do you run that query? Do you need an untrusted trigger? What happens if your query or script writes a file cron will not be able to read, or on a server where cron is not running? I'm not saying this is any harder than the other sysadmin stuff we have to do to operate the systems, just that it seems it would be simpler, easier and less error prone to be able to schedule database maintenance from within the database itself, in such a way that the classic dump and restore process restores the maintenance scripts too. That would allow for automatic creation of partitions in dev and pre-prod environments where you install more than one copy of the same database at once, but would like to avoid maintaining one set of cron entries per copy. As Tom said, technically, it's obviously possible not to depend on a PostgreSQL integrated scheduler. As JD said, it still is a pretty good idea to provide one in core. Regards, -- dim
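To make the "write out the schedule in cron format" workaround discussed above concrete, here is a rough Python sketch. Everything in it is an assumption for illustration: the `ScheduledJob` tuple stands in for rows of some user-defined schedule table, and `to_crontab_line` is a made-up helper. Only the `psql` options used (`-X`, `-d`, `-c`) are real. It also shows some of the fiddly details the message alludes to, such as shell quoting and the special meaning of `%` in crontab entries.

```python
import shlex
from typing import NamedTuple


class ScheduledJob(NamedTuple):
    """Hypothetical row of a user-defined schedule table."""
    schedule: str  # five-field cron expression, e.g. "0 3 * * *"
    dbname: str    # database the job runs in
    sql: str       # single-line SQL statement to execute


def to_crontab_line(job: ScheduledJob) -> str:
    """Render one job as a crontab entry that runs its SQL through psql."""
    if len(job.schedule.split()) != 5:
        raise ValueError("expected a five-field cron schedule")
    if "\n" in job.sql:
        raise ValueError("a crontab entry must fit on one line")
    command = "psql -X -d {} -c {}".format(
        shlex.quote(job.dbname), shlex.quote(job.sql)
    )
    # cron treats an unescaped % in the command as a newline, so escape it.
    command = command.replace("%", "\\%")
    return "{} {}".format(job.schedule, command)
```

For example, `to_crontab_line(ScheduledJob("0 3 * * *", "mydb", "SELECT 1"))` yields `0 3 * * * psql -X -d mydb -c 'SELECT 1'`. The questions raised in the message (when to regenerate the file, what happens if cron is not running) remain outside what such a script can answer.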
Pavel Stehule wrote: >> This reasoning just doesn't fly in the PostgreSQL world. PostgreSQL is >> designed to be extensible, not a monolithic product. We're not going to >> change that because some companies have insane corporate policies. The >> answer, as Jefferson said in another context, is to "inform their >> ignorance." >> >> That isn't to say that there isn't a case for an in core scheduler, but this >> at least isn't a good reason for it. >> > > What I remember - this is exactly same discus like was about > replication thre years ago > > fiirst strategy - we doesn't need it in core > next we was last with replacation > > That's a pretty poor analogy IMNSHO. There are very good technical reasons to have replication in the core. That is much less clear for a scheduler. But in any case, I didn't say that we shouldn't have a scheduler. I specifically said there might be a case for it - read the first clause of my last sentence. What I said was that the reason given, namely that Corporations didn't want to use add-on modules, was not a good reason. cheers andrew
Pavel Stehule wrote: > 2010/2/21 Andrew Dunstan <andrew@dunslane.net>: > >> I believe that "in core" may be "installed by default" in case of > >> the pgAgent or similar solution... > >> > >> Many big companies does not allow the developers to configure and > >> install components.... we need to request everthing in 10 copies > >> of forms... > >> > >> By making it "in core" or "installed by default" means that we > >> have more chance that the db scheduler would be widely accepted... > >> > > > > This reasoning just doesn't fly in the PostgreSQL world. PostgreSQL is > > designed to be extensible, not a monolithic product. We're not going to > > change that because some companies have insane corporate policies. The > > answer, as Jefferson said in another context, is to "inform their > > ignorance." > > > > That isn't to say that there isn't a case for an in core scheduler, but this > > at least isn't a good reason for it. > > What I remember - this is exactly same discus like was about > replication thre years ago > > fiirst strategy - we doesn't need it in core > next we was last with replacation We resisted putting replication into the core until we needed some facilities that were only available from the core. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com PG East: http://www.enterprisedb.com/community/nav-pg-east-2010.do + If your life is a hard drive, Christ can be your backup. +
2010/2/20 Andrew Dunstan <andrew@dunslane.net> > > We're not going to change that because some companies have > insane corporate policies. I agree, Andrew... This is an outside benefit... not a reason or justification... I believe that a general purpose scheduler is similar to autovacuum... it is not really needed, we can always configure an external scheduler. But I like it a LOT... For me it is not a question of "must be in core"; it is a question of cost/benefit. I do not see much cost, but a lot of benefits: Like Joshua said, "abstract away from external solutions and operating system dependencies". Like Dimitri said, "Main advantage over cron or another scheduler being that it'd be part of my transactional backups". To me it is the reliability of having partition creation/removal be part of the database, and being able to run consolidations, cleanups and periodic consistency checks and diagnostics without external dependencies. I wonder: if the scheduler had already existed before autovacuum was implemented, wouldn't autovacuum have been a function executed by the in-core scheduler? - - Lucas
Lucas wrote: > I believe that "in core" may be "installed by default" in case of Those seem like totally orthogonal concepts to me. A feature may be "in core" but not "installed by default" (like many PLs). A feature might not be "in core" but "installed" by many installers (say postgis). It seems like half the people here are arguing for the former concept. It seems the other half are arguing against the latter concept. Is the real need here for a convenient way to enable and/or recommend packagers to install some non-core modules by default?
Ron Mayer <rm_pg@cheapcomplexdevices.com> writes: > Is the real need here for a convenient way to enable and/or > recommend packagers to install some non-core modules by default? It would certainly help us resist assorted requests to put everything including the kitchen sink into core. regards, tom lane
On Sun, Feb 21, 2010 at 12:33 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Ron Mayer <rm_pg@cheapcomplexdevices.com> writes: >> Is the real need here for a convenient way to enable and/or >> recommend packagers to install some non-core modules by default? > > It would certainly help us resist assorted requests to put everything > including the kitchen sink into core. If you don't want people to keep requesting more features in core, you should stop doing such a good job making the functionality that gets put into core awesome. That's partly tongue-in-cheek, but there's some real truth to it. Stuff doesn't go into core unless it just works. And having things in core is appealing because it means they're available everywhere, they work the same way everywhere, and they can be fully managed within the database without a lot of futzing around. Having an extensible system is a good thing and I'm glad we do, but having a rich feature set available in core is also a very good thing for a lot of reasons, at least IMHO. ...Robert
On Sun, Feb 21, 2010 at 10:17 AM, Lucas <lucas75@gmail.com> wrote: > I wonder if the scheduler already existed before the > implementation of the autovacuum, its implementation would > not be a function executed by the in-core scheduler? The real genius of autovacuum is that it works out when there has been enough activity in particular tables that they need to be vacuumed. We might be able to use an in-core scheduler to wake it up every minute to look at the stats, or whatever it is that we do, but that's not all that exciting. ...Robert
Robert Haas <robertmhaas@gmail.com> writes: > On Sun, Feb 21, 2010 at 10:17 AM, Lucas <lucas75@gmail.com> wrote: >> I wonder if the scheduler already existed before the >> implementation of the autovacuum, its implementation would >> not be a function executed by the in-core scheduler? > The real genius of autovacuum is that it works out when there has been > enough activity in particular tables that they need to be vacuumed. > We might be able to use an in-core scheduler to wake it up every > minute to look at the stats, or whatever it is that we do, but that's > not all that exciting. The wake-up-every-N-seconds part of it is actually the weakest part (search the archives for questions about autovacuum_naptime). To my mind, the killer reason why autovac needed to be integrated is so that the system itself could trigger autovac runs in response to threatened XID wraparound conditions. A facility for scheduling user jobs, almost by definition, won't have any system-internal trigger conditions. regards, tom lane
On Sat, 2010-02-20 at 18:19 -0500, Tom Lane wrote: > Dimitri Fontaine <dfontaine@hi-media.com> writes: > > Dave Page <dpage@pgadmin.org> writes: > >> Why not just use pgAgent? It's far more flexible than the design > >> you've suggested, and already exists. > > > What would it take to have it included in core, > > I don't think this really makes sense. There's basically no argument > for having it in core other than "I'm too lazy to install a separate > package". Unlike the case for autovacuum, there isn't anything an > in-core implementation could do that an external one doesn't do as well > or better. So I'm not eager to take on additional maintenance burden > for such a thing. There is currently no way to run a separate daemon process that runs user code as part of Postgres, so that the startup code gets run immediately we startup, re-run if we crash and shut down cleanly when the server does. If there were some way to run arbitrary code in a daemon using an extensibility API then we wouldn't ever get any requests for the scheduler, cos you could write it yourself without troubling anybody here. -- Simon Riggs www.2ndQuadrant.com
On Sun, Feb 21, 2010 at 1:05 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> On Sun, Feb 21, 2010 at 10:17 AM, Lucas <lucas75@gmail.com> wrote: >>> I wonder if the scheduler already existed before the >>> implementation of the autovacuum, its implementation would >>> not be a function executed by the in-core scheduler? > >> The real genius of autovacuum is that it works out when there has been >> enough activity in particular tables that they need to be vacuumed. >> We might be able to use an in-core scheduler to wake it up every >> minute to look at the stats, or whatever it is that we do, but that's >> not all that exciting. > > The wake-up-every-N-seconds part of it is actually the weakest part > (search the archives for questions about autovacuum_naptime). To my > mind, the killer reason why autovac needed to be integrated is so that > the system itself could trigger autovac runs in response to threatened > XID wraparound conditions. A facility for scheduling user jobs, almost > by definition, won't have any system-internal trigger conditions. Right. Without prejudice to my earlier statements that I think this might possibly be a good thing to do anyway, the case for it would be a lot stronger if it provided some genuine additional functionality. ...Robert
On Sun, Feb 21, 2010 at 1:11 PM, Simon Riggs <simon@2ndquadrant.com> wrote: > On Sat, 2010-02-20 at 18:19 -0500, Tom Lane wrote: >> Dimitri Fontaine <dfontaine@hi-media.com> writes: >> > Dave Page <dpage@pgadmin.org> writes: >> >> Why not just use pgAgent? It's far more flexible than the design >> >> you've suggested, and already exists. >> >> > What would it take to have it included in core, >> >> I don't think this really makes sense. There's basically no argument >> for having it in core other than "I'm too lazy to install a separate >> package". Unlike the case for autovacuum, there isn't anything an >> in-core implementation could do that an external one doesn't do as well >> or better. So I'm not eager to take on additional maintenance burden >> for such a thing. > > There is currently no way to run a separate daemon process that runs > user code as part of Postgres, so that the startup code gets run > immediately we startup, re-run if we crash and shut down cleanly when > the server does. Good point. > If there were some way to run arbitrary code in a > daemon using an extensibility API then we wouldn't ever get any requests > for the scheduler, cos you could write it yourself without troubling > anybody here. That might be a little overly optimistic, but I get the point. ...Robert
Simon Riggs <simon@2ndQuadrant.com> writes: > There is currently no way to run a separate daemon process that runs > user code as part of Postgres, so that the startup code gets run > immediately we startup, re-run if we crash and shut down cleanly when > the server does. If there were some way to run arbitrary code in a > daemon using an extensibility API then we wouldn't ever get any requests > for the scheduler, cos you could write it yourself without troubling > anybody here. Please do include the Skytools / PGQ ticker as one use case in the design discussion, and pgbouncer too. Having user daemons as part of the PostgreSQL extensibility would be awesome indeed! Bonus point if you build them with PGXS and install them from SQL, so that the current extension packaging design applies. I guess we can say that the archive and restore commands are precursors of managed user "daemons", or say, integrated processes. So adding them to the use cases to cover would make sense. Regards, -- dim
On Sun, 2010-02-21 at 20:46 +0100, Dimitri Fontaine wrote: > Simon Riggs <simon@2ndQuadrant.com> writes: > > There is currently no way to run a separate daemon process that runs > > user code as part of Postgres, so that the startup code gets run > > immediately we startup, re-run if we crash and shut down cleanly when > > the server does. If there were some way to run arbitrary code in a > > daemon using an extensibility API then we wouldn't ever get any requests > > for the scheduler, cos you could write it yourself without troubling > > anybody here. > > Please do include the Skytools / PGQ ticker as one use case in the > design discussion, and pgbouncer too. Having user daemons as part as the > PostgreSQL extensibility would be awesome indeed! > > Bonus point if you build them with PGXS and install them from SQL, so > that the current extension packaging design applies. > > I guess we can say that the archive and restore command are precursors > of managed user "daemons", or say, integrated processes. So adding them > to the use cases to cover would make sense. Yes, I think so. Rough design... integrated_user_processes = 'x, y, z' would run x(), y() and z() in their own processes. These would execute after startup, or at a consistent point in recovery. The code for these would come from preload_libraries etc. They would not block smart shutdown, though their shutdown sequence might delay it. User code would be executed last at startup and first thing at shutdown. API would be user_process_startup(), user_process_shutdown(). -- Simon Riggs www.2ndQuadrant.com
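The ordering in the rough design above can be sketched in a few lines of Python. This is a toy model, not PostgreSQL code: only the 'x, y, z' setting and the names user_process_startup() and user_process_shutdown() come from the message; the `Supervisor` and `UserProcess` classes, the log, and stopping user processes in reverse order are assumptions made for illustration. Crash restart and recovery are not modelled.

```python
from typing import List


class UserProcess:
    """Hypothetical stand-in for one entry of integrated_user_processes."""

    def __init__(self, name: str, log: List[str]) -> None:
        self.name = name
        self.log = log

    def user_process_startup(self) -> None:
        self.log.append(f"start {self.name}")

    def user_process_shutdown(self) -> None:
        self.log.append(f"stop {self.name}")


class Supervisor:
    """Toy postmaster: core first, user code last; the reverse on shutdown."""

    def __init__(self, setting: str) -> None:
        self.log: List[str] = []
        # Parse a setting such as 'x, y, z' into individual process names.
        names = [n.strip() for n in setting.split(",") if n.strip()]
        self.procs = [UserProcess(n, self.log) for n in names]

    def startup(self) -> None:
        self.log.append("core startup")
        for proc in self.procs:  # user code runs last at startup
            proc.user_process_startup()

    def shutdown(self) -> None:
        for proc in reversed(self.procs):  # user code stops first at shutdown
            proc.user_process_shutdown()
        self.log.append("core shutdown")


def run_lifecycle(setting: str) -> List[str]:
    """Run one startup/shutdown cycle and return the ordered event log."""
    sup = Supervisor(setting)
    sup.startup()
    sup.shutdown()
    return sup.log
```

Calling `run_lifecycle("x, y, z")` returns the event log, which makes it easy to check that user code is bracketed by core startup and core shutdown.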
On Sun, Feb 21, 2010 at 1:11 PM, Simon Riggs <simon@2ndquadrant.com> wrote: [...] >> Dimitri Fontaine <dfontaine@hi-media.com> writes: >> > Dave Page <dpage@pgadmin.org> writes: >> >> Why not just use pgAgent? It's far more flexible than the design >> >> you've suggested, and already exists. >> >> > What would it take to have it included in core, >> [...] > > There is currently no way to run a separate daemon process that runs > user code as part of Postgres, so that the startup code gets run > immediately we startup, re-run if we crash and shut down cleanly when > the server does. If there were some way to run arbitrary code in a > daemon using an extensibility API then we wouldn't ever get any requests > for the scheduler, cos you could write it yourself without troubling > anybody here. > ah! that could get rid of one of my complaints, and then i could just work the rest in pgAgent... so, is this idea (having some user processes be "tied" to postmaster start/stop) going somewhere? it could also help if you have processes LISTENing for NOTIFYs -- Atentamente, Jaime Casanova Soporte y capacitación de PostgreSQL Asesoría y desarrollo de sistemas Guayaquil - Ecuador Cel. +59387171157
Jaime Casanova wrote: > On Sun, Feb 21, 2010 at 1:11 PM, Simon Riggs <simon@2ndquadrant.com> wrote: >> There is currently no way to run a separate daemon process that runs >> user code as part of Postgres, so that the startup code gets run >> immediately we startup, re-run if we crash and shut down cleanly when >> the server does. If there were some way to run arbitrary code in a >> daemon using an extensibility API then we wouldn't ever get any requests >> for the scheduler, cos you could write it yourself without troubling >> anybody here. > > ah! that could get rid of one of my complaints, and then i could just > work the rest in pgAgent... Yeah, seems like a good idea. Slon daemon and similar daemons could also use it. > so, is this idea (having some user processes be "tied" to postmaster > start/stop) going to somewhere? I've added this to the TODO list. Now we just need someone to write it. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
2010/2/22 Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>: > Jaime Casanova wrote: >> On Sun, Feb 21, 2010 at 1:11 PM, Simon Riggs <simon@2ndquadrant.com> wrote: >>> There is currently no way to run a separate daemon process that runs >>> user code as part of Postgres, so that the startup code gets run >>> immediately we startup, re-run if we crash and shut down cleanly when >>> the server does. If there were some way to run arbitrary code in a >>> daemon using an extensibility API then we wouldn't ever get any requests >>> for the scheduler, cos you could write it yourself without troubling >>> anybody here. >> >> ah! that could get rid of one of my complaints, and then i could just >> work the rest in pgAgent... > > Yeah, seems like a good idea. Slon daemon and similar daemons could also > use it. > I like it. I thought about some workflow system integrated with scheduler. Regards Pavel >> so, is this idea (having some user processes be "tied" to postmaster >> start/stop) going to somewhere? > > I've added this to the TODO list. Now we just need someone to write it. > > -- > Heikki Linnakangas > EnterpriseDB http://www.enterprisedb.com > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers >
On Sat, Feb 20, 2010 at 8:06 PM, Joshua D. Drake <jd@commandprompt.com> wrote: > > That doesn't mean it isn't a really good idea. It would be nice to have > a comprehensive job scheduling solution that allows me to continue > abstract away from external solutions and operating system dependencies. +1! A scheduler is an extremely common thing to have to integrate with the database. All of our commercial competitors have them, and they are heavily used. Like I noted above, what people want to schedule is going to be stored procedures. Having both would virtually eliminate the need for scripting outside the database, which is a pretty big deal since external scripts are a real pain to keep cross platform. Since there's probably a lot of overlapping problems in those two features, why not tackle both at once? merlin
Merlin Moncure wrote: > Like I noted above, what people want to schedule is going to be stored > procedures. Having both would virtually eliminate the need for > scripting outside the database, which is a pretty big deal since > external scripts are a real pain to keep cross platform. Since > there's probably a lot of overlapping problems in those two features, > why not tackle both at once? Divide and conquer? -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
On Mon, Feb 22, 2010 at 2:29 PM, Alvaro Herrera <alvherre@commandprompt.com> wrote: > Merlin Moncure wrote: > >> Like I noted above, what people want to schedule is going to be stored >> procedures. Having both would virtually eliminate the need for >> scripting outside the database, which is a pretty big deal since >> external scripts are a real pain to keep cross platform. Since >> there's probably a lot of overlapping problems in those two features, >> why not tackle both at once? > > Divide and conquer? By 'tackle', I meant more of a 'come to an understanding' thing. Normally I would agree with you anyways, but I think what most people would want to schedule would be stored procedures (sorry to continually repeat myself here, but I really think this should be critical to any scheduling proposal), not functions or ad hoc scripts. merlin
On Sat, Feb 20, 2010 at 4:41 PM, Merlin Moncure <mmoncure@gmail.com> wrote: > IMNSHO, an 'in core' scheduler would be useful. however, I think > before you tackle a scheduler, we need proper stored procedures. Our > existing functions don't cut it because you can manage the transaction > state yourself. Did you mean that you "can't" manage the transaction state yourself? Has anyone given any thought to what would be required to relax this restriction? Is this totally impossible given our architecture, or just a lack of round tuits? See also: http://www.postgresql.org/docs/current/static/plpgsql-porting.html#PLPGSQL-PORTING-EXCEPTIONS ...Robert
2010/3/1 Robert Haas <robertmhaas@gmail.com>: > On Sat, Feb 20, 2010 at 4:41 PM, Merlin Moncure <mmoncure@gmail.com> wrote: >> IMNSHO, an 'in core' scheduler would be useful. however, I think >> before you tackle a scheduler, we need proper stored procedures. Our >> existing functions don't cut it because you can manage the transaction >> state yourself. > > Did you mean that you "can't" manage the transaction state yourself? > > Has anyone given any thought to what would be required to relax this > restriction? Is this totally impossible given our architecture, or > just a lack of round tuits? I think it is a very hard restriction that follows from how we use the SPI interface and from its architecture. Our stored procedures are executed inside a single SELECT statement, which is the reason for the limit: there cannot be two or more outer transactions. Other implementations run the procedure at a different place, closer to the top of the pipeline. Pavel > > See also: http://www.postgresql.org/docs/current/static/plpgsql-porting.html#PLPGSQL-PORTING-EXCEPTIONS > > ...Robert
On Mon, Mar 1, 2010 at 4:43 PM, Robert Haas <robertmhaas@gmail.com> wrote: > On Sat, Feb 20, 2010 at 4:41 PM, Merlin Moncure <mmoncure@gmail.com> wrote: >> IMNSHO, an 'in core' scheduler would be useful. however, I think >> before you tackle a scheduler, we need proper stored procedures. Our >> existing functions don't cut it because you can manage the transaction >> state yourself. > > Did you mean that you "can't" manage the transaction state yourself? > > Has anyone given any thought to what would be required to relax this > restriction? Is this totally impossible given our architecture, or > just a lack of round tuits? yeah...that's what I meant. plpgsql exceptions are no help because there are many cases where you simply don't want the whole sequence of operations to run in a single transaction. loading lots of data to many tables is one. any operation that depends on transaction commit to do something (like notifications) and then hook on the results is another. you always have the heavy hitting administrative functions like vacuum, etc. another case is if you want a procedure to simply run forever...trivially done in a procedure, impossible in a function. The way people do this stuff now is to involve 1) an external scheduler such as cron and 2) .sql scripts for relatively simple things and/or an external scripting language like bash/perl. The external scheduler has a couple of annoying issues...completely not portable to code against and scheduling with sub-minute accuracy is a big headache. Also, adjusting the scheduling based on database events is, while not impossible, more difficult than it should be. External .sql scripts are portable but extremely limited. Involving something like perl just so I can jump outside the database to do manual transaction management is fine but ISTM these types of things are much better when done inside the database IMNSHO.
Another factor here is that a sizable percentage of our user base is bargain hunters coming in from other systems like oracle and ms sql and having to rely on the o/s scheduler is very distasteful to them. It's a hole, one of the last remaining IMO, in postgres being able to provide a complete server side development environment without having to deal with the o/s at all. I stand by my statements earlier. Any moderate level and up complexity database has all kinds of scheduling and scripting going on supporting it. These things really should be part of the database, dump with it, and run in a regular way regardless of platform and server environment etc. With that, 90% of the code I have to write outside of the database goes away. merlin
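As an illustration of the "jump outside the database to do manual transaction management" pattern described in this message, here is a minimal Python sketch of a bulk load that commits in batches rather than in one big transaction. It uses the standard-library sqlite3 module with an in-memory database purely as a stand-in so the example is self-contained; the `items` table, the `load_in_batches` function, and the batch size are assumptions for the example, not anything from PostgreSQL or from the thread.

```python
import sqlite3
from typing import Iterable, Tuple


def load_in_batches(conn: sqlite3.Connection,
                    rows: Iterable[Tuple[int, str]],
                    batch_size: int) -> int:
    """Insert rows, committing after every batch_size rows.

    Returns the number of commits issued.
    """
    commits = 0
    pending = 0
    for row in rows:
        conn.execute("INSERT INTO items (id, label) VALUES (?, ?)", row)
        pending += 1
        if pending == batch_size:
            conn.commit()  # each batch is committed on its own
            commits += 1
            pending = 0
    if pending:
        conn.commit()  # commit the final, partial batch
        commits += 1
    return commits


def demo() -> Tuple[int, int]:
    """Load 10 rows in batches of 4; return (commit count, row count)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, label TEXT)")
    conn.commit()
    rows = [(i, f"row {i}") for i in range(10)]
    commits = load_in_batches(conn, rows, batch_size=4)
    count = conn.execute("SELECT COUNT(*) FROM items").fetchone()[0]
    conn.close()
    return commits, count
```

The point of the sketch is that the commit boundaries are chosen by the client-side driver script, which is the piece a function running inside a single transaction cannot do for itself.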
Robert Haas <robertmhaas@gmail.com> writes: > On Sat, Feb 20, 2010 at 4:41 PM, Merlin Moncure <mmoncure@gmail.com> wrote: >> IMNSHO, an 'in core' scheduler would be useful. however, I think >> before you tackle a scheduler, we need proper stored procedures. Our >> existing functions don't cut it because you can manage the transaction >> state yourself. > Did you mean that you "can't" manage the transaction state yourself? > Has anyone given any thought to what would be required to relax this > restriction? Is this totally impossible given our architecture, or > just a lack of round tuits? There is lots and lots of discussion of that in the archives. It's fundamentally impossible for PL functions done in the current style to start or commit transactions, unless you resort to dblink-style kluges. What's been discussed is some sort of structure that would allow a chunk of PL code to execute "outside" a transaction and thus issue its own begin and commit commands. This idea is what Merlin is calling a stored procedure, though personally I dislike that terminology. Anyway, nothing's got past the arm-waving stage as yet. regards, tom lane