Thread: scheduler in core

scheduler in core

From
Jaime Casanova
Date:
Hi,

I'm trying to figure out how difficult this is.

What we need:
- a shared catalog
- an API for filling the catalog
- a scheduler daemon
- pg_dump support


A shared catalog
-------------------------
Why shared? Obviously because we don't want to scan every database's
pg_job each time the daemon wakes up.
Maybe something like:

pg_job (
   oid            -- use the oid as pk
   jobname
   jobdatoid      -- job database oid
   jobowner       -- for permission checking
   jobstarttime   -- year to minute
   jobfrequency   -- an interval?
   jobnexttime or joblasttime
   jobtype        -- if we are going to allow plain sql or
                  -- executable/shell job types
   jobexecute or jobscript
)
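
A minimal sketch of how the columns above could fit together, assuming jobfrequency is a fixed positive interval and jobnexttime is derived from jobstarttime. The PgJob class and next_run_time() are hypothetical names used only for illustration; nothing like them exists in PostgreSQL.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class PgJob:
    """Hypothetical in-memory mirror of one proposed pg_job row."""
    oid: int
    jobname: str
    jobdatoid: int            # oid of the database the job runs in
    jobowner: int             # role oid, for permission checking
    jobstarttime: datetime    # year to minute
    jobfrequency: timedelta   # assumed to be a fixed, positive interval
    jobtype: str              # assumed encoding: "sql" or "shell"
    jobexecute: str


def next_run_time(job: PgJob, now: datetime) -> datetime:
    """Return the first scheduled time strictly after `now`."""
    if now < job.jobstarttime:
        return job.jobstarttime
    # Count the whole periods already elapsed, then step to the next one.
    periods = (now - job.jobstarttime) // job.jobfrequency + 1
    return job.jobstarttime + periods * job.jobfrequency
```

Storing only jobstarttime and jobfrequency and deriving the next run keeps the schedule from drifting when a job starts late; a real catalog might still cache jobnexttime so the daemon can sort on it.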

comments about the catalog?


An API for filling the catalog
-----------------------------------------
Do we want a CREATE JOB SQL syntax? FWIW, Oracle uses functions to
create/remove jobs.


A scheduler daemon
--------------------------------
I think we can use 8.3's autovacuum daemon as a reference for this...
AFAIK, it's a child of the postmaster that sleeps for $naptime, then
looks for something to do (it also looks in a catalog) and sends a
worker to do it.
That's what we need to do, but...

For the $naptime I think we can autoconfigure it: when we execute a
job, look for the next job in the queue and sleep
until it is time to execute it.
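
The self-adjusting $naptime can be sketched as follows. This is only an illustration, assuming the pending jobs are kept in a min-heap of (run_at, jobname) pairs; pop_due_jobs(), naptime() and the 60-second idle default are hypothetical, not part of autovacuum or any existing daemon.

```python
import heapq
from datetime import datetime, timedelta


def pop_due_jobs(queue, now):
    """Remove and return every (run_at, jobname) entry whose time has arrived."""
    due = []
    while queue and queue[0][0] <= now:
        due.append(heapq.heappop(queue))
    return due


def naptime(queue, now, idle=timedelta(seconds=60)):
    """How long to sleep: until the earliest pending job, or `idle` if none."""
    if not queue:
        return idle
    # Never return a negative nap when a job is already overdue.
    return max(queue[0][0] - now, timedelta(0))
```

The daemon loop would then alternate between pop_due_jobs(), handing each due job to a worker, and sleeping for naptime(). A new job inserted while the daemon sleeps would need some way to wake it early, which this sketch does not cover.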

I don't think we need a max_worker parameter; it should launch as many
workers as it needs.


pg_dump support
--------------------------
Dump every entry of the pg_job catalog as a CREATE JOB SQL statement
or a create_job() function call, depending
on what we decide.
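
For the function-based variant, the dump output could be produced roughly like this. It is a sketch only: create_job() and its five text arguments are hypothetical (no such function exists), and the quoting assumes standard_conforming_strings is on, so only single quotes need doubling.

```python
def quote_literal(value: str) -> str:
    """Quote a string as an SQL literal by doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def dump_job(jobname, dbname, starttime, frequency, command):
    """Render one job as a call to the hypothetical create_job() function."""
    args = (jobname, dbname, starttime, frequency, command)
    return "SELECT create_job({});".format(", ".join(quote_literal(a) for a in args))


if __name__ == "__main__":
    print(dump_job("nightly", "sales", "2010-02-21 03:00", "1 day",
                   "SELECT refresh('daily_totals')"))
```

Emitting the database by name rather than by oid matters here, because oids are not stable across a dump and restore.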

ideas? comments?

--
Atentamente,
Jaime Casanova
Soporte y capacitación de PostgreSQL
Asesoría y desarrollo de sistemas
Guayaquil - Ecuador
Cel. +59387171157


Re: scheduler in core

From
Dave Page
Date:
On Sat, Feb 20, 2010 at 9:33 PM, Jaime Casanova
<jcasanov@systemguards.com.ec> wrote:
> Hi,
>
> I'm trying to figure out how difficult is this

Why not just use pgAgent? It's far more flexible than the design
you've suggested, and already exists.

-- 
Dave Page
EnterpriseDB UK: http://www.enterprisedb.com


Re: scheduler in core

From
Merlin Moncure
Date:
On Sat, Feb 20, 2010 at 4:33 PM, Jaime Casanova
<jcasanov@systemguards.com.ec> wrote:
> Hi,
>
> I'm trying to figure out how difficult is this
>
> What we need:
> - a shared catalog
> - an API for filling the catalog
> - a scheduler daemon
> - pg_dump support
>
>
> A shared catalog
> -------------------------
> Why shared? obviously because we don't want to scan all database's
> pg_job every time the daemon wake up.
> Maybe something like:
>
> pg_job (
>    oid                -- use the oid as pk
>    jobname
>    jobdatoid       -- job database oid
>    jobowner       -- for permission's checking
>    jobstarttime   -- year to minute
>    jobfrequency  -- an interval?
>    jobnexttime or joblasttime
>    jobtype          -- if we are going to allow plain sql or
> executable/shell job types
>    jobexecute or jobscript
> )
>
> comments about the catalog?
>
>
> An API for filling the catalog
> -----------------------------------------
> do we want a CREATE JOB SQL synatx? FWIW, Oracle uses functions to
> create/remove jobs.
>
>
> An scheduler daemon
> --------------------------------
> I think we can use 8.3's autovacuum daemon as a reference for this...
> AFAIK, it's a child of postmaster that sleep for $naptime and then
> looks for something to do (it also looks in a
> catalog) and the send a worker to do it
> that's what we need to do but...
>
> for the $naptime i think we can autoconfigure it, when we execute a
> job look for the next job in queue and sleep
> until we are going to reach the time to execute it
>
> i don't think we need a max_worker parameter, it should launch as many
> workers as it needs
>
>
> pg_dump support
> --------------------------
> dump every entry of the pg_job catalog as a CREATE JOB SQL statement
> or a create_job() function depending
> on what we decided
>
> ideas? comments?

IMNSHO, an 'in core' scheduler would be useful. however, I think
before you tackle a scheduler, we need proper stored procedures.  Our
existing functions don't cut it because you can't manage the transaction
state yourself.

merlin


Re: scheduler in core

From
Dimitri Fontaine
Date:
Dave Page <dpage@pgadmin.org> writes:
> Why not just use pgAgent? It's far more flexible than the design
> you've suggested, and already exists.

What would it take to have it included in core, so that it's not a
separate install to do? I'd love to have some support for running my
maintenance pl functions directly from the database. I mean without
installing, running and monitoring another (set of) process.

Main advantage over cron or another scheduler being that it'd be part of
my transactional backups, of course.

Use cases, in case it's needed already, include creating new partitions,
materializing views at known intervals, more general maintenance like
vacuum and clusters operations, some reporting that could be done in the
database itself, etc.

Regards,
-- 
dim


Re: scheduler in core

From
Pavel Stehule
Date:
>
> pg_job (
>    oid                -- use the oid as pk
>    jobname
>    jobdatoid       -- job database oid
>    jobowner       -- for permission's checking
>    jobstarttime   -- year to minute
>    jobfrequency  -- an interval?
>    jobnexttime or joblasttime
>    jobtype          -- if we are going to allow plain sql or
> executable/shell job types
>    jobexecute or jobscript
> )
>
> comments about the catalog?
>

+ success_action
+ failure_action


>
> An API for filling the catalog
> -----------------------------------------
> do we want a CREATE JOB SQL synatx? FWIW, Oracle uses functions to
> create/remove jobs.
>
>
> An scheduler daemon
> --------------------------------
> I think we can use 8.3's autovacuum daemon as a reference for this...
> AFAIK, it's a child of postmaster that sleep for $naptime and then
> looks for something to do (it also looks in a
> catalog) and the send a worker to do it
> that's what we need to do but...
>
> for the $naptime i think we can autoconfigure it, when we execute a
> job look for the next job in queue and sleep
> until we are going to reach the time to execute it
>
> i don't think we need a max_worker parameter, it should launch as many
> workers as it needs
>
>
> pg_dump support
> --------------------------
> dump every entry of the pg_job catalog as a CREATE JOB SQL statement
> or a create_job() function depending
> on what we decided
>
> ideas? comments?
>
> --
> Atentamente,
> Jaime Casanova
> Soporte y capacitación de PostgreSQL
> Asesoría y desarrollo de sistemas
> Guayaquil - Ecuador
> Cel. +59387171157
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>


Re: scheduler in core

From
Greg Stark
Date:
On Sat, Feb 20, 2010 at 10:03 PM, Dimitri Fontaine
<dfontaine@hi-media.com> wrote:
> What would it take to have it included in core, so that it's not a
> separate install to do? I'd love to have some support for running my
> maintenance pl functions directly from the database. I mean without
> installing, running and monitoring another (set of) process.

It'll always be another (set of) processes even if it's "in core". All
it means to be "in core" is that it will be harder to make
modifications and you'll be tied to the Postgres release cycle.

> Main advantage over cron or another scheduler being that it'd be part of
> my transactional backups, of course.

All you need for that is to store the schedule in a database table.
This has nothing to do with where the scheduler code lives.



-- 
greg


Re: scheduler in core

From
Tom Lane
Date:
Dimitri Fontaine <dfontaine@hi-media.com> writes:
> Dave Page <dpage@pgadmin.org> writes:
>> Why not just use pgAgent? It's far more flexible than the design
>> you've suggested, and already exists.

> What would it take to have it included in core,

I don't think this really makes sense.  There's basically no argument
for having it in core other than "I'm too lazy to install a separate
package".  Unlike the case for autovacuum, there isn't anything an
in-core implementation could do that an external one doesn't do as well
or better.  So I'm not eager to take on additional maintenance burden
for such a thing.
        regards, tom lane


Re: scheduler in core

From
Lucas
Date:
Tom,

I believe that "in core" may be "installed by default" in case of the pgAgent or similar solution...

Many big companies do not allow developers to configure and install components... we need to request everything in 10 copies of forms...

Making it "in core" or "installed by default" means that we have more chance that the db scheduler would be widely accepted... 

And more important... we would not have to check its availability on the setup and provide an alternate scheduler if the database scheduler is off...

I believe that a database scheduler would allow me to drop 20 thousand lines of java code in my server...



2010/2/20 Tom Lane <tgl@sss.pgh.pa.us>
Dimitri Fontaine <dfontaine@hi-media.com> writes:
> Dave Page <dpage@pgadmin.org> writes:
>> Why not just use pgAgent? It's far more flexible than the design
>> you've suggested, and already exists.

> What would it take to have it included in core,

I don't think this really makes sense.  There's basically no argument
for having it in core other than "I'm too lazy to install a separate
package".  Unlike the case for autovacuum, there isn't anything an
in-core implementation could do that an external one doesn't do as well
or better.  So I'm not eager to take on additional maintenance burden
for such a thing.

                       regards, tom lane




--
Lucas

Re: scheduler in core

From
Jaime Casanova
Date:
On Sat, Feb 20, 2010 at 4:37 PM, Dave Page <dpage@pgadmin.org> wrote:
> On Sat, Feb 20, 2010 at 9:33 PM, Jaime Casanova
> <jcasanov@systemguards.com.ec> wrote:
>> Hi,
>>
>> I'm trying to figure out how difficult is this
>
> Why not just use pgAgent? It's far more flexible than the design
> you've suggested, and already exists.
>

- it's not that easy if you don't have pgadmin
- i need to back up the postgres database to back up the schedules
- pgagent is not widely used here, but the few people i know who
tried it gave up because they
said it "did not always execute the jobs"... i don't have any real evidence
of that, and probably what happened
was that the pgagent daemon wasn't running (error prone), but having it
started by the postmaster would get rid of that
problem...


The first one could be solved with a set of functions in pgagent and
clear docs...
i can live with the other two to some degree... but getting rid of
the third one would be nice :)

--
Atentamente,
Jaime Casanova
Soporte y capacitación de PostgreSQL
Asesoría y desarrollo de sistemas
Guayaquil - Ecuador
Cel. +59387171157


Re: scheduler in core

From
Dave Page
Date:
On Sun, Feb 21, 2010 at 12:03 AM, Jaime Casanova
<jcasanov@systemguards.com.ec> wrote:
> On Sat, Feb 20, 2010 at 4:37 PM, Dave Page <dpage@pgadmin.org> wrote:
>> On Sat, Feb 20, 2010 at 9:33 PM, Jaime Casanova
>> <jcasanov@systemguards.com.ec> wrote:
>>> Hi,
>>>
>>> I'm trying to figure out how difficult is this
>>
>> Why not just use pgAgent? It's far more flexible than the design
>> you've suggested, and already exists.
>>
>
> - it's not that easy if you don't have pgadmin

That's easily changed. EDB's Advanced Server emulates Oracle's DBMS_JOB
interface with it, for example.

> - i need to backup postgres database to backup the schedules

Only if you put the control schema in that database. If you don't want
to do that, stick it somewhere else. With your proposed scheme, you'd
probably have to use pg_dumpall --globals-only.

> - the use pgagent here is not very extended but the few a know have
> tried desisted because they
> said: "not always executed the jobs"... i don't have any real evidence
> of that and probably what happens
> was that the pgagent daemon wasn't working (error prone), but being it
> started by the postmaster get rid of that
> problem...

No one has ever reported such a bug that I'm aware of.


-- 
Dave Page
EnterpriseDB UK: http://www.enterprisedb.com


Re: scheduler in core

From
Dave Page
Date:
On Sat, Feb 20, 2010 at 11:55 PM, Lucas <lucas75@gmail.com> wrote:

> I believe that a database scheduler would allow me to drop 20 thousand lines
> of java code in my server...

How does that work? If you don't have a scheduler in the database, or
pgAgent, why aren't you using cron or Windows task scheduler, neither
of which would require 20K lines of Java code.

-- 
Dave Page
EnterpriseDB UK: http://www.enterprisedb.com


Re: scheduler in core

From
Jaime Casanova
Date:
On Sat, Feb 20, 2010 at 7:32 PM, Dave Page <dpage@pgadmin.org> wrote:
> On Sun, Feb 21, 2010 at 12:03 AM, Jaime Casanova
> <jcasanov@systemguards.com.ec> wrote:
>> On Sat, Feb 20, 2010 at 4:37 PM, Dave Page <dpage@pgadmin.org> wrote:
>>> On Sat, Feb 20, 2010 at 9:33 PM, Jaime Casanova
>>> <jcasanov@systemguards.com.ec> wrote:
>>>> Hi,
>>>>
>>>> I'm trying to figure out how difficult is this
>>>
>>> Why not just use pgAgent? It's far more flexible than the design
>>> you've suggested, and already exists.
>>>
>>
>> - it's not that easy if you don't have pgadmin
>
> That's easily changed. EDB's Advanced Server emulates Oracles DBMS_JOB
> interface with it for example.
>

maybe i can work on that, then

--
Atentamente,
Jaime Casanova
Soporte y capacitación de PostgreSQL
Asesoría y desarrollo de sistemas
Guayaquil - Ecuador
Cel. +59387171157


Re: scheduler in core

From
Dave Page
Date:
On Sat, Feb 20, 2010 at 10:03 PM, Dimitri Fontaine
<dfontaine@hi-media.com> wrote:
> Dave Page <dpage@pgadmin.org> writes:
>> Why not just use pgAgent? It's far more flexible than the design
>> you've suggested, and already exists.
>
> What would it take to have it included in core, so that it's not a
> separate install to do? I'd love to have some support for running my
> maintenance pl functions directly from the database. I mean without
> installing, running and monitoring another (set of) process.

It's currently written in C++ and PL/pgSQL and uses wxWidgets, all of
which could be changed with a little work. Having it in core will
almost certainly result in reduced functionality though - there are
use cases in which you may have multiple agents running against one
control database, or executing jobs on remote databases for example.

We originally wrote the code such that it might be easily included in
core in the future, but every time this topic comes up in -hackers,
there are a significant number of people who don't think a scheduler
should be tied to the core code so we stopped assuming it ever would
be.

-- 
Dave Page
EnterpriseDB UK: http://www.enterprisedb.com


Re: scheduler in core

From
Andrew Dunstan
Date:

Lucas wrote:
> Tom,
>
>     I believe that "in core" may be "installed by default" in case of
>     the pgAgent or similar solution...
>
>     Many big companies does not allow the developers to configure and
>     install components.... we need to request everthing in 10 copies
>     of forms...
>
>     By making it "in core" or "installed by default" means that we
>     have more chance that the db scheduler would be widely accepted... 
>
>

This reasoning just doesn't fly in the PostgreSQL world. PostgreSQL is 
designed to be extensible, not a monolithic product. We're not going to 
change that because some companies have insane corporate policies.  The 
answer, as Jefferson said in another context, is to "inform their 
ignorance."

That isn't to say that there isn't a case for an in core scheduler, but 
this at least isn't a good reason for it.

cheers

andrew


Re: scheduler in core

From
Dave Page
Date:
On Sun, Feb 21, 2010 at 12:38 AM, Jaime Casanova
<jcasanov@systemguards.com.ec> wrote:
> On Sat, Feb 20, 2010 at 7:32 PM, Dave Page <dpage@pgadmin.org> wrote:
>> On Sun, Feb 21, 2010 at 12:03 AM, Jaime Casanova
>> <jcasanov@systemguards.com.ec> wrote:
>>> On Sat, Feb 20, 2010 at 4:37 PM, Dave Page <dpage@pgadmin.org> wrote:
>>>> On Sat, Feb 20, 2010 at 9:33 PM, Jaime Casanova
>>>> <jcasanov@systemguards.com.ec> wrote:
>>>>> Hi,
>>>>>
>>>>> I'm trying to figure out how difficult is this
>>>>
>>>> Why not just use pgAgent? It's far more flexible than the design
>>>> you've suggested, and already exists.
>>>>
>>>
>>> - it's not that easy if you don't have pgadmin
>>
>> That's easily changed. EDB's Advanced Server emulates Oracles DBMS_JOB
>> interface with it for example.
>>
>
> maybe i can work on that, then

I'd love to add a management API to pgAgent if you'd like to work on it.



-- 
Dave Page
EnterpriseDB UK: http://www.enterprisedb.com


Re: scheduler in core

From
"Joshua D. Drake"
Date:
On Sat, 2010-02-20 at 18:19 -0500, Tom Lane wrote:
> Dimitri Fontaine <dfontaine@hi-media.com> writes:
> > Dave Page <dpage@pgadmin.org> writes:
> >> Why not just use pgAgent? It's far more flexible than the design
> >> you've suggested, and already exists.
>
> > What would it take to have it included in core,
>
> I don't think this really makes sense.  There's basically no argument
> for having it in core other than "I'm too lazy to install a separate
> package".  Unlike the case for autovacuum, there isn't anything an
> in-core implementation could do that an external one doesn't do as well
> or better.  So I'm not eager to take on additional maintenance burden
> for such a thing.

There is zero technical reason for this to be in core.

That doesn't mean it isn't a really good idea. It would be nice to have
a comprehensive job scheduling solution that allows me to continue
abstract away from external solutions and operating system dependencies.

Joshua D. Drake

>
>             regards, tom lane
>


--
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 503.667.4564
Consulting, Training, Support, Custom Development, Engineering
Respect is earned, not gained through arbitrary and repetitive use or Mr. or Sir.

Re: scheduler in core

From
Robert Haas
Date:
On Feb 20, 2010, at 8:06 PM, "Joshua D. Drake" <jd@commandprompt.com>
wrote:
> There is zero technical reason for this to be in core.
>
> That doesn't mean it isn't a really good idea. It would be nice to
> have
> a comprehensive job scheduling solution that allows me to continue
> abstract away from external solutions and operating system
> dependencies.

Well put.  That pretty much sums up my feelings on this perfectly.

...Robert

Re: scheduler in core

From
Jaime Casanova
Date:
Ah! wxWidgets... Yes, I knew there was something I didn't like about
pgAgent. So it is not as simple as just installing it.

2010/2/20, Dave Page <dpage@pgadmin.org>:
> On Sat, Feb 20, 2010 at 10:03 PM, Dimitri Fontaine
> <dfontaine@hi-media.com> wrote:
>> Dave Page <dpage@pgadmin.org> writes:
>>> Why not just use pgAgent? It's far more flexible than the design
>>> you've suggested, and already exists.
>>
>> What would it take to have it included in core, so that it's not a
>> separate install to do? I'd love to have some support for running my
>> maintenance pl functions directly from the database. I mean without
>> installing, running and monitoring another (set of) process.
>
> It's currently written in C++/pl/pgsql and uses wxWidgets, none of
> which couldn't be changed with a little work. Having it in core will
> almost certainly result in reduced functionality though - there are
> use cases in which you may have multiple agents running against one
> control database, or executing jobs on remote databases for example.
>
> We originally wrote the code such that it might be easily included in
> core in the future, but every time this topic comes up in -hackers,
> there are a significant number of people who don't think a scheduler
> should be tied to the core code so we stopped assuming it ever would
> be.
>
> --
> Dave Page
> EnterpriseDB UK: http://www.enterprisedb.com
>

--
Enviado desde mi dispositivo móvil

Atentamente,
Jaime Casanova
Soporte y capacitación de PostgreSQL
Asesoría y desarrollo de sistemas
Guayaquil - Ecuador
Cel. +59387171157


Re: scheduler in core

From
Pavel Stehule
Date:
2010/2/21 Andrew Dunstan <andrew@dunslane.net>:
>
>
> Lucas wrote:
>>
>> Tom,
>>
>>    I believe that "in core" may be "installed by default" in case of
>>    the pgAgent or similar solution...
>>
>>    Many big companies does not allow the developers to configure and
>>    install components.... we need to request everthing in 10 copies
>>    of forms...
>>
>>    By making it "in core" or "installed by default" means that we
>>    have more chance that the db scheduler would be widely accepted...
>>
>
> This reasoning just doesn't fly in the PostgreSQL world. PostgreSQL is
> designed to be extensible, not a monolithic product. We're not going to
> change that because some companies have insane corporate policies.  The
> answer, as Jefferson said in another context, is to "inform their
> ignorance."
>
> That isn't to say that there isn't a case for an in core scheduler, but this
> at least isn't a good reason for it.

As I remember it, this is exactly the same discussion we had about
replication three years ago.

First strategy: we don't need it in core.
Next: we were last with replication.

Regards
Pavel  Stehule

>
> cheers
>
> andrew


Re: scheduler in core

From
Dimitri Fontaine
Date:
"Joshua D. Drake" <jd@commandprompt.com> writes:
> On Sat, 2010-02-20 at 18:19 -0500, Tom Lane wrote:
>> Dimitri Fontaine <dfontaine@hi-media.com> writes:
>> > What would it take to have it included in core,
>> 
>> I don't think this really makes sense.  There's basically no argument
>> for having it in core other than "I'm too lazy to install a separate
>> package".  Unlike the case for autovacuum, there isn't anything an
>> in-core implementation could do that an external one doesn't do as well
>> or better.  So I'm not eager to take on additional maintenance burden
>> for such a thing.
>
> There is zero technical reason for this to be in core.
>
> That doesn't mean it isn't a really good idea. It would be nice to have
> a comprehensive job scheduling solution that allows me to continue
> abstract away from external solutions and operating system dependencies.

Maybe what we need, on the technical level, is a way to distribute this
code with the main product but without draining too much effort from
core members there. Like we do with contribs I guess, but on a larger
scale.

I guess git submodules, PGAN, extensions and all that jazz are going to
help. Meanwhile I'll have to learn enough of pgAgent to figure out how
much it's tied to pgadmin, and we'll have to make those other facilities
something real.

Regards,
-- 
dim


Re: scheduler in core

From
Dimitri Fontaine
Date:
Greg Stark <gsstark@mit.edu> writes:
> It'll always be another (set of) processes even if it's "in core". All
> it means to be "in core" is that it will be harder to make
> modifications and you'll be tied to the Postgres release cycle.

Another set of processes all right, but one that the postmaster is
responsible for, and that it starts and ends at the right time.

>> Main advantage over cron or another scheduler being that it'd be part of
>> my transactional backups, of course.
>
> All you need for that is to store the schedule in a database table.
> This has nothing to do with where the scheduler code lives.

Not true. You need custom scripts that will read what's in this database
table and run it at the right time, take care of running more than one
job at the same time when necessary, report what the outcome was
somewhere, etc.

The simplest would be a query that writes out in cron format the setup
you've made in the database, I suppose. When do you run that query? You
need an untrusted trigger? What happens if your query or script writes a
file cron will not be able to read, or on a server where cron is not
running?
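
To make the "query that writes out cron format" idea concrete, here is a minimal sketch of the rendering step, assuming each job row carries a minute, an hour, a database name and an SQL command and runs daily. cron_line() is a hypothetical helper; psql's -d and -c options and the crontab rule that an unescaped % ends the command are the only real interfaces relied on.

```python
import shlex


def cron_line(minute: int, hour: int, dbname: str, sql: str) -> str:
    """Render one daily job as a crontab entry that runs the SQL through psql."""
    command = "psql -d {} -c {}".format(shlex.quote(dbname), shlex.quote(sql))
    # In a crontab, an unescaped % is turned into a newline, so escape it.
    command = command.replace("%", r"\%")
    return "{} {} * * * {}".format(minute, hour, command)


if __name__ == "__main__":
    print(cron_line(0, 3, "sales", "SELECT create_partitions()"))
```

Even with this in place, something still has to run the export whenever the table changes and install the result where cron will read it, which is the fragile part described above.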

I'm not saying this is any harder than other sysadmin stuff we have to
do to operate the systems, just that it seems it would be simpler,
easier and less error prone to be able to schedule database maintenance
from within the database itself, in such a way that the classic dump and
restore process restores the maintenance scripts too.

That would allow for automatic creation of partitions in dev and
pre-prod environments where you install more than one copy of the same
database at once, but would like to avoid maintaining one set of cron
entries per copy.

As Tom said, technically, it's obviously possible not to depend on a
PostgreSQL integrated scheduler. As JD said, it still is a pretty good
idea to provide one in core.

Regards,
-- 
dim


Re: scheduler in core

From
Andrew Dunstan
Date:

Pavel Stehule wrote:
>> This reasoning just doesn't fly in the PostgreSQL world. PostgreSQL is
>> designed to be extensible, not a monolithic product. We're not going to
>> change that because some companies have insane corporate policies.  The
>> answer, as Jefferson said in another context, is to "inform their
>> ignorance."
>>
>> That isn't to say that there isn't a case for an in core scheduler, but this
>> at least isn't a good reason for it.
>>     
>
> As I remember it, this is exactly the same discussion we had about
> replication three years ago.
>
> First strategy: we don't need it in core.
> Next: we were last with replication.
>
>   

That's a pretty poor analogy IMNSHO. There are very good technical 
reasons to have replication in the core. That is much less clear for a 
scheduler. But in any case, I didn't say that we shouldn't have a 
scheduler. I specifically said there might be a case for it - read the 
first clause of my last sentence. What I said was that the reason given, 
namely that Corporations didn't want to use add-on modules, was not a 
good reason.

cheers

andrew


Re: scheduler in core

From
Bruce Momjian
Date:
Pavel Stehule wrote:
> 2010/2/21 Andrew Dunstan <andrew@dunslane.net>:
> >>    I believe that "in core" may be "installed by default" in case of
> >>    the pgAgent or similar solution...
> >>
> >>    Many big companies does not allow the developers to configure and
> >>    install components.... we need to request everthing in 10 copies
> >>    of forms...
> >>
> >>    By making it "in core" or "installed by default" means that we
> >>    have more chance that the db scheduler would be widely accepted...
> >>
> >
> > This reasoning just doesn't fly in the PostgreSQL world. PostgreSQL is
> > designed to be extensible, not a monolithic product. We're not going to
> > change that because some companies have insane corporate policies.  The
> > answer, as Jefferson said in another context, is to "inform their
> > ignorance."
> >
> > That isn't to say that there isn't a case for an in core scheduler, but this
> > at least isn't a good reason for it.
> 
> As I remember it, this is exactly the same discussion we had about
> replication three years ago.
> 
> First strategy: we don't need it in core.
> Next: we were last with replication.

We resisted putting replication into the core until we needed some
facilities that were only available from the core.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com
  PG East:  http://www.enterprisedb.com/community/nav-pg-east-2010.do
  + If your life is a hard drive, Christ can be your backup. +
 


Re: scheduler in core

From
Lucas
Date:
2010/2/20 Andrew Dunstan <andrew@dunslane.net>
>
> We're not going to change that because some companies have
> insane corporate policies.

I agree, Andrew...
This is an outside benefit...
not a reason or justification...

I believe that a general purpose scheduler is similar to autovacuum... it is not really needed, we can always
configure an external scheduler. But I liked it a LOT...

For me it is not a question of "must be in core"; it is a question of cost/benefit. I do not see much cost, but a lot of
benefits:

Like Joshua said "abstract away from external solutions and operating system dependencies".
Like Dimitri said "Main advantage over cron or another scheduler being that it'd be part of my transactional backups".
To me it is the reliability of having partition creation/removal be part of the database, and being able to do
consolidations, cleanups and periodic consistency checks and diagnostics without external dependencies.

I wonder: if the scheduler had already existed before autovacuum was implemented, would autovacuum not have been
a function executed by the in-core scheduler?

--
Lucas


Re: scheduler in core

From
Ron Mayer
Date:
Lucas wrote:
>     I believe that "in core" may be "installed by default" in case of

Those seem like totally orthogonal concepts to me.

A feature may be "in core" but not "installed by default" (like many PLs).
A feature might not be "in core" but "installed" by many installers (say postgis).

It seems like half the people here are arguing for the former concept.
It seems the other half are arguing against the latter concept.


Is the real need here for a convenient way to enable and/or
recommend packagers to install some non-core modules by default?


Re: scheduler in core

From
Tom Lane
Date:
Ron Mayer <rm_pg@cheapcomplexdevices.com> writes:
> Is the real need here for a convenient way to enable and/or
> recommend packagers to install some non-core modules by default?

It would certainly help us resist assorted requests to put everything
including the kitchen sink into core.
        regards, tom lane


Re: scheduler in core

From
Robert Haas
Date:
On Sun, Feb 21, 2010 at 12:33 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Ron Mayer <rm_pg@cheapcomplexdevices.com> writes:
>> Is the real need here for a convenient way to enable and/or
>> recommend packagers to install some non-core modules by default?
>
> It would certainly help us resist assorted requests to put everything
> including the kitchen sink into core.

If you don't want people to keep requesting more features in core, you
should stop doing such a good job making the functionality that gets
put into core awesome.

That's partly tongue-in-cheek, but there's some real truth to it.
Stuff doesn't go into core unless it just works.  And having things in
core is appealing because it means they're available everywhere, they
work the same way everywhere, and they can be fully managed within the
database without a lot of futzing around.  Having an extensible system
is a good thing and I'm glad we do, but having a rich feature set
available in core is also a very good thing for a lot of reasons, at
least IMHO.

...Robert


Re: scheduler in core

From
Robert Haas
Date:
On Sun, Feb 21, 2010 at 10:17 AM, Lucas <lucas75@gmail.com> wrote:
> I wonder if the scheduler already existed before the
>  implementation of the autovacuum, its implementation would
>  not be a function executed by the in-core scheduler?

The real genius of autovacuum is that it works out when there has been
enough activity in particular tables that they need to be vacuumed.
We might be able to use an in-core scheduler to wake it up every
minute to look at the stats, or whatever it is that we do, but that's
not all that exciting.
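
A minimal sketch of the "works out when there has been enough activity" part,
assuming the rule given in the PostgreSQL documentation (a table is vacuumed
once its dead tuples exceed autovacuum_vacuum_threshold plus
autovacuum_vacuum_scale_factor times the table's tuple count). The function
name and its arguments are hypothetical; only the two settings and their
documented defaults (50 and 0.2) come from PostgreSQL itself.

```python
# Sketch of the autovacuum trigger rule described in the PostgreSQL docs:
#   threshold = autovacuum_vacuum_threshold
#             + autovacuum_vacuum_scale_factor * reltuples
# needs_vacuum() is a hypothetical helper, not a PostgreSQL function.

def needs_vacuum(dead_tuples, reltuples, base_threshold=50, scale_factor=0.2):
    """Return True when the dead tuples exceed the vacuum threshold."""
    threshold = base_threshold + scale_factor * reltuples
    return dead_tuples > threshold
```

The decision depends on per-table activity rather than on wall-clock time,
which is why a plain timer-based scheduler would only replace the wake-up
loop and not the decision itself.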

...Robert


Re: scheduler in core

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> On Sun, Feb 21, 2010 at 10:17 AM, Lucas <lucas75@gmail.com> wrote:
>> I wonder if the scheduler already existed before the
>> implementation of the autovacuum, its implementation would
>> not be a function executed by the in-core scheduler?

> The real genius of autovacuum is that it works out when there has been
> enough activity in particular tables that they need to be vacuumed.
> We might be able to use an in-core scheduler to wake it up every
> minute to look at the stats, or whatever it is that we do, but that's
> not all that exciting.

The wake-up-every-N-seconds part of it is actually the weakest part
(search the archives for questions about autovacuum_naptime).  To my
mind, the killer reason why autovac needed to be integrated is so that
the system itself could trigger autovac runs in response to threatened
XID wraparound conditions.  A facility for scheduling user jobs, almost
by definition, won't have any system-internal trigger conditions.
        regards, tom lane


Re: scheduler in core

From
Simon Riggs
Date:
On Sat, 2010-02-20 at 18:19 -0500, Tom Lane wrote:
> Dimitri Fontaine <dfontaine@hi-media.com> writes:
> > Dave Page <dpage@pgadmin.org> writes:
> >> Why not just use pgAgent? It's far more flexible than the design
> >> you've suggested, and already exists.
> 
> > What would it take to have it included in core,
> 
> I don't think this really makes sense.  There's basically no argument
> for having it in core other than "I'm too lazy to install a separate
> package".  Unlike the case for autovacuum, there isn't anything an
> in-core implementation could do that an external one doesn't do as well
> or better.  So I'm not eager to take on additional maintenance burden
> for such a thing.

There is currently no way to run a separate daemon process that runs
user code as part of Postgres, so that the startup code gets run
immediately we startup, re-run if we crash and shut down cleanly when
the server does. If there were some way to run arbitrary code in a
daemon using an extensibility API then we wouldn't ever get any requests
for the scheduler, cos you could write it yourself without troubling
anybody here.

-- Simon Riggs           www.2ndQuadrant.com



Re: scheduler in core

From
Robert Haas
Date:
On Sun, Feb 21, 2010 at 1:05 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Sun, Feb 21, 2010 at 10:17 AM, Lucas <lucas75@gmail.com> wrote:
>>> I wonder if the scheduler already existed before the
>>>  implementation of the autovacuum, its implementation would
>>>  not be a function executed by the in-core scheduler?
>
>> The real genius of autovacuum is that it works out when there has been
>> enough activity in particular tables that they need to be vacuumed.
>> We might be able to use an in-core scheduler to wake it up every
>> minute to look at the stats, or whatever it is that we do, but that's
>> not all that exciting.
>
> The wake-up-every-N-seconds part of it is actually the weakest part
> (search the archives for questions about autovacuum_naptime).  To my
> mind, the killer reason why autovac needed to be integrated is so that
> the system itself could trigger autovac runs in response to threatened
> XID wraparound conditions.  A facility for scheduling user jobs, almost
> by definition, won't have any system-internal trigger conditions.

Right.  Without prejudice to my earlier statements that I think this
might possibly be a good thing to do anyway, the case for it would be
a lot stronger if it provided some genuine additional functionality.

...Robert


Re: scheduler in core

From
Robert Haas
Date:
On Sun, Feb 21, 2010 at 1:11 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
> On Sat, 2010-02-20 at 18:19 -0500, Tom Lane wrote:
>> Dimitri Fontaine <dfontaine@hi-media.com> writes:
>> > Dave Page <dpage@pgadmin.org> writes:
>> >> Why not just use pgAgent? It's far more flexible than the design
>> >> you've suggested, and already exists.
>>
>> > What would it take to have it included in core,
>>
>> I don't think this really makes sense.  There's basically no argument
>> for having it in core other than "I'm too lazy to install a separate
>> package".  Unlike the case for autovacuum, there isn't anything an
>> in-core implementation could do that an external one doesn't do as well
>> or better.  So I'm not eager to take on additional maintenance burden
>> for such a thing.
>
> There is currently no way to run a separate daemon process that runs
> user code as part of Postgres, so that the startup code gets run
> immediately we startup, re-run if we crash and shut down cleanly when
> the server does.

Good point.

> If there were some way to run arbitrary code in a
> daemon using an extensibility API then we wouldn't ever get any requests
> for the scheduler, cos you could write it yourself without troubling
> anybody here.

That might be a little overly optimistic, but I get the point.

...Robert


Re: scheduler in core

From
Dimitri Fontaine
Date:
Simon Riggs <simon@2ndQuadrant.com> writes:
> There is currently no way to run a separate daemon process that runs
> user code as part of Postgres, so that the startup code gets run
> immediately we startup, re-run if we crash and shut down cleanly when
> the server does. If there were some way to run arbitrary code in a
> daemon using an extensibility API then we wouldn't ever get any requests
> for the scheduler, cos you could write it yourself without troubling
> anybody here.

Please do include the Skytools / PGQ ticker as one use case in the
design discussion, and pgbouncer too. Having user daemons as part as the
PostgreSQL extensibility would be awesome indeed!

Bonus point if you build them with PGXS and install them from SQL, so
that the current extension packaging design applies.

I guess we can say that the archive and restore command are precursors
of managed user "daemons", or say, integrated processes. So adding them
to the use cases to cover would make sense.

Regards,
-- 
dim


Re: scheduler in core

From
Simon Riggs
Date:
On Sun, 2010-02-21 at 20:46 +0100, Dimitri Fontaine wrote:
> Simon Riggs <simon@2ndQuadrant.com> writes:
> > There is currently no way to run a separate daemon process that runs
> > user code as part of Postgres, so that the startup code gets run
> > immediately we startup, re-run if we crash and shut down cleanly when
> > the server does. If there were some way to run arbitrary code in a
> > daemon using an extensibility API then we wouldn't ever get any requests
> > for the scheduler, cos you could write it yourself without troubling
> > anybody here.
> 
> Please do include the Skytools / PGQ ticker as one use case in the
> design discussion, and pgbouncer too. Having user daemons as part as the
> PostgreSQL extensibility would be awesome indeed!
> 
> Bonus point if you build them with PGXS and install them from SQL, so
> that the current extension packaging design applies.
> 
> I guess we can say that the archive and restore command are precursors
> of managed user "daemons", or say, integrated processes. So adding them
> to the use cases to cover would make sense.

Yes, I think so. Rough design...

integrated_user_processes = 'x, y, z'

would run x(), y() and z() in their own processes. These would execute
after startup, or at a consistent point in recovery. The code for these
would come from preload_libraries etc.

They would not block smart shutdown, though their shutdown sequence might
delay it. User code would be executed last at startup and first thing at
shutdown.

API would be user_process_startup(), user_process_shutdown().
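
A toy model of the ordering described above, as a sketch only. Nothing here
is an existing PostgreSQL API: the setting value and the two hook names are
taken from the rough design, while the Supervisor class, its registry, and
the choice to stop processes in reverse start order are assumptions made for
illustration.

```python
# Toy model of the proposed lifecycle; not a PostgreSQL API.
# Hypothetical: Supervisor, registry, reverse-order shutdown.

class Supervisor:
    def __init__(self, setting, registry):
        # Parse a value such as 'x, y, z' into an ordered list of names.
        self.names = [n.strip() for n in setting.split(",") if n.strip()]
        self.registry = registry  # name -> (startup, shutdown) callables
        self.running = []
        self.log = []

    def user_process_startup(self, name):
        startup, _ = self.registry[name]
        startup()
        self.running.append(name)
        self.log.append(f"started {name}")

    def user_process_shutdown(self, name):
        _, shutdown = self.registry[name]
        shutdown()
        self.running.remove(name)
        self.log.append(f"stopped {name}")

    def start_server(self):
        self.log.append("server startup complete")
        # User code is executed last at startup.
        for name in self.names:
            self.user_process_startup(name)

    def stop_server(self):
        # User code is shut down first (reverse order is an assumption).
        for name in reversed(list(self.running)):
            self.user_process_shutdown(name)
        self.log.append("server shutdown complete")


def noop():
    pass

sup = Supervisor("x, y, z", {n: (noop, noop) for n in ("x", "y", "z")})
sup.start_server()
sup.stop_server()
```

After the two calls, sup.log records the server startup first, then the
three user processes starting, then the same processes stopping, and the
server shutdown last.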

-- Simon Riggs           www.2ndQuadrant.com



Re: scheduler in core

From
Jaime Casanova
Date:
On Sun, Feb 21, 2010 at 1:11 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
[...]
>> Dimitri Fontaine <dfontaine@hi-media.com> writes:
>> > Dave Page <dpage@pgadmin.org> writes:
>> >> Why not just use pgAgent? It's far more flexible than the design
>> >> you've suggested, and already exists.
>>
>> > What would it take to have it included in core,
>>
[...]
>
> There is currently no way to run a separate daemon process that runs
> user code as part of Postgres, so that the startup code gets run
> immediately we startup, re-run if we crash and shut down cleanly when
> the server does. If there were some way to run arbitrary code in a
> daemon using an extensibility API then we wouldn't ever get any requests
> for the scheduler, cos you could write it yourself without troubling
> anybody here.
>

ah! that could get rid of one of my complaints, and then i could just
work the rest in pgAgent...
so, is this idea (having some user processes be "tied" to postmaster
start/stop) going to somewhere?

it also could help if you have processes LISTENing for NOTIFYs

--
Atentamente,
Jaime Casanova
Soporte y capacitación de PostgreSQL
Asesoría y desarrollo de sistemas
Guayaquil - Ecuador
Cel. +59387171157


Re: scheduler in core

From
Heikki Linnakangas
Date:
Jaime Casanova wrote:
> On Sun, Feb 21, 2010 at 1:11 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> There is currently no way to run a separate daemon process that runs
>> user code as part of Postgres, so that the startup code gets run
>> immediately we startup, re-run if we crash and shut down cleanly when
>> the server does. If there were some way to run arbitrary code in a
>> daemon using an extensibility API then we wouldn't ever get any requests
>> for the scheduler, cos you could write it yourself without troubling
>> anybody here.
> 
> ah! that could get rid of one of my complaints, and then i could just
> work the rest in pgAgent...

Yeah, seems like a good idea. Slon daemon and similar daemons could also
use it.

> so, is this idea (having some user processes be "tied" to postmaster
> start/stop) going to somewhere?

I've added this to the TODO list. Now we just need someone to write it.

--  Heikki Linnakangas EnterpriseDB   http://www.enterprisedb.com


Re: scheduler in core

From
Pavel Stehule
Date:
2010/2/22 Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>:
> Jaime Casanova wrote:
>> On Sun, Feb 21, 2010 at 1:11 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>>> There is currently no way to run a separate daemon process that runs
>>> user code as part of Postgres, so that the startup code gets run
>>> immediately we startup, re-run if we crash and shut down cleanly when
>>> the server does. If there were some way to run arbitrary code in a
>>> daemon using an extensibility API then we wouldn't ever get any requests
>>> for the scheduler, cos you could write it yourself without troubling
>>> anybody here.
>>
>> ah! that could get rid of one of my complaints, and then i could just
>> work the rest in pgAgent...
>
> Yeah, seems like a good idea. Slon daemon and similar daemons could also
> use it.
>

I like it. I thought about some workflow system integrated with scheduler.

Regards
Pavel


>> so, is this idea (having some user processes be "tied" to postmaster
>> start/stop) going to somewhere?
>
> I've added this to the TODO list. Now we just need someone to write it.
>
> --
>  Heikki Linnakangas
>  EnterpriseDB   http://www.enterprisedb.com
>


Re: scheduler in core

From
Merlin Moncure
Date:
On Sat, Feb 20, 2010 at 8:06 PM, Joshua D. Drake <jd@commandprompt.com> wrote:
>
> That doesn't mean it isn't a really good idea. It would be nice to have
> a comprehensive job scheduling solution that allows me to continue
> abstract away from external solutions and operating system dependencies.

+1!

A scheduler is an extremely common thing to have to integrate with
the database.  All of our commercial competitors have them, and they
are heavily used.

Like I noted above, what people want to schedule is going to be stored
procedures.  Having both would virtually eliminate the need for
scripting outside the database, which is a pretty big deal since
external scripts are a real pain to keep cross platform.  Since
there's probably a lot of overlapping problems in those two features,
why not tackle both at once?

merlin


Re: scheduler in core

From
Alvaro Herrera
Date:
Merlin Moncure escribió:

> Like I noted above, what people want to schedule is going to be stored
> procedures.  Having both would virtually eliminate the need for
> scripting outside the database, which is a pretty big deal since
> external scripts are a real pain to keep cross platform.  Since
> there's probably a lot of overlapping problems in those two features,
> why not tackle both at once?

Divide and conquer?

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.


Re: scheduler in core

From
Merlin Moncure
Date:
On Mon, Feb 22, 2010 at 2:29 PM, Alvaro Herrera
<alvherre@commandprompt.com> wrote:
> Merlin Moncure escribió:
>
>> Like I noted above, what people want to schedule is going to be stored
>> procedures.  Having both would virtually eliminate the need for
>> scripting outside the database, which is a pretty big deal since
>> external scripts are a real pain to keep cross platform.  Since
>> there's probably a lot of overlapping problems in those two features,
>> why not tackle both at once?
>
> Divide and conquer?

When I said 'tackle', I meant more of a 'come to an understanding'
thing.  Normally I would agree with you anyway, but I think what most
people would want to schedule would be stored procedures (sorry to
continually repeat myself here, but I really think this should be
critical to any scheduling proposal), not functions or ad hoc scripts.

merlin


Re: scheduler in core

From
Robert Haas
Date:
On Sat, Feb 20, 2010 at 4:41 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
> IMNSHO, an 'in core' scheduler would be useful. however, I think
> before you tackle a scheduler, we need proper stored procedures.  Our
> existing functions don't cut it because you can manage the transaction
> state yourself.

Did you mean that you "can't" manage the transaction state yourself?

Has anyone given any thought to what would be required to relax this
restriction?  Is this totally impossible given our architecture, or
just a lack of round tuits?

See also: http://www.postgresql.org/docs/current/static/plpgsql-porting.html#PLPGSQL-PORTING-EXCEPTIONS

...Robert


Re: scheduler in core

From
Pavel Stehule
Date:
2010/3/1 Robert Haas <robertmhaas@gmail.com>:
> On Sat, Feb 20, 2010 at 4:41 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
>> IMNSHO, an 'in core' scheduler would be useful. however, I think
>> before you tackle a scheduler, we need proper stored procedures.  Our
>> existing functions don't cut it because you can manage the transaction
>> state yourself.
>
> Did you mean that you "can't" manage the transaction state yourself?
>
> Has anyone given any thought to what would be required to relax this
> restriction?  Is this totally impossible given our architecture, or
> just a lack of round tuits?

I think it is a very hard restriction that comes from the use and
architecture of our SPI interface. Our stored procedures are executed
inside a single SELECT statement, and that is the reason for the limit:
there cannot be two or more outer transactions. Other implementations
run procedures at a different place, nearer to the top of the pipeline.

Pavel

>
> See also: http://www.postgresql.org/docs/current/static/plpgsql-porting.html#PLPGSQL-PORTING-EXCEPTIONS
>
> ...Robert
>


Re: scheduler in core

From
Merlin Moncure
Date:
On Mon, Mar 1, 2010 at 4:43 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Sat, Feb 20, 2010 at 4:41 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
>> IMNSHO, an 'in core' scheduler would be useful. however, I think
>> before you tackle a scheduler, we need proper stored procedures.  Our
>> existing functions don't cut it because you can manage the transaction
>> state yourself.
>
> Did you mean that you "can't" manage the transaction state yourself?
>
> Has anyone given any thought to what would be required to relax this
> restriction?  Is this totally impossible given our architecture, or
> just a lack of round tuits?

yeah...that's what I meant.  plpgsql exceptions are no help because
there are many cases where you simply don't want the whole sequence of
operations to run in a single transaction.  loading lots of data to
many tables is one.  any operation that depends on transaction commit
to do something (like notifications) and then hook on the results is
another. you always have the heavy hitting administrative functions
like vacuum, etc.   another case is if you want a procedure to simply
run forever...trivially done in a procedure, impossible in a function.
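
A minimal sketch of the "loading lots of data" case as it has to be done
from outside the database today: a client-side script that commits after
every batch instead of wrapping the whole load in one transaction. It uses
Python's sqlite3 module with an in-memory database purely as a runnable
stand-in for a PostgreSQL connection; the table, the column names, and
load_in_batches() are hypothetical.

```python
import sqlite3

def load_in_batches(conn, rows, batch_size):
    """Insert rows, committing after each batch rather than once at the end."""
    commits = 0
    for start in range(0, len(rows), batch_size):
        batch = rows[start:start + batch_size]
        conn.executemany("INSERT INTO items (id, label) VALUES (?, ?)", batch)
        conn.commit()  # each batch is committed on its own
        commits += 1
    return commits

# In-memory SQLite database standing in for the real server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, label TEXT)")
rows = [(i, f"item-{i}") for i in range(10)]
commits = load_in_batches(conn, rows, batch_size=4)
count = conn.execute("SELECT count(*) FROM items").fetchone()[0]
```

The commit inside the loop is exactly the step that a function running
inside a single transaction cannot perform, which is why this logic ends up
in an external script.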

The way people do this stuff now is to involve 1) an external
scheduler such as cron and 2) .sql scripts for relatively simple
things and/or an external scripting language like bash/perl.

The external scheduler has a couple of annoying issues: it is completely
non-portable to code against, and scheduling with sub-minute accuracy is a
big headache.  Also, adjusting the scheduling based on database events
is, while not impossible, more difficult than it should be.  External
.sql scripts are portable but extremely limited.  Involving something
like perl just so I can jump outside the database to do manual
transaction management is fine, but ISTM these types of things are much
better done inside the database.

Another factor here is that a sizable percentage of our user base is
bargain hunters coming in from other systems like oracle and ms sql,
and having to rely on the o/s scheduler is very distasteful to them.  It's
a hole, one of the last remaining IMO, in postgres being able to
provide a complete server side development environment without having
to deal with the o/s at all.

I stand by my statements earlier.  Any moderate level and up
complexity database has all kinds of scheduling and scripting going on
supporting it. These things really should be part of the database,
dump with it, and run in a regular way regardless of platform and
server environment etc.  With that, 90% of the code I have to write
outside of the database goes away.

merlin


Re: scheduler in core

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> On Sat, Feb 20, 2010 at 4:41 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
>> IMNSHO, an 'in core' scheduler would be useful. however, I think
>> before you tackle a scheduler, we need proper stored procedures.  Our
>> existing functions don't cut it because you can manage the transaction
>> state yourself.

> Did you mean that you "can't" manage the transaction state yourself?

> Has anyone given any thought to what would be required to relax this
> restriction?  Is this totally impossible given our architecture, or
> just a lack of round tuits?

There is lots and lots of discussion of that in the archives.  It's
fundamentally impossible for PL functions done in the current style to
start or commit transactions, unless you resort to dblink-style kluges.
What's been discussed is some sort of structure that would allow a chunk
of PL code to execute "outside" a transaction and thus issue its own
begin and commit commands.  This idea is what Merlin is calling a stored
procedure, though personally I dislike that terminology.  Anyway,
nothing's got past the arm-waving stage as yet.
        regards, tom lane