Thread: Job control in sql
Hi. <br /><br /> I am building a system, where we have jobs that run at different times (and takes widely different lengthsof time). <br /><br /> Basically I have a jobs table: <br /><br /> create table jobs(<br /> id serial, <br /> ready boolean, <br /> job_begun timestamptz, <br /> job_done timestamptz, <br /> primary key (id)<br />);<br /><br /> This should run by cron, at it is my intention that the cronjob (basically) consists of<br /><i><br /> psql-c "select run_jobs()"</i><br /><br /> My problem is, that the job should ensure that it is not running already, whichwould be to set job_begun when the job starts". That can easily happen as jobs should be started every 15 minutes (tolower latency from ready to done) but some jobs can run for hours.. <br /><br /> The problem is that a later run of run_jobs()will not see the job_begun has been set by a prior run (that is unfinished - as all queries from the plpgsql-functionruns in a single, huge transaction). <br /><br /> My intitial idea was to set the isolation level to "readuncommitted" while doing the is-somebody-else-running-lookup, but I cannot change that in the plpgsql function (it complainsthat the session has to be empty - even when I have run nothing before it). <br /><br /> Any ideas on how to solvethe issue?<br /><br /> I run it on Pgsql 9.1.<br /><br /> Svenne<br />
On Fri, 25 May 2012 10:28:03 +0200, Svenne Krap wrote: [...] > The problem is that a later run of run_jobs() will not see the > job_begun has been set by a prior run (that is unfinished - as all > queries from the plpgsql-function runs in a single, huge > transaction). > > > My intitial idea was to set the isolation level to "read > uncommitted" > while doing the is-somebody-else-running-lookup, but I cannot change > that in the plpgsql function (it complains that the session has to be > empty - even when I have run nothing before it). > > Any ideas on how to solve the issue? Add a sort of status table where you insert your unique job identifer at the start of the function and remove it in the end? As seperate transactions of course. Jan -- professional: http://www.oscar-consult.de private: http://neslonek.homeunix.org/drupal/
W dniu 2012-05-25 14:52, Jan Lentfer pisze: > Add a sort of status table where you insert your unique job identifer at the start of the function > and remove it in the end? As seperate transactions of course. That might leave status set on forever in a case when a job crashes and does not reach the point where it removes the identifier.
W dniu 2012-05-25 10:28, Svenne Krap pisze: > Hi. > > I am building a system, where we have jobs that run at different times (and takes widely different > lengths of time). > > Basically I have a jobs table: > > create table jobs( > id serial, > ready boolean, > job_begun timestamptz, > job_done timestamptz, > primary key (id) > ); > > This should run by cron, at it is my intention that the cronjob (basically) consists of > / > psql -c "select run_jobs()"/ > > My problem is, that the job should ensure that it is not running already, which would be to set > job_begun when the job starts". That can easily happen as jobs should be started every 15 minutes > (to lower latency from ready to done) but some jobs can run for hours.. > > The problem is that a later run of run_jobs() will not see the job_begun has been set by a prior > run (that is unfinished - as all queries from the plpgsql-function runs in a single, huge > transaction). > > My intitial idea was to set the isolation level to "read uncommitted" while doing the > is-somebody-else-running-lookup, but I cannot change that in the plpgsql function (it complains > that the session has to be empty - even when I have run nothing before it). > > Any ideas on how to solve the issue? > > I run it on Pgsql 9.1. > > Svenne I think you might try in your run_jobs() SELECT job_begun FROM jobs WHERE id = job_to_run FOR UPDATE NOWAIT; This in case of conflict would throw the exception: 55P03 could not obtain lock on row in relation "jobs" and you handle it (or not, which might be OK too) in EXCEPTION block.
On 29-05-2012 12:32, Ireneusz Pluta wrote: > W dniu 2012-05-25 10:28, Svenne Krap pisze: >> Hi. >> >> I am building a system, where we have jobs that run at different >> times (and takes widely different lengths of time). >> >> Basically I have a jobs table: >> >> create table jobs( >> id serial, >> ready boolean, >> job_begun timestamptz, >> job_done timestamptz, >> primary key (id) >> ); >> >> This should run by cron, at it is my intention that the cronjob >> (basically) consists of >> / >> psql -c "select run_jobs()"/ >> >> My problem is, that the job should ensure that it is not running >> already, which would be to set job_begun when the job starts". That >> can easily happen as jobs should be started every 15 minutes (to >> lower latency from ready to done) but some jobs can run for hours.. >> >> The problem is that a later run of run_jobs() will not see the >> job_begun has been set by a prior run (that is unfinished - as all >> queries from the plpgsql-function runs in a single, huge transaction). >> >> My intitial idea was to set the isolation level to "read uncommitted" >> while doing the is-somebody-else-running-lookup, but I cannot change >> that in the plpgsql function (it complains that the session has to be >> empty - even when I have run nothing before it). >> >> Any ideas on how to solve the issue? >> >> I run it on Pgsql 9.1. >> >> Svenne > > I think you might try in your run_jobs() > SELECT job_begun FROM jobs WHERE id = job_to_run FOR UPDATE NOWAIT; > This in case of conflict would throw the exception: > 55P03 could not obtain lock on row in relation "jobs" > and you handle it (or not, which might be OK too) in EXCEPTION block. > Hehe.. good idea... In the mean time I had thought about using advisory locks for the same thing, but the old-fashioned locks work fine too. Svenne
Svenne Krap wrote: > On 29-05-2012 12:32, Ireneusz Pluta wrote: > > W dniu 2012-05-25 10:28, Svenne Krap pisze: > >> Hi. > >> > >> I am building a system, where we have jobs that run at different > >> times (and takes widely different lengths of time). > >> > >> Basically I have a jobs table: > >> > >> create table jobs( > >> id serial, > >> ready boolean, > >> job_begun timestamptz, > >> job_done timestamptz, > >> primary key (id) > >> ); > >> > >> This should run by cron, at it is my intention that the cronjob > >> (basically) consists of > >> / > >> psql -c "select run_jobs()"/ > >> > >> My problem is, that the job should ensure that it is not running > >> already, which would be to set job_begun when the job starts". That > >> can easily happen as jobs should be started every 15 minutes (to > >> lower latency from ready to done) but some jobs can run for hours.. > >> > >> The problem is that a later run of run_jobs() will not see the > >> job_begun has been set by a prior run (that is unfinished - as all > >> queries from the plpgsql-function runs in a single, huge transaction). > >> > >> My intitial idea was to set the isolation level to "read uncommitted" > >> while doing the is-somebody-else-running-lookup, but I cannot change > >> that in the plpgsql function (it complains that the session has to be > >> empty - even when I have run nothing before it). > >> > >> Any ideas on how to solve the issue? > >> > >> I run it on Pgsql 9.1. > >> > >> Svenne > > > > I think you might try in your run_jobs() > > SELECT job_begun FROM jobs WHERE id = job_to_run FOR UPDATE NOWAIT; > > This in case of conflict would throw the exception: > > 55P03 could not obtain lock on row in relation "jobs" > > and you handle it (or not, which might be OK too) in EXCEPTION block. > > > Hehe.. good idea... > > In the mean time I had thought about using advisory locks for the same > thing, but the old-fashioned locks work fine too.
Sorry about the earlier unfinished post - premature click. Svenne Krap wrote: > Ireneusz Pluta wrote: >> Svenne Krap pisze: > >> I am building a system, where we have jobs that run at different > >> times (and takes widely different lengths of time). > >> > >> Basically I have a jobs table: > >> > >> create table jobs( > >> id serial, > >> ready boolean, > >> job_begun timestamptz, > >> job_done timestamptz, > >> primary key (id) > >> ); > >> > >> This should run by cron, at it is my intention that the cronjob > >> (basically) consists of > >> / > >> psql -c "select run_jobs()"/ > >> > >> My problem is, that the job should ensure that it is not running > >> already, which would be to set job_begun when the job starts". That > >> can easily happen as jobs should be started every 15 minutes (to > >> lower latency from ready to done) but some jobs can run for hours.. > >> > >> The problem is that a later run of run_jobs() will not see the > >> job_begun has been set by a prior run (that is unfinished - as all > >> queries from the plpgsql-function runs in a single, huge transaction). > >> > >> My intitial idea was to set the isolation level to "read uncommitted" > >> while doing the is-somebody-else-running-lookup, but I cannot change > >> that in the plpgsql function (it complains that the session has to be > >> empty - even when I have run nothing before it). > >> > >> Any ideas on how to solve the issue? Use a database to hold data. Use run-time constructs in the program or script to handle run-time considerations. How about using a shell script that uses "ps" to determine if a job is already running, or using a lock file in the file system known to the control script? > >> I run it on Pgsql 9.1. >> >> I think you might try in your run_jobs() >> SELECT job_begun FROM jobs WHERE id = job_to_run FOR UPDATE NOWAIT; >> This in case of conflict would throw the exception: >> 55P03 could not obtain lock on row in relation "jobs" >> and you handle it (or not, which might be OK too) in EXCEPTION block. >> > Hehe.. good idea... > > In the mean time I had thought about using advisory locks for the same > thing, but the old-fashioned locks work fine too. Or don't use the DBMS that way at all. -- Lew