Thread: using SQL for multi-machine job management?

using SQL for multi-machine job management?

From: jebjeb
I'm considering using PostgreSQL as part of the implementation of a
multi-machine job management system.  Here is an overview of the system:

- Jobs are submitted through an API and stored in a SQL database.  Each job
  contains a list of source filenames and a description of the operations to
  perform on the files (compress a file, add it to an archive, encrypt it,
  compare it with another file, etc.).

- Multiple machines (up to 50?) look at the database, and each grabs a job.
  A machine updates the database to indicate that it is the one running the
  job.  It also updates the database with the job's current completion
  progress (%).

1) Is this something that is often done using SQL databases?
2) The jobs will be quite CPU-intensive: will I run into trouble if the
database is located on one of the machines that will be executing the jobs?
3) I would like to have a backup ready to take over if the machine with the
database fails.  Only some of the info I store (data about a job, the machine
that is executing the job) is important to back up.  Things like job
progress don't have to be backed up.  Any tips on how I should set up the
database to accomplish this?

Thanks!
--
View this message in context:
http://www.nabble.com/using-SQL-for-multi-machine-job-management--tp25418254p25418254.html
Sent from the PostgreSQL - novice mailing list archive at Nabble.com.


Re: using SQL for multi-machine job management?

From: Sean Davis

On Sat, Sep 12, 2009 at 5:05 PM, jebjeb <martin.belleau@yahoo.com> wrote:

> I'm considering using PostgreSQL as part of the implementation of a
> multi-machine job management system.  Here is an overview of the system:
>
> - Jobs are submitted through an API and stored in a SQL database.  Each job
>   contains a list of source filenames and a description of the operations to
>   perform on the files (compress a file, add it to an archive, encrypt it,
>   compare it with another file, etc.).
>
> - Multiple machines (up to 50?) look at the database, and each grabs a job.
>   A machine updates the database to indicate that it is the one running the
>   job.  It also updates the database with the job's current completion
>   progress (%).
>
> 1) Is this something that is often done using SQL databases?
> 2) The jobs will be quite CPU-intensive: will I run into trouble if the
> database is located on one of the machines that will be executing the jobs?
> 3) I would like to have a backup ready to take over if the machine with the
> database fails.  Only some of the info I store (data about a job, the machine
> that is executing the job) is important to back up.  Things like job
> progress don't have to be backed up.  Any tips on how I should set up the
> database to accomplish this?


You might want to look into a dedicated job scheduler such as SLURM, SGE, or Condor for this type of thing.

Sean