Thread: Is there such a thing as a 'background database job'?

Is there such a thing as a 'background database job'?

From
Mike Nolan
Date:
In a recent discussion with an academician friend of mine regarding how
to improve performance on a system, he came up with the idea of taking what
is now a monthly purge/cleanup job that takes about 24 hours (and growing)
and splitting it up into a series of smaller tasks.

That part's fairly easy, but his next idea was to run those tasks more or
less continuously in the background.

As I understand most database back-ends, including Postgresql, there
really isn't a way to run queries for an application in the background
(ie, at a lower priority), especially for an application that does updates.

I suppose I could 'nice' the program itself, but would that have any
positive impact?
--
Mike Nolan

Re: Is there such a thing as a 'background database job'?

From
Vivek Khera
Date:
On Aug 22, 2005, at 10:53 PM, Mike Nolan wrote:

> In a recent discussion with an academician friend of mine regarding
> how
> to improve performance on a system, he came up with the idea of
> taking what
> is now a monthly purge/cleanup job that takes about 24 hours (and
> growing)
> and splitting it up into a series of smaller tasks.
>

Well, the purge/cleanup requires a finite amount of work X.  Either
you make that a more efficient process requiring Y < X amount of
time, or you split up the process so that X/N time is used for each
of the N runs of the cleanup, or you do both...

Just splitting the job  doesn't mean it will take less time overall,
since it still must do the same total amount of work.

I'm guessing that when you're cleanup is running, it impacts the
performance of the rest of the system, and that is the "faster" you
want to achieve.

My recommendation of what you want to do (and this is what I do) for
your cleanup process is to throttle it and let it run over many days
but pause between smaller tasks, or change your procedure to let you
run your cleanup once per week or per day incrementally.

In my case, I need to purge about 6000 rows from one table, but the
cascading deletes end up removing anywhere from 500 to 200k rows in
other tables per row deleted in the main table.  Since I know how
many referenced rows will be removed, I keep a tally and when the
total reaches > 150k rows, I pause for a while.  If an individual
rows results in > 10k related rows being deleted, I pause for a
smaller amount of time.

This keeps everything moving along, and *nobody* notices.  So what if
it takes 3 days to finish...


Vivek Khera, Ph.D.
+1-301-869-4449 x806