Re: Is there such a thing as a 'background database job'? - Mailing list pgsql-general

From Vivek Khera
Subject Re: Is there such a thing as a 'background database job'?
Date
Msg-id 70666D1F-AB2E-49CB-B61B-815B8EB01B7C@khera.org
Whole thread Raw
In response to Is there such a thing as a 'background database job'?  (Mike Nolan <nolan@gw.tssi.com>)
List pgsql-general
On Aug 22, 2005, at 10:53 PM, Mike Nolan wrote:

> In a recent discussion with an academician friend of mine regarding
> how
> to improve performance on a system, he came up with the idea of
> taking what
> is now a monthly purge/cleanup job that takes about 24 hours (and
> growing)
> and splitting it up into a series of smaller tasks.
>

Well, the purge/cleanup requires a finite amount of work X.  Either
you make that a more efficient process requiring Y < X amount of
time, or you split up the process so that X/N time is used for each
of the N runs of the cleanup, or you do both...

Just splitting the job  doesn't mean it will take less time overall,
since it still must do the same total amount of work.

I'm guessing that when you're cleanup is running, it impacts the
performance of the rest of the system, and that is the "faster" you
want to achieve.

My recommendation of what you want to do (and this is what I do) for
your cleanup process is to throttle it and let it run over many days
but pause between smaller tasks, or change your procedure to let you
run your cleanup once per week or per day incrementally.

In my case, I need to purge about 6000 rows from one table, but the
cascading deletes end up removing anywhere from 500 to 200k rows in
other tables per row deleted in the main table.  Since I know how
many referenced rows will be removed, I keep a tally and when the
total reaches > 150k rows, I pause for a while.  If an individual
rows results in > 10k related rows being deleted, I pause for a
smaller amount of time.

This keeps everything moving along, and *nobody* notices.  So what if
it takes 3 days to finish...


Vivek Khera, Ph.D.
+1-301-869-4449 x806



pgsql-general by date:

Previous
From: Tom Lane
Date:
Subject: Re: Help with a subselect inside a view
Next
From: Vivek Khera
Date:
Subject: Re: ctid access is slow