Thread: Rollover of tables?

Rollover of tables?

From
Steve Wampler
Date:
I've got a (very simple) postgresql database that records logging
information from a group of distributed applications.
It's already beginning to get somewhat large (and the projected
lifetime of the application set is 25 years...).  Some
sort of rollover is going to be needed archive old log information.

I can see two easy approaches (feel free to suggest better ones!):

(a) rename the table as an 'archive' log table and then recreate the
    'active' log table.

(b) extract the old log information into an archive table,
    removing it from the original

In both cases, the archive table will later be unloaded (first
compressed and then to tape) to conserve disk space.  This will be
done infrequently.

Any feelings on which is a better way to go?  (a) should be
nice and fast (right?), but (b) has the advantage of allowing partial
extractions - so only log information over a week or month old
would be archived each time, plus there should be no problem with
insertions happening during the rollover process (right?).

I like (b), personally, but would like to know if anyone
sees any "gotcha's", especially w.r.t. postgresql as the
database system.

Thanks!

--
Steve Wampler-  SOLIS Project, National Solar Observatory
swampler@noao.edu

Re: Rollover of tables?

From
Jeff Hoffmann
Date:
Steve Wampler wrote:

> I can see two easy approaches (feel free to suggest better ones!):
>
> (a) rename the table as an 'archive' log table and then recreate the
>     'active' log table.
>
> (b) extract the old log information into an archive table,
>     removing it from the original
>

> I like (b), personally, but would like to know if anyone
> sees any "gotcha's", especially w.r.t. postgresql as the
> database system.
>

if you vacuum on a regular basis, (b) shouldn't be a problem.  right
now, postgres won't reuse the deleted pages unless you vacuum it.  your
table will balloon until you do a vacuum.  ideally you should be able to
do (a) in a transaction, but i'd have some concern if table
renaming/creation is in fact isolated in the transaction.

if you're worried about having the table unavailable, though (even for a
split second), i don't know if either one would work well for you if
you're needing to update the table during a vacuum.  you may want to
approach it in the application where you have the application log to
tables named for the date or week # (e.g., log_20000526).  you can then
play with the log files at your leisure.

jeff