Thread: updating database

From: Charles Day
We have a knowledge base that needs to be updated every night from a text
file.  Currently a script deletes everything from a table and adds the new
data.  This procedure seems to cause the database to slowly lose
performance each day.  It also leaves our database unusable for several
hours starting at midnight.  I thought about using two databases, updating
one while using the other, then swapping them every day, so that one
database is out of service while the other is in service.

Is there a better way to do this?

Thanks in advance.

Charles Day

Re: updating database

From: Tom Lane
Charles Day <chaday@symix.com> writes:
> We have a knowledge base that needs to be updated every night from a
> text file.  Currently a script deletes everything from a table and adds
> the new data.  This procedure seems to cause the database to slowly lose
> performance each day.

It would, if you neglect to vacuum the table afterward --- vacuum is
needed to reclaim discarded space inside the table.
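As a sketch, assuming a hypothetical table named "kb", the nightly script
would end with something like:

```sql
-- Run after the delete-and-reload step; reclaims dead tuples and
-- refreshes planner statistics.  "kb" is a placeholder table name.
VACUUM ANALYZE kb;
```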

The fastest way to bulk-load a table in PostgreSQL is:
    DROP table (yes, really)
    CREATE table again
    COPY table from stdin
    CREATE any indexes defined on table
    VACUUM ANALYZE table
Note that the indexes are best built after the loading step,
so that they are computed "en masse" instead of piecemeal as
each tuple is loaded.  You can see examples of this procedure
in the reload scripts generated by pg_dump.
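Put together, the sequence might look like the following sketch, where the
table name "kb", its single text column, the index, and the file path are
all placeholders for your actual schema:

```sql
-- Bulk-load sketch; adjust names, columns, and path to your setup.
DROP TABLE kb;
CREATE TABLE kb (entry text);

-- Load the nightly text file (path is hypothetical; the file must be
-- readable by the backend).
COPY kb FROM '/path/to/kb.txt';

-- Build indexes after loading, so they are computed en masse.
CREATE INDEX kb_entry_idx ON kb (entry);

-- Reclaim space and update planner statistics.
VACUUM ANALYZE kb;
```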

I do, however, wonder whether bulk-loading is the right approach
for you at all.  Surely most of the table rows don't change from
one day to the next?  If you can teach your script to just delete
and add the tuples that actually changed, you'd probably find that
things are faster --- and you wouldn't have the problem of the
database being unusable during the update.
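One way to sketch that incremental approach, again with hypothetical
names, is to load the new file into a staging table and apply only the
differences:

```sql
-- Staging table holds tonight's file; "kb" and "kb_new" are placeholders.
CREATE TEMP TABLE kb_new (entry text);
COPY kb_new FROM '/path/to/kb.txt';

-- Remove rows that disappeared from the file...
DELETE FROM kb
 WHERE entry NOT IN (SELECT entry FROM kb_new);

-- ...and add rows that are new in the file.
INSERT INTO kb
SELECT entry FROM kb_new
 WHERE entry NOT IN (SELECT entry FROM kb);

VACUUM ANALYZE kb;
```

Only the changed rows are touched, so the table stays usable throughout
and there is far less dead space for VACUUM to clean up.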

            regards, tom lane