Re: updating database - Mailing list pgsql-sql

From Tom Lane
Subject Re: updating database
Date
Msg-id 12362.910279487@sss.pgh.pa.us
Whole thread Raw
In response to updating database  (Charles Day <chaday@symix.com>)
List pgsql-sql
Charles Day <chaday@symix.com> writes:
> We have a knowledge base that needs updated every night from a text file.
> Currently a script deletes everything from a table and adds the new data.
> This procedure seems to slowly cause the database to loose performance
> daily.

It would, if you neglect to vacuum the table afterward --- vacuum is
needed to reclaim discarded space inside the table.

The fastest way to bulk-load a table in PostgreSQL is:
    DROP table (yes, really)
    CREATE table again
    COPY table from stdin
    CREATE any indexes defined on table
    VACUUM ANALYZE table
Note that the indexes are best built after the loading step,
so that they are computed "en masse" instead of piecemeal as
each tuple is loaded.  You can see examples of this procedure
in the reload scripts generated by pg_dump.

I do, however, wonder whether bulk-loading is the right approach
for you at all.  Surely most of the table rows don't change from
one day to the next?  If you can teach your script to just delete
and add the tuples that actually changed, you'd probably find that
things are faster --- and you wouldn't have the problem of the
database being unusable during the update.

            regards, tom lane

pgsql-sql by date:

Previous
From: Barracuda
Date:
Subject: Updating database
Next
From: "Jackson, DeJuan"
Date:
Subject: RE: [SQL] Updating database