David Mitchell <david.mitchell@telogis.com> writes:
> However, occasionally we need to import data, and this involves
> inserting several million rows into a table, but this just *cripples*
> postgres. After the import has been running for a while, simple selects
> take a long time, and strangely, the query planner starts using a
> sequential scan on a table with millions of items in it. Surely inserts
> shouldn't be doing this? Should I be running a vacuum during the import?
PG 8.0 and up should handle this a bit more nicely. Before 8.0 the
planner looked only at the table size recorded in pg_catalog (the
relpages and reltuples fields) and so of course it wouldn't react to a
still-in-progress mass insert. Now it looks at the current actual
physical table size, and so it should be willing to switch to indexscan
if needed before the import completes.
If you *are* using 8.0 then we need to look closer.
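As a quick sanity check, you can compare what the planner has recorded in
the catalog against the plan it actually picks (table and column names
below are just placeholders, substitute your own):

    -- statistics as recorded in the catalog
    SELECT relname, relpages, reltuples
    FROM pg_class
    WHERE relname = 'big_table';

    -- the plan chosen for a typical lookup
    EXPLAIN SELECT * FROM big_table WHERE id = 42;

    -- refresh the recorded statistics mid-import if the plan looks wrong
    ANALYZE big_table;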
> After how many operations should I be calling vacuum? We tried using
> pg_autovacuum but it didn't seem to do a very good job since we would
> often find the query planner getting it wrong, and a vacuum analyze
> would fix it.
autovacuum has a problem with responding to still-in-progress
transactions, too :-(
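A common workaround (just a sketch, tune the interval to your data) is to
have the import script itself issue an ANALYZE on the target table every
few hundred thousand rows, rather than waiting for autovacuum:

    -- cheap: refreshes planner statistics only
    ANALYZE big_table;

    -- heavier: also reclaims space from dead tuples
    VACUUM ANALYZE big_table;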
> Part of the import is to update items in a table to keep track of where
> the import is up to. The update is just setting an integer column, so
> does this create a dead tuple or will postgres alter the column in-place?
Postgres doesn't *ever* update anything in-place.
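Every UPDATE writes a complete new row version and leaves the old one
behind as a dead tuple until vacuum reclaims it, even if only one integer
column changes. So, to sketch what your progress-tracking update implies
(names made up here):

    -- each of these leaves a dead tuple behind
    UPDATE import_progress SET rows_done = 500000 WHERE job_id = 1;

    -- so the (small) tracking table needs vacuuming as well
    VACUUM import_progress;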
regards, tom lane