Hello.
I've got a database with a very large table (currently holding 23.5
billion rows, the output of various data loggers over the course of my
PhD so far). The table itself has a trivial structure (see below), is
partitioned by sample date, and has quite acceptable INSERT/SELECT
performance.
CREATE TABLE rawdata (
    value REAL NOT NULL,
    sample_time TIMESTAMP WITH TIME ZONE NOT NULL,
    -- Dataset reference.
    dataset INTEGER NOT NULL
);
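
The partitioning is the usual inheritance scheme (this is 8.4), with a
child table per time range; a partition looks roughly like this (the
name and the monthly range here are only illustrative):

CREATE TABLE rawdata_2010_01 (
    -- Example range; the real partitions follow the same pattern.
    CHECK (sample_time >= '2010-01-01'::timestamptz
       AND sample_time <  '2010-02-01'::timestamptz)
) INHERITS (rawdata);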
The data loggers are collected every month or so and their output is
uploaded into the database, adding another 1-2 billion rows to the
table each time. Data is never deleted, so outside of these bulk loads
the table is essentially read-only.
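
The uploads themselves are just bulk loads into the relevant partition,
something along these lines (the partition name and file path are only
illustrative):

COPY rawdata_2010_01 (value, sample_time, dataset)
    FROM '/path/to/logger_dump.csv' WITH CSV;  -- illustrative path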
My problem is that the autovacuum system isn't keeping up with the
INSERTs and I keep running out of transaction IDs. SELECT performance
also drops off the more I insert; judging from the output of iostat,
that seems to be because the autovacuum tasks are taking up a lot of
the disk bandwidth - the disks are commodity items in a software RAID
and not terribly fast.
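
For what it's worth, the wraparound pressure shows up in the usual
catalog query (assuming I'm reading it correctly):

-- How close each database is to transaction ID wraparound.
SELECT datname, age(datfrozenxid) AS xid_age
FROM pg_database
ORDER BY xid_age DESC;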
My current workaround is to stop the server after a batch of inserts and
then manually VACUUM FULL all the databases, letting this run over a
weekend.
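
Concretely, that amounts to connecting to each database in turn and
running the following, which is what eats the weekend:

-- Compacts every table in the current database; very slow on this hardware.
VACUUM FULL;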
I'm a complete newbie when it comes to PostgreSQL system settings, and
it isn't obvious to me what I'd need to change to improve the
autovacuum behaviour.
Is there any way to manually freeze the rows of the table after they've
been inserted so that the autovacuum doesn't need to touch the table?
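
What I have in mind is something along these lines after each load, if
such a thing exists and actually does what I hope (the partition name
is again illustrative):

-- Aggressively freeze the freshly loaded partition so autovacuum
-- doesn't need to revisit it for wraparound purposes.
VACUUM FREEZE rawdata_2010_01;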
The rest of the database contains metadata about the main data, about
250 million rows at the moment, which I guess is a small enough amount
that I can let PostgreSQL handle it automatically.
The server is currently running PostgreSQL 8.4.2 (under FreeBSD), but
it wouldn't be a problem to upgrade to 9.1 if that would help.
Asher.