Re: Table growing faster than autovacuum can vacuum - Mailing list pgsql-general

From Marti Raudsepp
Subject Re: Table growing faster than autovacuum can vacuum
Date
Msg-id CABRT9RB59ehYuOVYyZmkvHrdO6+-gLJ9dJ7pUfCdSnt2Wa=99g@mail.gmail.com
Whole thread Raw
In response to Table growing faster than autovacuum can vacuum  (Asher Hoskins <asher@piceur.com>)
Responses Re: Table growing faster than autovacuum can vacuum  (Marti Raudsepp <marti@juffo.org>)
List pgsql-general
On Wed, Feb 15, 2012 at 18:46, Asher Hoskins <asher@piceur.com> wrote:
> My problem is that the autovacuum system isn't keeping up with INSERTs and I
> keep running out of transaction IDs.

This is usually not a problem with vacuum, but a problem with
consuming too many transaction IDs. I suspect you're loading that data
with individual INSERT statements with no explicit transaction
management -- which means every data-modifying query gets its own
transaction ID.

In most applications, the simplest solution is batching up lots of
INSERTs (say, 10k) and run them all in a single transaction between
BEGIN and COMMIT -- thus consuming just 1 transaction ID for 10k rows.
You could also look at multi-row inserts or the COPY command to get
better performance. Here's an overview of different data loading
approaches: http://www.depesz.com/2007/07/05/how-to-insert-data-to-database-as-fast-as-possible/

But it's also possible that transaction IDs are consumed by other
queries. Any PL/pgSQL procedure that makes use of exception handling
consumes a transaction ID. So does every SAVEPOINT command.

> I'm a complete newby when it comes to PostgreSQL system settings and it
> isn't obvious to me what I'd need to change to improve the autovacuum. Is
> there any way to manually freeze the rows of the table after they've been
> inserted so that the autovacuum doesn't need to touch the table?

VACUUM FREEZE is the command. :)

You may need to tune the vacuum settings to be more aggressive if you
want it to run faster (reducing cost_delay, increasing cost_limit).
But I don't have much experience with these myself.
http://www.postgresql.org/docs/8.4/static/runtime-config-resource.html#RUNTIME-CONFIG-RESOURCE-VACUUM-COST

Note that vacuum freeze is only necessary when you have a shortage of
transaction IDs. Normally Postgres doesn't vacuum insert-only tables
at all (it just analyzes them).

VACUUM FULL is extremely inefficient in PostgreSQL 8.4 and older.

> The server is running PostgreSQL 8.4.2 (under FreeBSD) at the moment but it
> wouldn't be a problem to upgrade to 9.1 if that was helpful.

Upgrading to 9.1 certainly lets you take advantage of several
performance improvements, but I don't know how useful those are in
your workload.

----

Additionally, it's often a good idea to partition large tables into
smaller partitions (e.g. separate partition for each day/week worth of
data). This way maintenance commands (VACUUM, ANALYZE) don't have to
scan the whole huge table, but work with smaller individual tables,
and you can drop chunks using a simple DROP TABLE.

BUT it also has its problems: some types of query optimizations are
impossible with partitioned tables (version 9.1 relaxed those
limitations somewhat). So if you find your current performance to be
sufficient, then this might not be worth doing.

Regards,
Marti

pgsql-general by date:

Previous
From: Yangyang
Date:
Subject: How to make DBT-3 workload work?
Next
From: Marti Raudsepp
Date:
Subject: Re: Table growing faster than autovacuum can vacuum