Re: Performance query about large tables, lots of concurrent access - Mailing list pgsql-performance

From Gregory Stark
Subject Re: Performance query about large tables, lots of concurrent access
Date
Msg-id 87bqfc83bp.fsf@oxford.xeocode.com
In response to Re: Performance query about large tables, lots of concurrent access  (Karl Wright <kwright@metacarta.com>)
Responses Re: Performance query about large tables, lots of concurrent access  (Karl Wright <kwright@metacarta.com>)
List pgsql-performance
"Karl Wright" <kwright@metacarta.com> writes:

> This particular run lasted four days before a VACUUM became essential. The
> symptom that indicates that VACUUM is needed seems to be that the CPU usage of
> any given postgresql query skyrockets.  Is this essentially correct?

Postgres is designed on the assumption that VACUUM is run regularly. By
"regularly" we're talking of an interval usually on the order of hours, or
even less. On some workloads some tables need to be vacuumed every 5 minutes,
for example.

VACUUM doesn't require shutting down the system, and it doesn't take any locks
that block ordinary reads or writes or otherwise prevent other jobs from making
progress. It does add extra i/o but
there are knobs to throttle its i/o needs. The intention is that VACUUM run in
the background more or less continually using spare i/o bandwidth.
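
As a rough illustration of how the throttling knobs interact, here is a small
Python sketch of the arithmetic behind cost-based vacuum delay: VACUUM adds up
a cost for each page it touches and sleeps for vacuum_cost_delay milliseconds
whenever the total reaches vacuum_cost_limit. The function names and the
numbers in the usage lines are invented for the example, and the result is
only an upper bound because it ignores the time VACUUM spends doing real work.

```python
def max_pages_per_second(cost_limit, cost_delay_ms, cost_per_page):
    """Upper bound on pages VACUUM can process per second when throttled.

    cost_limit    -- plays the role of vacuum_cost_limit
    cost_delay_ms -- plays the role of vacuum_cost_delay (milliseconds)
    cost_per_page -- cost charged per page, e.g. vacuum_cost_page_miss
    """
    if cost_delay_ms <= 0:
        # A delay of 0 turns cost-based throttling off entirely.
        return float("inf")
    pages_per_round = cost_limit / cost_per_page
    rounds_per_second = 1000.0 / cost_delay_ms
    return pages_per_round * rounds_per_second


def max_megabytes_per_second(pages_per_second, block_size=8192):
    """Convert a page rate to MB/s, assuming 8 kB pages unless told otherwise."""
    return pages_per_second * block_size / 1e6


if __name__ == "__main__":
    # Illustrative values only, not a tuning recommendation.
    rate = max_pages_per_second(cost_limit=200, cost_delay_ms=20, cost_per_page=10)
    print("pages/s upper bound:", rate)
    print("MB/s upper bound:", max_megabytes_per_second(rate))
```

Raising the delay or lowering the limit shrinks the bound, which is how you
trade vacuum speed for less interference with foreground queries.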

The symptom of not having run vacuum regularly is that tables and indexes
bloat to larger sizes than necessary. If you run "VACUUM VERBOSE" it'll tell
you how much bloat your tables and indexes are suffering from (though the
output is a bit hard to interpret).
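
If the VACUUM VERBOSE output is too hard to read, one crude alternative is to
watch how densely packed a table is over time. The Python sketch below is only
an assumption about how you might do that, not what VACUUM VERBOSE itself
reports: it takes relpages and reltuples from pg_class (both are estimates,
refreshed by VACUUM and ANALYZE) and compares the current tuples-per-page
figure against a baseline you recorded when the table was known to be compact.
The helper names are hypothetical.

```python
# Catalog query to run through whatever client you use; relkind 'r' is an
# ordinary table and 'i' is an index.
SIZE_QUERY = """
SELECT relname, relpages, reltuples
FROM pg_class
WHERE relkind IN ('r', 'i')
"""


def tuples_per_page(relpages, reltuples):
    """Average number of tuples per page, or None for an empty relation."""
    if relpages <= 0:
        return None
    return reltuples / relpages


def density_drop(baseline, current):
    """Fraction by which tuple density has fallen since the baseline.

    baseline, current -- (relpages, reltuples) pairs for the same relation.
    Returns None if either density cannot be computed.
    """
    before = tuples_per_page(*baseline)
    after = tuples_per_page(*current)
    if before is None or after is None or before == 0:
        return None
    return 1.0 - after / before


if __name__ == "__main__":
    # Made-up numbers: same row count, four times as many pages.
    print(density_drop(baseline=(100, 5000), current=(400, 5000)))
```

A large drop for a table whose row count hasn't changed much suggests the
extra pages are mostly dead space, which is the bloat described above.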

Table and index bloat slow things down but not generally by increasing cpu
usage. Usually they slow things down by causing queries to require more i/o.

It's only UPDATES and DELETES that create garbage tuples that need to be
vacuumed though. If some of your tables are mostly insert-only they might not
need to be vacuumed as frequently or at all.
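
To get a feel for which tables fall into which group, something like the
following Python sketch could help. It assumes row-level statistics are being
collected and that you fetch the rows from pg_stat_user_tables yourself; the
function name and the sample table names are hypothetical. The counters are
cumulative since the statistics were last reset, so treat the ordering as a
hint rather than a schedule.

```python
# Per-table counters of inserted, updated and deleted rows.
STATS_QUERY = """
SELECT relname, n_tup_ins, n_tup_upd, n_tup_del
FROM pg_stat_user_tables
"""


def split_by_garbage(rows):
    """Separate tables that generate dead tuples from insert-only ones.

    rows -- iterable of (relname, n_tup_ins, n_tup_upd, n_tup_del)
    Returns (churning, insert_only): churning is a list of
    (relname, updates_plus_deletes) sorted with the busiest table first,
    insert_only is a list of table names with no updates or deletes.
    """
    churning = []
    insert_only = []
    for relname, _inserted, updated, deleted in rows:
        garbage = (updated or 0) + (deleted or 0)
        if garbage > 0:
            churning.append((relname, garbage))
        else:
            insert_only.append(relname)
    churning.sort(key=lambda item: item[1], reverse=True)
    return churning, insert_only


if __name__ == "__main__":
    # Sample rows with invented table names and counts.
    sample = [
        ("jobqueue", 1000, 50000, 900),
        ("eventlog", 200000, 0, 0),
        ("documents", 5000, 1200, 0),
    ]
    churning, insert_only = split_by_garbage(sample)
    print("vacuum often:", churning)
    print("vacuum rarely:", insert_only)
```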

--
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com

