Re: Performance query about large tables, lots of concurrent access - Mailing list pgsql-performance

From Karl Wright
Subject Re: Performance query about large tables, lots of concurrent access
Date
Msg-id 4677E1E4.903@metacarta.com
Whole thread Raw
In response to Re: Performance query about large tables, lots of concurrent access  (Gregory Stark <stark@enterprisedb.com>)
Responses Re: Performance query about large tables, lots of concurrent access
List pgsql-performance
Gregory Stark wrote:
> "Karl Wright" <kwright@metacarta.com> writes:
>
>> This particular run lasted four days before a VACUUM became essential. The
>> symptom that indicates that VACUUM is needed seems to be that the CPU usage of
>> any given postgresql query skyrockets.  Is this essentially correct?
>
> Postgres is designed on the assumption that VACUUM is run regularly. By
> "regularly" we're talking of an interval usually on the order of hours, or
> even less. On some workloads some tables need to be vacuumed every 5 minutes,
> for example.

Fine - but what if the previous vacuum is still in progress, and does
not finish in 5 minutes?

>
> VACUUM doesn't require shutting down the system, it doesn't lock any tables or
> otherwise prevent other jobs from making progress. It does add extra i/o but
> there are knobs to throttle its i/o needs. The intention is that VACUUM run in
> the background more or less continually using spare i/o bandwidth.
>

This spare bandwidth is apparently hard to come by in my particular
application.  That's the only way I can reconcile your information with
it taking 4 days to complete.

> The symptom of not having run vacuum regularly is that tables and indexes
> bloat to larger sizes than necessary. If you run "VACUUM VERBOSE" it'll tell
> you how much bloat your tables and indexes are suffering from (though the
> output is a bit hard to interpret).
>
> Table and index bloat slow things down but not generally by increasing cpu
> usage. Usually they slow things down by causing queries to require more i/o.
>

Yes, that's what I understood, which is why I was puzzled by the effects
I was seeing.

> It's only UPDATES and DELETES that create garbage tuples that need to be
> vacuumed though. If some of your tables are mostly insert-only they might need
> to be vacuumed as frequently or at all.
>

Well, the smaller tables don't change much, but the bigger tables have a
  lively mix of inserts and updates, so I would expect these would need
vacuuming often.

I'll post again when I can find a vacuum schedule that seems to work.

Karl

pgsql-performance by date:

Previous
From: Kurt Overberg
Date:
Subject: Maintenance question / DB size anomaly...
Next
From: Karl Wright
Date:
Subject: Re: Performance query about large tables, lots of concurrent access