Speeding up aggregates - Mailing list pgsql-performance

From Josh Berkus
Subject Speeding up aggregates
Msg-id web-2024223@davinci.ethosmedia.com
Responses Re: Speeding up aggregates  (Joe Conway <mail@joeconway.com>)
Re: Speeding up aggregates  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-performance
Folks,

One of Postgres' poorest-performing areas is aggregates. This is the
unfortunate side effect of our fully extensible aggregate and type
system. However, I thought that the folks on this list might have a
few tips on making aggregates perform faster.

Here's mine:  Aggregate Caching Table

This is a brute-force approach.   However, if you have a table with a
million records for which users *frequently* ask for grand totals or
counts, it can work fine.

A simple example:

CREATE TABLE client_case_counts (
    client_id INT NOT NULL REFERENCES clients(client_id) ON DELETE
CASCADE,
    no_cases INT NOT NULL DEFAULT 0
);

Then create triggers:

CREATE FUNCTION tf_maintain_client_counts ()
RETURNS trigger AS '
BEGIN
UPDATE client_case_counts SET no_cases = no_cases + 1
WHERE client_id = NEW.client_id;
INSERT INTO client_case_counts ( client_id, no_cases )
SELECT NEW.client_id, 1
WHERE NOT EXISTS (SELECT client_id FROM client_case_counts ccc2
    WHERE ccc2.client_id = NEW.client_id);
RETURN NEW;
END;' LANGUAGE 'plpgsql';

CREATE TRIGGER tg_maintain_client_counts AFTER INSERT ON cases
FOR EACH ROW EXECUTE PROCEDURE tf_maintain_client_counts();
etc.
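For completeness, the "etc." above would include a companion trigger to
decrement the count when a case is deleted. A sketch, assuming the same
table and column names as the example (untested):

```sql
-- Hypothetical companion function: decrement the cached count on DELETE
CREATE FUNCTION tf_decrement_client_counts ()
RETURNS trigger AS '
BEGIN
UPDATE client_case_counts SET no_cases = no_cases - 1
WHERE client_id = OLD.client_id;
RETURN OLD;
END;' LANGUAGE 'plpgsql';

CREATE TRIGGER tg_decrement_client_counts AFTER DELETE ON cases
FOR EACH ROW EXECUTE PROCEDURE tf_decrement_client_counts();
```

An UPDATE trigger on cases.client_id would need similar handling if cases
can move between clients.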

While effective, this approach is costly in terms of update/insert
processing.  It is also limited to whatever aggregate requests you have
anticipated ... it does no good for aggregates over a user-defined
range.
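To illustrate the trade-off: the anticipated queries become cheap lookups
against the small caching table, while anything outside them still scans
the big table. A sketch using the names above (the client_id value is
made up for illustration):

```sql
-- Cheap: read precomputed counts from the small caching table
SELECT SUM(no_cases) AS total_cases FROM client_case_counts;
SELECT no_cases FROM client_case_counts WHERE client_id = 42;

-- Still expensive: an unanticipated range aggregate must scan cases
-- SELECT COUNT(*) FROM cases WHERE filed_date > ''2002-01-01'';
```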

What have other Postgres users done to speed up aggregates on large
tables?

-Josh Berkus





