Re: count(*) slow on large tables - Mailing list pgsql-performance

From: Sean Chittenden
Subject: Re: count(*) slow on large tables
Date:
Msg-id: 20031006170136.GB94718@perrin.nxad.com
In response to: Re: count(*) slow on large tables (Shridhar Daithankar <shridhar_daithankar@persistent.co.in>)
List: pgsql-performance
> How will it help? This is in addition to the trigger proposal that
> came up earlier. With triggers it's not possible to make values
> visible across backends unless the trigger updates a table, which
> eventually leads to the vacuum/dead-tuples problem.
>
> 1. User creates a trigger to check updates/inserts for certain conditions.
> 2. It updates the count as and when required.
> 3. If the trigger detects the count is not initialized, it issues the
> same query the first time. There is no avoiding this issue.
>
> Besides, a facility for resident variables could be used
> imaginatively as well.
>
> Does this make sense? IMO this is a more generalised approach overall.

I do this _VERY_ frequently in my databases, only I have my stored
procs do the aggregate in a predefined MVCC table that's always there.
Here's a denormalized version for public consumption/thought:

CREATE TABLE global.dba_aggregate_cache (
  dbl TEXT NOT NULL,        -- The database location, doesn't need to be
                            -- qualified (ex: schema.table.col)
  op TEXT NOT NULL,         -- The operation, SUM, COUNT, etc.
  qual TEXT,                -- Any kind of conditional, such as a where clause
  val_int INT,              -- Whatever the value is, of type INT
  val_bigint BIGINT,        -- Whatever the value is, of type BIGINT
  val_text TEXT,            -- Whatever the value is, of type TEXT
  val_bytea BYTEA           -- Whatever the value is, of type BYTEA
);
CREATE UNIQUE INDEX dba_aggregate_cache_dbl_op_udx ON global.dba_aggregate_cache(dbl,op);

Then, I use a function to retrieve this value instead of a SELECT
COUNT(*).

SELECT public.cache_count('dbl','qual');  -- In this case, the op is COUNT
SELECT public.cache_count('dbl');         -- Returns the COUNT for the table listed in the dbl
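
The body of such a reader might look something like this.  This is a
minimal sketch in modern dollar-quoted PL/pgSQL, not my original code,
and populate_aggregate_cache is a hypothetical name for function #1
described in the list below:

CREATE FUNCTION public.cache_count(p_dbl TEXT, p_qual TEXT DEFAULT NULL)
RETURNS BIGINT AS $$
DECLARE
    v BIGINT;
BEGIN
    SELECT val_bigint INTO v
      FROM global.dba_aggregate_cache
     WHERE dbl = p_dbl
       AND op = 'COUNT'
       AND qual IS NOT DISTINCT FROM p_qual;
    IF NOT FOUND THEN
        -- Cache miss: function #1 runs the real COUNT(*) once and
        -- seeds the cache row.  Because of that write path, this
        -- sketch is left VOLATILE (the default) rather than STABLE.
        v := global.populate_aggregate_cache(p_dbl, 'COUNT', p_qual);
    END IF;
    RETURN v;
END;
$$ LANGUAGE plpgsql;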

Then, I create 4 or 5 functions (depending on the op I'm performing);
hypothetical sketches of #1 and #3 follow the list:

1) A private function that _doesn't_ run as SECURITY DEFINER, and
   that populates the global.dba_aggregate_cache row if it's empty.
2) A STABLE function for SELECTs; if the row doesn't exist, it
   calls function #1 to populate it.
3) A VOLATILE function for INSERTs; if the row doesn't exist, it
   calls function #1 to populate it, then adds the necessary bits
   to keep it accurate.  (It has to be VOLATILE, since it writes to
   the cache table.)
4) A VOLATILE function for DELETEs; if the row doesn't exist, it
   calls function #1 to populate it, then removes the necessary
   bits to keep it accurate.
5) A VOLATILE function for UPDATEs; if the row doesn't exist, it
   calls function #1 to populate it, then updates the necessary
   bits to keep it accurate.  It's not uncommon for me to have no
   UPDATE function/trigger at all.
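
To make the division of labor concrete, here is a hypothetical sketch
of function #1 and the INSERT-side function #3.  All names are mine,
the syntax is modern dollar-quoted PL/pgSQL, and the sketch assumes
dbl holds a table name the trigger can match against TG_TABLE_NAME:

-- #1: private populator; runs the real aggregate once and seeds the
-- cache row.  p_dbl/p_qual are interpolated into SQL, so they are
-- assumed to be trusted, DBA-supplied strings (quote_ident and
-- quote_literal would be prudent in real use).
CREATE FUNCTION global.populate_aggregate_cache(p_dbl TEXT, p_op TEXT, p_qual TEXT)
RETURNS BIGINT AS $$
DECLARE
    v BIGINT;
BEGIN
    -- This sketch handles only op = 'COUNT'; other ops would pull the
    -- column out of p_dbl (ex: schema.table.col) and build
    -- 'SELECT SUM(col) FROM schema.table' the same way.
    EXECUTE 'SELECT count(*) FROM ' || p_dbl
            || COALESCE(' WHERE ' || p_qual, '')
       INTO v;
    INSERT INTO global.dba_aggregate_cache (dbl, op, qual, val_bigint)
    VALUES (p_dbl, p_op, p_qual, v);
    RETURN v;
END;
$$ LANGUAGE plpgsql;

-- #3: INSERT-side maintenance, fired AFTER INSERT FOR EACH ROW.
CREATE FUNCTION global.cache_count_ins() RETURNS TRIGGER AS $$
BEGIN
    UPDATE global.dba_aggregate_cache
       SET val_bigint = val_bigint + 1
     WHERE dbl = TG_TABLE_NAME AND op = 'COUNT' AND qual IS NULL;
    IF NOT FOUND THEN
        -- No cached row yet: seed it.  The trigger fires after the
        -- insert, so the fresh COUNT(*) already includes NEW; don't
        -- increment on top of it.
        PERFORM global.populate_aggregate_cache(TG_TABLE_NAME, 'COUNT', NULL);
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;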

Create triggers for functions 3-5 (function #2 is what clients call
directly), and test away.  It's MVCC, and searching an indexed table
for a single row is obviously vastly faster than a seqscan plus
aggregate.  If I need any kind of aggregate to be fast, I use this
system with a derivation of the above table.  The one problem is that
I have to retrain others to use cache_count(), or some other function,
instead of COUNT(*).
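
For a hypothetical table public.foo, the wiring is plain trigger DDL
(cache_count_ins is the sketch above; cache_count_del would be its
decrementing twin):

CREATE TRIGGER foo_cache_count_ins AFTER INSERT ON public.foo
    FOR EACH ROW EXECUTE PROCEDURE global.cache_count_ins();
CREATE TRIGGER foo_cache_count_del AFTER DELETE ON public.foo
    FOR EACH ROW EXECUTE PROCEDURE global.cache_count_del();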

That said, it'd be nice if there were a way to tell PostgreSQL to do
the above for you and teach COUNT(*), SUM(col), or other aggregates to
use an MVCC-backed cache similar to the above.  If people want their
COUNTs to be fast, then they have to live with the INSERT, UPDATE, and
DELETE cost.  The above doesn't work with anything complex such as
joins, but it's certainly a start, and I think it satisfies everyone's
gripes other than the tuple churn that _does_ happen (*nudge nudge*,
pg_autovacuum could be integrated into the backend to handle this).
For those worried about performance: the pages that are constantly
being recycled would likely stay in disk cache (PG's or the OS's).
There's still some commit overhead, but still... no need to
over-optimize by requiring the table to be stored in the outdated,
slow, and overused shm (also, *nudge nudge*).

Anyway, let me throw that out there as a solution that I use, and it
works quite well.  I didn't explain the use of the qual column above,
but I think those who grasp this way of handling things probably grok
how to use it in a dynamically executed query; a brief example follows.
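
For what it's worth, with the sketches above a qualified count just
passes the WHERE-clause text through (the table and qual here are made
up; note that the unique index above is on (dbl, op), so as written
only one qual per table/op can be cached):

SELECT public.cache_count('public.foo', 'active = true');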

CREATE AGGREGATE CACHE anyone?

-sc

--
Sean Chittenden
