Re: count(*) slow on large tables - Mailing list pgsql-performance

From Shridhar Daithankar
Subject Re: count(*) slow on large tables
Date
Msg-id 3F81066C.90402@persistent.co.in
Whole thread Raw
In response to Re: count(*) slow on large tables  (Bruce Momjian <pgman@candle.pha.pa.us>)
Responses Re: count(*) slow on large tables  (Sean Chittenden <sean@chittenden.org>)
List pgsql-performance
Bruce Momjian wrote:
> OK, I beefed up the TODO:
>
>     * Use a fixed row count and a +/- count with MVCC visibility rules
>       to allow fast COUNT(*) queries with no WHERE clause(?)
>
> I can always give the details if someone asks.  It doesn't seem complex
> enough for a separate TODO.detail item.

May I propose alternate approach for this optimisation?

- Postgresql allows to maintain user defined variables in shared memory.
- These variables obey transactions but do not get written to disk at all.
- There should be a facility to detect whether such a variable is initialized or
not.

How it will help? This is in addition to trigger proposal that came up earlier.
With  triggers it's not possible to make values visible across backends unless
trigger updates a table, which eventually leads to vacuum/dead tuples problem.

1. User creates a trigger to check updates/inserts for certain conditions.
2. It updates the count as and when required.
3. If the trigger detects the count is not initialized, it would issue the same
query first time. There is no avoiding this issue.

Besides providing facility of resident variables could be used imaginatively as
well.

Does this make sense? IMO this is more generalised approach over all.

Just a thought.

  Shridhar




pgsql-performance by date:

Previous
From: Andrew Sullivan
Date:
Subject: Re: reindex/vacuum locking/performance?
Next
From: Shridhar Daithankar
Date:
Subject: Re: Postgres low end processing.