Re: count(*) performance improvement ideas - Mailing list pgsql-hackers
From: Pavan Deolasee
Subject: Re: count(*) performance improvement ideas
Msg-id: 2e78013d0803200000l34e306a3j515447017b58a58d@mail.gmail.com
In response to: Re: count(*) performance improvement ideas ("Stephen Denne" <Stephen.Denne@datamail.co.nz>)
Responses: Re: count(*) performance improvement ideas
List: pgsql-hackers
On Thu, Mar 20, 2008 at 3:24 AM, Stephen Denne <Stephen.Denne@datamail.co.nz> wrote:
>
> Pavan also refers to deferred triggers, which has got me thinking about another possible solution:
>
> Instead of inserting a delta row that will be updated a lot of times, create an on commit drop temp table named after the txid and the grouping value (which is an integer in my case; perhaps hash it if you're grouping by something that doesn't easily convert to part of a table name),
> create an after insert initially deferred constraint to call a function which will move the 'at commit' values of the rows in the temp table to the summary table.
> The temp table will only have one row inserted, updated many times, then on commit the trigger is fired once, and the temp table is dropped.
>
> Does anyone think this will or won't work for some reason?

I think this should work, although you may need to take some extra steps to manage the summary table. Also, I think a single temp table per transaction should suffice. The temp table would have one row per "group by" or "where" condition on which you want to track the count. The corresponding row would be updated as and when the corresponding count changes; you would need INSERT/DELETE/UPDATE triggers to do that. If there are any subtransaction aborts, that will be taken care of by MVCC.

As you said, a single deferred trigger would then merge the temp table with the summary table. Here we need to be extra careful, because serializable transactions may fail when they try to update the same row in the summary table. One solution I can think of is (and I haven't checked the archives, so somebody might have already suggested this before):

Each summary table will have one summary row per "group by" or "where" condition (same as the temp table). In addition to that, it can have zero or more temporary rows for the same conditions.

- Select the summary rows from the summary table FOR UPDATE NOWAIT.
- If the lock is not available, insert our deltas into the summary table as a new row. These rows will be deleted as soon as some other transaction gets the lock on the summary rows and merges our deltas with them.
- If the lock is available:
  - merge our deltas with the summary rows
  - check for other temporary deltas, merge them with the summary rows, and delete those temporary rows

I guess we can write the generic triggers as a contrib module. What needs to be done is to let the user specify the tables and the conditions on which they want to track count(*), and then apply those conditions in the generic triggers.

Thanks,
Pavan

--
Pavan Deolasee
EnterpriseDB
http://www.enterprisedb.com
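The scheme discussed in this thread could be sketched roughly as below. This is a minimal illustration, not anything from the original post: all table, trigger, and function names (`tracked`, `count_summary`, `txn_deltas`, `track_count`, `merge_count_deltas`) are invented here, and it assumes modern PostgreSQL syntax such as `ON CONFLICT`, which postdates this 2008 discussion.

```sql
-- Base table whose per-group row count we want to track.
CREATE TABLE tracked (grp integer, payload text);

-- Summary table: one summary row per group, plus zero or more
-- temporary delta rows appended when the summary row was locked.
CREATE TABLE count_summary (
    grp        integer NOT NULL,
    cnt        bigint  NOT NULL,
    is_summary boolean NOT NULL DEFAULT true
);

-- Row-level trigger: accumulate this transaction's deltas in a
-- per-transaction temp table, one row per group.
CREATE OR REPLACE FUNCTION track_count() RETURNS trigger AS $$
BEGIN
    CREATE TEMP TABLE IF NOT EXISTS txn_deltas (
        grp integer PRIMARY KEY, delta bigint
    ) ON COMMIT DROP;
    IF TG_OP IN ('INSERT', 'UPDATE') THEN
        INSERT INTO txn_deltas VALUES (NEW.grp, 1)
        ON CONFLICT (grp) DO UPDATE SET delta = txn_deltas.delta + 1;
    END IF;
    IF TG_OP IN ('DELETE', 'UPDATE') THEN
        INSERT INTO txn_deltas VALUES (OLD.grp, -1)
        ON CONFLICT (grp) DO UPDATE SET delta = txn_deltas.delta - 1;
    END IF;
    RETURN NULL;
END;
$$ LANGUAGE plpgsql;

-- Commit-time merge, to be fired once from an initially deferred
-- constraint trigger: fold this transaction's deltas into the
-- summary table, taking the lock-or-append path described above.
CREATE OR REPLACE FUNCTION merge_count_deltas() RETURNS trigger AS $$
DECLARE
    d record;
BEGIN
    FOR d IN SELECT grp, delta FROM txn_deltas LOOP
        BEGIN
            -- Try to lock the summary row without waiting.
            PERFORM 1 FROM count_summary
             WHERE grp = d.grp AND is_summary
             FOR UPDATE NOWAIT;
            IF FOUND THEN
                -- Lock acquired: fold in our delta plus any temporary
                -- delta rows left behind by other transactions.
                UPDATE count_summary
                   SET cnt = cnt + d.delta
                         + COALESCE((SELECT sum(cnt) FROM count_summary
                                      WHERE grp = d.grp
                                        AND NOT is_summary), 0)
                 WHERE grp = d.grp AND is_summary;
                DELETE FROM count_summary
                 WHERE grp = d.grp AND NOT is_summary;
            ELSE
                -- No summary row for this group yet: create one.
                INSERT INTO count_summary VALUES (d.grp, d.delta, true);
            END IF;
        EXCEPTION WHEN lock_not_available THEN
            -- Summary row is locked by another transaction: append our
            -- delta as a temporary row instead of blocking, so we never
            -- contend on the hot summary row.
            INSERT INTO count_summary VALUES (d.grp, d.delta, false);
        END;
    END LOOP;
    RETURN NULL;
END;
$$ LANGUAGE plpgsql;
```

In this sketch, `track_count` would be attached as an AFTER ROW trigger on `tracked`, and `merge_count_deltas` would run once at commit via `CREATE CONSTRAINT TRIGGER ... INITIALLY DEFERRED`. The `NOWAIT` plus append path is what avoids the serialization failures mentioned above: a transaction that cannot lock the hot summary row leaves a temporary delta row behind rather than updating a row another transaction has already modified.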