Re: Improving count(*) - Mailing list pgsql-hackers

From Tino Wildenhain
Subject Re: Improving count(*)
Date
Msg-id 437DAC4C.8080105@wildenhain.de
Whole thread Raw
In response to Re: Improving count(*)  ("Zeugswetter Andreas DCP SD" <ZeugswetterA@spardat.at>)
List pgsql-hackers
Zeugswetter Andreas DCP SD schrieb:
>>>The instant someone touches a block it would no longer be marked as 
>>>frozen (vacuum or analyze or other is not required) and count(*)
> 
> would 
> 
>>>visit the tuples in the block making the correct decision at that
> 
> time.
> 
>>Hmm, so the idea would be that if a block no longer contained any
> 
> tuples hidden from any active transaction,
> 
>>you could store the count and skip reading that page.
> 
> 
> I like the approach of informix and maxdb, that can tell the count(*)
> instantly without looking at index leaf or data pages.
> 
> Imho we could do that with a central storage of count(*) even with mvcc.
> The idea is a base value for count(*) and corrective values per open
> xid.
> To tell the count you add all corrective values whose xid is visible in
> snapshot.
> Each backend is responsibe for compacting xid counters below min open
> xid.
> Periodically (e.g. at checkpoint time) you compact (aggregate committed
> xid counters 
> into the base value) and persist the count.
> 
> Since that costs, I guess I would make it optional and combine it with
> materialized 
> views that are automatically used at runtime, and can at the same time
> answer other 
> aggregates or aggregates for groups. 
> create materialized view xx_agg enable query rewrite as select count(*),
> sum (col1) from xx
> [group by col2];
> 

I wonder how many times you really need a count(*) w/o where clause.
If I understand you correctly you are trying to optimize just this
one case?

Regards
Tino


pgsql-hackers by date:

Previous
From: "Zeugswetter Andreas DCP SD"
Date:
Subject: Re: Improving count(*)
Next
From: "Zeugswetter Andreas DCP SD"
Date:
Subject: Re: MERGE vs REPLACE