Re: Slow count(*) again... - Mailing list pgsql-performance

From Vitalii Tymchyshyn
Subject Re: Slow count(*) again...
Date
Msg-id 4CB58FDB.3080505@gmail.com
Whole thread Raw
In response to Re: Slow count(*) again...  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-performance
12.10.10 21:58, Tom Lane написав(ла):
>
> I'm less than convinced that that approach will result in a significant
> win.  It's certainly not going to do anything to convert COUNT(*) into
> an O(1) operation, which frankly is what the complainants are expecting.
> There's basically no hope of solving the "PR problem" without somehow
> turning COUNT(*) into a materialized-view reference.  We've discussed
> that in the past, and know how to do it in principle, but the complexity
> and distributed overhead are daunting.
>
>
I've though about "aggregate" indexes, something like
create index index_name on table_name(count(*) group by column1, column2);
OR
create index index_name on table_name(count(*));
for table-wide count

To make it usable one would need:
1) Allow third aggregate function SMERGE that can merge one aggregate
state to another
2) The index should be regular index (e.g. btree) on column1, column2
that for each pair has page list to which it's data may belong (in
past/current running transactions), and aggregate state for each page
that were frozen previously
When index is used, it can use precalculated values for "pages with all
tuples vacuumed" (I suspect this is information from visibility map) and
should do regular calculation for all non-frozen pages with visibility
checks and everything what's needed.
When vacuum processes the page, it should (in sync or async way)
calculate aggregate values for the page.

IMHO Such an indexes would make materialized views/triggers/high level
caches unneeded in most cases.

Best regards, Vitalii Tymchyshyn


pgsql-performance by date:

Previous
From: Vitalii Tymchyshyn
Date:
Subject: Re: Slow count(*) again...
Next
From: Craig Ringer
Date:
Subject: Re: Slow count(*) again...