Re: *sigh* - Mailing list pgsql-hackers
From | Simon Riggs |
---|---|
Subject | Re: *sigh* |
Date | |
Msg-id | 000501c3cdf7$44d71b70$7bc886d9@LaptopDellXP |
In response to | Re: *sigh* (Mark Kirkwood <markir@paradise.net.nz>) |
Responses | Re: *sigh* |
List | pgsql-hackers |
Can I chip in?

I've had a look in the past at the way various databases perform this. Most just go and read the data, though Informix does seem to keep a permanent record of the number of rows in a table, which probably adds overhead you don't really want.

Select count(*) could be evaluated against any available index sub-tables, since all that is required is to count the rows. That would be significantly faster than a full file scan, and accurate too. You'd simply count the pointers, after evaluating any WHERE clause against the indexed column values, so it would only work for fairly simple count(*) queries.

Why not implement estimated_count as a dictionary lookup, directly using the value recorded there by the analyze? That would be the easiest way to reuse existing code and give you access to many previously calculated values.

This whole area is a major performance improver, with lots of cross-overs with the materialized view sub-project.

Could you say a little more about why you wanted to achieve this?

Best Regards

Simon Riggs
2nd Quadrant
+44-7900-255520

-----Original Message-----
From: pgsql-hackers-owner@postgresql.org [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Mark Kirkwood
Sent: Monday, December 29, 2003 08:36
To: Randolf Richardson
Cc: pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] *sigh*

*growl* - it sounds like the business...and I was all set to code it, however after delving into Pg's aggregation structure a bit, it suffers a fatal flaw:

There appears to be no way to avoid visiting every row when defining an aggregate (even if you do nothing on each one) -- which defeats the whole point of my suggestion (i.e. avoiding the visit to every row).

To make the original idea work requires amending the definition of Pg aggregates to introduce "fake" aggregates that don't actually get evaluated for every row.

At this point I am not sure if this sort of modification is possible or reasonable - others who know feel free to chip in :-)

regards

Mark

Randolf Richardson wrote:

>"markir@paradise.net.nz (Mark Kirkwood)" wrote in
>comp.databases.postgresql.hackers:
>
>[sNip]
>
>>How about:
>>
>>Implement a function "estimated_count" that can be used instead of
>>"count". It could use something like the algorithm in
>>src/backend/commands/analyze.c to get a reasonably accurate pseudo count
>>quickly.
>>
>>The advantage of this approach is that "count" still means (exact)count
>>(for your xact snapshot anyway). Then the situation becomes:
>>
>>Want a fast count? - use estimated_count(*)
>>Want an exact count - use count(*)
>
> I think this is an excellent solution.

---------------------------(end of broadcast)---------------------------
TIP 7: don't forget to increase your free space map settings
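For readers following the thread, the "dictionary lookup" suggestion above can be sketched from the client side. This is only an illustration, not the estimated_count function proposed in the thread: it assumes the value in question is the row estimate that ANALYZE (and VACUUM) store in `pg_class.reltuples`, and that the database driver is DB-API compatible with `%s` placeholders, as psycopg2 is. The helper name `estimated_count` and the constant `ESTIMATE_SQL` are hypothetical.

```python
"""Sketch: read the stored row estimate instead of scanning the table."""

# pg_class.reltuples holds the row-count estimate recorded in the system
# catalog by ANALYZE/VACUUM. The %s placeholder style is an assumption
# (psycopg2-like driver); other drivers may use a different style.
ESTIMATE_SQL = "SELECT reltuples::bigint FROM pg_class WHERE oid = %s::regclass"


def estimated_count(conn, table_name):
    """Return the catalog's row estimate for table_name (hypothetical helper).

    The result is only as fresh as the last ANALYZE or VACUUM of the table;
    count(*) remains the exact answer for the current snapshot.
    """
    cur = conn.cursor()
    try:
        cur.execute(ESTIMATE_SQL, (table_name,))
        row = cur.fetchone()
    finally:
        cur.close()
    if row is None:
        raise LookupError("no catalog entry for table %r" % (table_name,))
    return int(row[0])
```

With a live connection, `estimated_count(conn, "public.orders")` would return the figure as of the last ANALYZE without visiting any rows, which sidesteps the per-row aggregate problem Mark describes, at the cost of exactness.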