Thread: estimated_count() implementation
Hi, I'm trying to implement estimated_count() function that's mentioned in the TODO list. First of all, I wanted to learn if this TODO item is still valid? I looked at the related -hackers discussions, does anybody want to say more sth related with the implementation? Also I've some questions. I'd be appreciated if somebody would answer any of the below questions to help me find my way. 1. I'm planning to use same method as ExplainOneQuery() does in backend/commands/explain.c. (Using Plan->plan_rows thatwill be returned from planner(query, isCursor, cursorOptions, params) function.) Is this the way to go, or should Ilook for another method to aggregate estimated row count. 2. I've been also considering getting called from a nodeAgg. In such a case, it shouldn't be a problem for me to use sameway as above to retrieve Query, ParamListInfo and TupOutputState. Right? 3. I was looking at int8inc() and backend/executor/nodeAgg.c and couldn't find anything special to count() aggregate. AmI looking at the right place? For instance, for my case, I won't need any transition function call. How should I modifynodeAgg.c to skip transfn calls for estimated_count()? 4. Related with the problem, any question I missed. Regards.
On Sat, Oct 21, 2006 at 11:44:19PM +0300, Volkan YAZICI wrote: > I'm trying to implement estimated_count() function that's mentioned in > the TODO list. First of all, I wanted to learn if this TODO item is > still valid? I looked at the related -hackers discussions, does anybody > want to say more sth related with the implementation? I hadn't noticed the TODO item but about a year ago I posted a cursor_plan_rows() function and asked for comments. The only reply was from Tom, who said, "Given how far off it frequently is, I can't believe that any of the people who ask for the feature would find this a satisfactory answer :-(" http://archives.postgresql.org/pgsql-hackers/2005-11/msg00579.php http://archives.postgresql.org/pgsql-hackers/2005-11/msg00580.php -- Michael Fuhr
On Oct 21 05:09, Michael Fuhr wrote: > I hadn't noticed the TODO item but about a year ago I posted a > cursor_plan_rows() function and asked for comments. Ah! I didn't see this. > The only reply was from Tom, who said, "Given how far off it > frequently is, I can't believe that any of the people who ask for the > feature would find this a satisfactory answer :-(" AFAIU, cursor_plan_rows() has some serious limitations like requiring to be executed for a portal. I was planning to make estimated_count() work for nodeAgg and custom calls too - as count() does. But OTOH, Tom's complaints look like still applicable for my estimated_count() too. Does this TODO need a little bit more clarification or we can count is a redundant one? Regards.
On Sun, 2006-10-22 at 12:07 +0300, Volkan YAZICI wrote: > On Oct 21 05:09, Michael Fuhr wrote: > > I hadn't noticed the TODO item but about a year ago I posted a > > cursor_plan_rows() function and asked for comments. > > Ah! I didn't see this. > > > The only reply was from Tom, who said, "Given how far off it > > frequently is, I can't believe that any of the people who ask for the > > feature would find this a satisfactory answer :-(" > > AFAIU, cursor_plan_rows() has some serious limitations like requiring to > be executed for a portal. I was planning to make estimated_count() work > for nodeAgg and custom calls too - as count() does. > > But OTOH, Tom's complaints look like still applicable for my > estimated_count() too. Does this TODO need a little bit more > clarification or we can count is a redundant one? http://archives.postgresql.org/pgsql-hackers/2005-11/msg00943.php Is the source of the TODO item, though please read the upthread messages as to how we got there... I think it would be a useful function... -- Simon Riggs EnterpriseDB http://www.enterprisedb.com
"Simon Riggs" <simon@2ndquadrant.com> writes: > On Sun, 2006-10-22 at 12:07 +0300, Volkan YAZICI wrote: >> But OTOH, Tom's complaints look like still applicable for my >> estimated_count() too. Does this TODO need a little bit more >> clarification or we can count is a redundant one? > http://archives.postgresql.org/pgsql-hackers/2005-11/msg00943.php > Is the source of the TODO item, though please read the upthread messages > as to how we got there... I think there is a use-case for something like select estimated_count('select * from ... where ...'); i.e. there are applications where a possibly-bad estimate is enough. What I'm concerned about is the level of complaints from newbies who'll expect it to be dead accurate all the time ... BTW, you can build estimated_count() today in a few lines of plpgsql: create or replace function estimated_count(text) returns float8 as $$ declare r text; begin for r in execute 'explain ' || $1 loop return substring(r from 'rows=([0-9]+) '); end loop; end$$ language plpgsql strict; I don't see that it really justifies any more work than that. regards, tom lane