Re: Passing arguments to views - Mailing list pgsql-hackers

From Mark Dilger
Subject Re: Passing arguments to views
Date
Msg-id 43E3A836.3050409@markdilger.com
Whole thread Raw
In response to Re: Passing arguments to views  (Josh Berkus <josh@agliodbs.com>)
Responses Re: Function Stats WAS: Passing arguments to views
List pgsql-hackers
Josh Berkus wrote:
> Tom,
> 
> 
>>As for the dependency issue, one man's bug is another man's feature.
>>I think the fact that we don't track the internal dependencies of
>>functions is not all bad.  We've certainly seen plenty of complaints
>>about how you can't easily change tables that a view is depending on
>>because the view dependencies block it...
> 
> 
> I'd agree with this.   I write about 150,000 lines of function code a year, 
> and if I had to rebuild all of the cascading functions every time I change 
> a table they way I have to with views, it would probably add 20% to my 
> overall application development time.
> 
> BTW, the other thing that we're still TODOing on SRFs (as far as I know) is 
> finding ways to change the row estimate for an SRF.  It's still a flat 
> 1000 in the code, which can cause a lot of bad query plans.  I proposed a 
> year ago that, as a first step, we allow the function owner to assign a 
> static estimate variable to the function (i.e. "average rows returned = 
> 5').  This doesn't solve the whole problem of SRF estimates but it would 
> be a significant step forwards in being able to use them in queries.
> 

This would only seem to work for trivial functions.  Most functions that I write 
are themselves dependent on underlying tables, and without any idea how many 
rows are in the tables, and without any idea of the statistical distribution of 
those rows, I can't really say anything like "average rows returned = 5".

What I have wanted for some time is a function pairing system.  For each set 
returning function F() I create, I would have the option of creating a 
statistics function S() which returns a single integer which represents the 
guess of how many rows will be returned.  S() would be called by the planner, 
and the return value of S() would be used to decide the plan.  S() would need 
access to the table statistics information.  I imagine that the system would 
want to prevent S() from running queries, and only allow it to call certain 
defined table statistics functions and some internal math functions, thereby 
avoiding any infinite recursion in the planner.  (If S() ran any queries, those 
queries would go yet again to the planner, and on down the infinite recursion 
you might go.)

Of course, some (possibly most) people could chose not to write an S() for their 
F(), and the default of 1000 rows would continue to be used.  As such, this new 
extension to the system would be backwards compatible to functions which don't 
have an S() defined.

Thoughts?


pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Passing arguments to views
Next
From: Josh Berkus
Date:
Subject: Re: Function Stats WAS: Passing arguments to views