Thread: Re: [COMMITTERS] pgsql: Add missing optimizer hooks for function cost and number of rows.
Re: [COMMITTERS] pgsql: Add missing optimizer hooks for function cost and number of rows.
From
Tom Lane
Date:
sriggs@postgresql.org (Simon Riggs) writes: > Log Message: > ----------- > Add missing optimizer hooks for function cost and number of rows. > Closely follow design of other optimizer hooks: if hook exists > retrieve value from plugin; if still not set then get from cache. What exactly are we doing adding new features without discussion (or documentation, or known use cases) at this stage of the release cycle? regards, tom lane
Re: [COMMITTERS] pgsql: Add missing optimizer hooks for function cost and number of rows.
From
Robert Haas
Date:
On Fri, Apr 23, 2010 at 6:55 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > sriggs@postgresql.org (Simon Riggs) writes: >> Log Message: >> ----------- >> Add missing optimizer hooks for function cost and number of rows. >> Closely follow design of other optimizer hooks: if hook exists >> retrieve value from plugin; if still not set then get from cache. > > What exactly are we doing adding new features without discussion (or > documentation, or known use cases) at this stage of the release cycle? I'm confused, too. It seems like there have been a LOT of patches this week that were not posted to or discussed on -hackers. I thought that was not within the ground rules. ...Robert
Re: [COMMITTERS] pgsql: Add missing optimizer hooks for function cost and number of rows.
From
Simon Riggs
Date:
On Fri, 2010-04-23 at 18:55 -0400, Tom Lane wrote: > sriggs@postgresql.org (Simon Riggs) writes: > > Log Message: > > ----------- > > Add missing optimizer hooks for function cost and number of rows. > > Closely follow design of other optimizer hooks: if hook exists > > retrieve value from plugin; if still not set then get from cache. > > What exactly are we doing adding new features without discussion (or > documentation, or known use cases) at this stage of the release cycle? Existing hooks were not fully complete in their coverage. That has happened before, and we have discussed that before on hackers, so I took care not to deviate from that implementation. This is a very low impact change, isn't in a new area and similar optimizer related changes were made recently, so I saw little to object to in this particular change. No such hooks are documented, even ones with strong use cases. -- Simon Riggs www.2ndQuadrant.com
Re: Re: [COMMITTERS] pgsql: Add missing optimizer hooks for function cost and number of rows.
From
Robert Haas
Date:
On Sat, Apr 24, 2010 at 3:31 AM, Simon Riggs <simon@2ndquadrant.com> wrote: > On Fri, 2010-04-23 at 18:55 -0400, Tom Lane wrote: >> sriggs@postgresql.org (Simon Riggs) writes: >> > Log Message: >> > ----------- >> > Add missing optimizer hooks for function cost and number of rows. >> > Closely follow design of other optimizer hooks: if hook exists >> > retrieve value from plugin; if still not set then get from cache. >> >> What exactly are we doing adding new features without discussion (or >> documentation, or known use cases) at this stage of the release cycle? > > Existing hooks were not fully complete in their coverage. That has > happened before, and we have discussed that before on hackers, so I took > care not to deviate from that implementation. This is a very low impact > change, isn't in a new area and similar optimizer related changes were > made recently, so I saw little to object to in this particular change. > No such hooks are documented, even ones with strong use cases. The point isn't whether the existing hooks are complete or not. The point is that we shouldn't be making undiscussed changes EVER, and particularly not a week before beta. Hooks are frequently proposed and rejected - every once in a while they are proposed and accepted. So it is not as if there is any reason to believe that no one could possibly object to this. And you carefully failed to answer Tom's other point about lack of use case. I think the use case for these hooks is pretty thin, but I don't really want to argue about it now. I want you to revert the patch and resubmit it for 9.1 when there is time to properly discuss it, and focus on the remaining open items so that we can put out a beta. ...Robert
Re: Re: [COMMITTERS] pgsql: Add missing optimizer hooks for function cost and number of rows.
From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes: > On Sat, Apr 24, 2010 at 3:31 AM, Simon Riggs <simon@2ndquadrant.com> wrote: >> Existing hooks were not fully complete in their coverage. That has >> happened before, and we have discussed that before on hackers, so I took >> care not to deviate from that implementation. This is a very low impact >> change, isn't in a new area and similar optimizer related changes were >> made recently, so I saw little to object to in this particular change. >> No such hooks are documented, even ones with strong use cases. > The point isn't whether the existing hooks are complete or not. The > point is that we shouldn't be making undiscussed changes EVER, and > particularly not a week before beta. I have a problem with not only the process (or lack of it) but the substance of the patch. I don't believe that a system-wide hook point has any great use for improving function estimation. You need function- specific knowledge, and this is just not a useful way to package it. When we put in the COST/ROWS options for functions, it was generally agreed that the way forward would be to generalize those, eg by allowing per-function estimator functions to be called instead of just inserting constants. (And I think the main reason we didn't just do that immediately was that we wanted a feature that could be used without doing C-level programming.) This patch doesn't do that, nor even lay any useful groundwork for doing it. It would be impossible for instance for this hook function to lay its hands on the arguments to the function to be estimated, which certainly begs the question as to how it's going to deliver any estimate more useful than the constant value. Please revert. regards, tom lane
Re: Re: [COMMITTERS] pgsql: Add missing optimizer hooks for function cost and number of rows.
From
Simon Riggs
Date:
On Sat, 2010-04-24 at 11:17 -0400, Tom Lane wrote: > Robert Haas <robertmhaas@gmail.com> writes: > > On Sat, Apr 24, 2010 at 3:31 AM, Simon Riggs <simon@2ndquadrant.com> wrote: > >> Existing hooks were not fully complete in their coverage. That has > >> happened before, and we have discussed that before on hackers, so I took > >> care not to deviate from that implementation. This is a very low impact > >> change, isn't in a new area and similar optimizer related changes were > >> made recently, so I saw little to object to in this particular change. > >> No such hooks are documented, even ones with strong use cases. > > > The point isn't whether the existing hooks are complete or not. The > > point is that we shouldn't be making undiscussed changes EVER, and > > particularly not a week before beta. > > I have a problem with not only the process (or lack of it) but the > substance of the patch. I don't believe that a system-wide hook point > has any great use for improving function estimation. You need function- > specific knowledge, and this is just not a useful way to package it. I've revoked it. > When we put in the COST/ROWS options for functions, it was generally > agreed that the way forward would be to generalize those, eg by allowing > per-function estimator functions to be called instead of just inserting > constants. I completely agree that the above is the best way forwards. > (And I think the main reason we didn't just do that > immediately was that we wanted a feature that could be used without > doing C-level programming.) This patch doesn't do that, nor even lay > any useful groundwork for doing it. It would be impossible for instance > for this hook function to lay its hands on the arguments to the function > to be estimated, which certainly begs the question as to how it's going > to deliver any estimate more useful than the constant value. We can override table stats but not functions stats. I noticed that hole in the previous implementation, and rectified it with the intention of helping users with what I saw as a small, low risk patch that exactly followed existing code. It would seem I did that too quickly without realising that an objection would occur. For the record, it's possible to use the main optimizer hooks to get the same result, so I'm in no way personally blocked by this. I think that's harder, so others may not find it as easy. This may actually help obtain funding to implement the full approach, maybe. -- Simon Riggs www.2ndQuadrant.com
Re: Re: [COMMITTERS] pgsql: Add missing optimizer hooks for function cost and number of rows.
From
Tom Lane
Date:
Simon Riggs <simon@2ndQuadrant.com> writes: > We can override table stats but not functions stats. I noticed that hole > in the previous implementation, and rectified it with the intention of > helping users with what I saw as a small, low risk patch that exactly > followed existing code. It would seem I did that too quickly without > realising that an objection would occur. Well, you did it without much thought at all. I think this episode is a perfect demonstration of why we ask for concrete use-cases for proposed hooks. If you'd actually tried to write something that used the hook, you'd surely have noticed that it wasn't being passed the information that it would need to do anything useful, and you'd probably have recognized the problem that there's no good way for a single hook function to provide an extensible collection of function-specific knowledge. But the other point is that people aren't going to want to have to write C-language hook functions in order to provide estimators for user-defined functions. We need to think of something higher-level than that. I think there was some discussion of generalizing the COST/ROWS constants into SQL expressions using the function arguments, which the planner could evaluate if it could reduce the arguments to constants. I'm not sure if that would be adequate or even useful, but it seems more likely to be helpful than a bare hook function. regards, tom lane
Re: Re: [COMMITTERS] pgsql: Add missing optimizer hooks for function cost and number of rows.
From
Simon Riggs
Date:
On Sat, 2010-04-24 at 12:59 -0400, Tom Lane wrote: > Well, you did it without much thought at all. To the consequences, definitely. > I think this episode is a > perfect demonstration of why we ask for concrete use-cases for proposed > hooks. If you'd actually tried to write something that used the hook, > you'd surely have noticed that it wasn't being passed the information > that it would need to do anything useful, and you'd probably have > recognized the problem that there's no good way for a single hook > function to provide an extensible collection of function-specific > knowledge. To the value, no. The limitations of the hook approach were clear, but they do at least allow overriding values on a session by session basis, allowing you to write a program to estimate the result and then set the function costs accordingly. It's not clever or the best way, but it was the same situation as the other hooks currently provide and I imagined it would be accepted without question, wrongly. -- Simon Riggs www.2ndQuadrant.com