Thread: Hints (was Poor performance using CTE)
On Tue, Nov 20, 2012 at 3:53 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
I'm probably beating a dead horse ... but isn't this just a hint? Except that it's worse than a hint, because it's a hint in disguise and is undocumented. As far as I can tell, there's no use for "OFFSET 0" except to act as an optimizer fence.
It's clearly an important need, given the nature of the dialog above (and many others that have passed through this mailing list).
Why not make an explicit hint syntax and document it? I've still don't understand why "hint" is a dirty word in Postgres. There are a half-dozen or so ways in common use to circumvent or correct sub-optimal plans.
Craig James
Craig Ringer <craig@2ndQuadrant.com> writes:
> On 11/21/2012 12:06 AM, Claudio Freire wrote:
>> I meant for postgres to do automatically. Rewriting as a join wouldn't
>> work as an optimization fence the way we're used to, but pushing
>> constraints upwards can only help (especially if highly selective).
> Because people are now used to using CTEs as query hints, it'd probably
> cause performance regressions in working queries. Perhaps more
> importantly, Pg would have to prove that doing so didn't change queries
> that invoked functions with side-effects to avoid changing the results
> of currently valid queries.
We could trivially arrange to keep the current semantics if the CTE
query contains any volatile functions (or of course if it's
INSERT/UPDATE/DELETE). I think we'd also need to not optimize if
it's invoked from more than one place in the outer query.
I think the more interesting question is what cases wouldn't be covered
by such a rule. Typically you need to use OFFSET 0 in situations where
the planner has guessed wrong about costs or rowcounts, and I think
people are likely using WITH for that as well. Should we be telling
people that they ought to insert OFFSET 0 in WITH queries if they want
to be sure there's an optimization fence?
I'm probably beating a dead horse ... but isn't this just a hint? Except that it's worse than a hint, because it's a hint in disguise and is undocumented. As far as I can tell, there's no use for "OFFSET 0" except to act as an optimizer fence.
It's clearly an important need, given the nature of the dialog above (and many others that have passed through this mailing list).
Why not make an explicit hint syntax and document it? I've still don't understand why "hint" is a dirty word in Postgres. There are a half-dozen or so ways in common use to circumvent or correct sub-optimal plans.
Craig James
regards, tom lane
--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
On 11/21/2012 09:35 AM, Craig James wrote: > Why not make an explicit hint syntax and document it? I've still don't > understand why "hint" is a dirty word in Postgres. There are a > half-dozen or so ways in common use to circumvent or correct > sub-optimal plans. > The reason usually given is that hints provide easy workarounds for planner and stats issues, so people don't report problems or fix the underlying problem. Of course, if that's all there was to it, `OFFSET 0` would be made into an error or warning, or ignored and not fenced. The reality is, as you say, that there's a need, because the planner can never be perfect - or rather, if it were nearly perfect, it'd take so long to read the stats and calculate plans that everything would be glacially slow anyway. The planner has to compromise, and so cases will always arise where it needs a little help. I think it's time to admit that and get the syntax in place for CTEs so there's room to optimize them later, rather than cementing CTEs-as-fences in forever as a Pg quirk. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On 11/20/2012 08:15 PM, Craig Ringer wrote: > I think it's time to admit that and get the syntax in place for CTEs so > there's room to optimize them later, rather than cementing > CTEs-as-fences in forever as a Pg quirk. I know I'm just some schmo, but I'd vote for this. I'm certainly guilty of using OFFSET 0. Undocumented hints are still hints. As much as I think they're a bad idea by cementing a certain plan that may not get the benefits of future versions, non-intuitive side-effects by using overloaded syntax are worse. I've been using CTEs as temp tables because I know that's how they work. But I'd be more than willing to modify my syntax one way or the other to adopt non-materialized CTEs, provided there's some way to get the current behavior. -- Shaun Thomas OptionsHouse | 141 W. Jackson Blvd. | Suite 500 | Chicago IL, 60604 312-444-8534 sthomas@optionshouse.com ______________________________________________ See http://www.peak6.com/email_disclaimer/ for terms and conditions related to this email
Craig Ringer wrote: > On 11/21/2012 09:35 AM, Craig James wrote: >> Why not make an explicit hint syntax and document it? I've still >> don't understand why "hint" is a dirty word in Postgres. There are >> a half-dozen or so ways in common use to circumvent or correct >> sub-optimal plans. > > The reason usually given is that hints provide easy workarounds for > planner and stats issues, so people don't report problems or fix > the underlying problem. > > Of course, if that's all there was to it, `OFFSET 0` would be made > into an error or warning, or ignored and not fenced. > > The reality is, as you say, that there's a need, because the > planner can never be perfect - or rather, if it were nearly > perfect, it'd take so long to read the stats and calculate plans > that everything would be glacially slow anyway. The planner has to > compromise, and so cases will always arise where it needs a little > help. > > I think it's time to admit that and get the syntax in place for > CTEs so there's room to optimize them later, rather than cementing > CTEs-as-fences in forever as a Pg quirk. It's a tough problem. Disguising and not documenting the available optimizer hints leads to more reports on where the optimizer should be smarter, and has spurred optimizer improvements. And many type of hints would undoubtedly cause people to force what they *think* would be the best plan in many cases where they are wrong, or become wrong as data scales up. But it does seem odd every time I hear people saying that they don't want to eliminate some optimization fence because "they find it useful" while simultaneously arguing that we don't have or want hints. Having a way to coerce the optimizer from the plan it would take with straightforward coding *is* a hint, and one down-side of hiding the hints inside syntax mostly supported for other reasons is that people who don't know about these clever devices can't do reasonable refactoring of queries for readability without risking performance regressions. Another down-side is that perfectly reasonable queries ported from other databases that use hint syntax for hints run afoul of the secret hints when trying to run queries on PostgreSQL, and get performance potentially orders of magnitude worse than they expect. I'm not sure what the best answer is, but as long as we have hints, but only through OFFSET 0 or CTE usage, that should be documented. Better, IMV, would be to identify what sorts of hints people actually find useful, and use that as the basis for TODO items for optimizer improvement as well as inventing clear ways to specify the desired coercion. I liked the suggestion that a CTE which didn't need to be materialized because of side-effects or multiple references have a keyword. Personally, I think that AS MATERIALIZED x (SELECT ...) would be preferable to AS x (SELECT ... OFFSET 0) as the syntax to specify that. Regarding the above-mentioned benefits we would stand to lose by having clear and documented hints, perhaps we could occasionally solicit input on where people are finding hints useful to get ideas on where we might want to improve the optimizer. As far as worrying about people using hints to force a plan which is sub-optimal -- isn't that getting into nanny mode a bit too much? -Kevin
On 21.11.2012 15:42, Kevin Grittner wrote: > Better, IMV, would be to identify what sorts of hints people actually > find useful, and use that as the basis for TODO items for optimizer > improvement as well as inventing clear ways to specify the desired > coercion. I liked the suggestion that a CTE which didn't need to be > materialized because of side-effects or multiple references have a > keyword. Personally, I think that AS MATERIALIZED x (SELECT ...) > would be preferable to AS x (SELECT ... OFFSET 0) as the syntax to > specify that. Rather than telling the planner what to do or not to do, I'd much rather have hints that give the planner more information about the tables and quals involved in the query. A typical source of bad plans is when the planner gets its cost estimates wrong. So rather than telling the planner to use a nested loop join for "a INNER JOIN b ON a.id = b.id", the user could tell the planner that there are only 10 rows that match the "a.id = b.id" qual. That gives the planner the information it needs to choose the right plan on its own. That kind of hints would be much less implementation specific and much more likely to still be useful, or at least not outright counter-productive, in a future version with a smarter planner. You could also attach that kind of hints to tables and columns, which would be more portable and nicer than decorating all queries. - Heikki
On Wed, Nov 21, 2012 at 5:42 AM, Kevin Grittner <kgrittn@mail.com> wrote:
Toward that end, the hint documentation (which is almost always viewed as HTML) could be prefaced by a strong suggestion to post performance questions in this group first, with links to the "subscribe" page and the "how to report performance problems" FAQ. The hint documentation could even be minimalistic; suggest to developers that they should post their problematic queries here before resorting to hints. That would give the experts an opportunity to provide the normal advice. The correct hint syntax would be suggested only when all other avenues failed.
Craig James
It's a tough problem. Disguising and not documenting the available
optimizer hints leads to more reports on where the optimizer should
be smarter, and has spurred optimizer improvements. ...
Regarding the above-mentioned benefits we would stand to lose by
having clear and documented hints, perhaps we could occasionally
solicit input on where people are finding hints useful to get ideas
on where we might want to improve the optimizer. As far as worrying
about people using hints to force a plan which is sub-optimal --
isn't that getting into nanny mode a bit too much?
Toward that end, the hint documentation (which is almost always viewed as HTML) could be prefaced by a strong suggestion to post performance questions in this group first, with links to the "subscribe" page and the "how to report performance problems" FAQ. The hint documentation could even be minimalistic; suggest to developers that they should post their problematic queries here before resorting to hints. That would give the experts an opportunity to provide the normal advice. The correct hint syntax would be suggested only when all other avenues failed.
Craig James
-Kevin
On 11/21/2012 08:05 AM, Heikki Linnakangas wrote: > Rather than telling the planner what to do or not to do, I'd much rather > have hints that give the planner more information about the tables and > quals involved in the query. A typical source of bad plans is when the > planner gets its cost estimates wrong. So rather than telling the > planner to use a nested loop join for "a INNER JOIN b ON a.id = b.id", > the user could tell the planner that there are only 10 rows that match > the "a.id = b.id" qual. That gives the planner the information it needs > to choose the right plan on its own. That kind of hints would be much > less implementation specific and much more likely to still be useful, or > at least not outright counter-productive, in a future version with a > smarter planner. > > You could also attach that kind of hints to tables and columns, which > would be more portable and nicer than decorating all queries. I like this idea, but also think that if we have a syntax to allow hints, it would be nice to have a simple way to ignore all hints (yes, I suppose I'm suggesting yet another GUC). That way after sprinkling your SQL with hints, you could easily periodically (e.g. after a Postgres upgrade) test what would happen if the hints were removed. Joe -- Joe Conway credativ LLC: http://www.credativ.us Linux, PostgreSQL, and general Open Source Training, Service, Consulting, & 24x7 Support
On Wed, Nov 21, 2012 at 9:25 AM, Joe Conway <mail@joeconway.com> wrote:
Or a three-way choice: Allow, ignore, or generate an error. That would allow developers to identify where hints are being used.
Craig
On 11/21/2012 08:05 AM, Heikki Linnakangas wrote:
> Rather than telling the planner what to do or not to do, I'd much rather
> have hints that give the planner more information about the tables and
> quals involved in the query. A typical source of bad plans is when the
> planner gets its cost estimates wrong. So rather than telling the
> planner to use a nested loop join for "a INNER JOIN b ON a.id = b.id",
> the user could tell the planner that there are only 10 rows that match
> the "a.id = b.id" qual. That gives the planner the information it needs
> to choose the right plan on its own. That kind of hints would be much
> less implementation specific and much more likely to still be useful, or
> at least not outright counter-productive, in a future version with a
> smarter planner.
>
> You could also attach that kind of hints to tables and columns, which
> would be more portable and nicer than decorating all queries.
I like this idea, but also think that if we have a syntax to allow
hints, it would be nice to have a simple way to ignore all hints (yes, I
suppose I'm suggesting yet another GUC). That way after sprinkling your
SQL with hints, you could easily periodically (e.g. after a Postgres
upgrade) test what would happen if the hints were removed.
Or a three-way choice: Allow, ignore, or generate an error. That would allow developers to identify where hints are being used.
Craig
Joe
--
Joe Conway
credativ LLC: http://www.credativ.us
Linux, PostgreSQL, and general Open Source
Training, Service, Consulting, & 24x7 Support
On 11/21/2012 09:28 AM, Craig James wrote: > > > On Wed, Nov 21, 2012 at 9:25 AM, Joe Conway <mail@joeconway.com > <mailto:mail@joeconway.com>> wrote: > > On 11/21/2012 08:05 AM, Heikki Linnakangas wrote: > > Rather than telling the planner what to do or not to do, I'd much > rather > > have hints that give the planner more information about the tables and > > quals involved in the query. A typical source of bad plans is when the > > planner gets its cost estimates wrong. So rather than telling the > > planner to use a nested loop join for "a INNER JOIN b ON a.id > <http://a.id> = b.id <http://b.id>", > > the user could tell the planner that there are only 10 rows that match > > the "a.id <http://a.id> = b.id <http://b.id>" qual. That gives the > planner the information it needs > > to choose the right plan on its own. That kind of hints would be much > > less implementation specific and much more likely to still be > useful, or > > at least not outright counter-productive, in a future version with a > > smarter planner. > > > > You could also attach that kind of hints to tables and columns, which > > would be more portable and nicer than decorating all queries. > > I like this idea, but also think that if we have a syntax to allow > hints, it would be nice to have a simple way to ignore all hints (yes, I > suppose I'm suggesting yet another GUC). That way after sprinkling your > SQL with hints, you could easily periodically (e.g. after a Postgres > upgrade) test what would happen if the hints were removed. > > > Or a three-way choice: Allow, ignore, or generate an error. That would > allow developers to identify where hints are being used. +1 Joe -- Joe Conway credativ LLC: http://www.credativ.us Linux, PostgreSQL, and general Open Source Training, Service, Consulting, & 24x7 Support
Craig James <cjames@emolecules.com> writes: > On Wed, Nov 21, 2012 at 9:25 AM, Joe Conway <mail@joeconway.com> wrote: >> I like this idea, but also think that if we have a syntax to allow >> hints, it would be nice to have a simple way to ignore all hints (yes, I >> suppose I'm suggesting yet another GUC). That way after sprinkling your >> SQL with hints, you could easily periodically (e.g. after a Postgres >> upgrade) test what would happen if the hints were removed. > Or a three-way choice: Allow, ignore, or generate an error. That would > allow developers to identify where hints are being used. Throwing errors would likely prevent you from reaching all parts of your application, thus preventing complete testing. Much more sensible to just log such queries. regards, tom lane
Craig James <cjames@emolecules.com> wrote: > On 11/21/2012 08:05 AM, Heikki Linnakangas wrote: > > Rather than telling the planner what to do or not to do, I'd much rather > > have hints that give the planner more information about the tables and > > quals involved in the query. A typical source of bad plans is when the > > planner gets its cost estimates wrong. So rather than telling the > > planner to use a nested loop join for "a INNER JOIN b ON a.id = b.id", > > the user could tell the planner that there are only 10 rows that match > > the "a.id = b.id" qual. That gives the planner the information it needs > > to choose the right plan on its own. That kind of hints would be much > > less implementation specific and much more likely to still be useful, or > > at least not outright counter-productive, in a future version with a > > smarter planner. > > > > You could also attach that kind of hints to tables and columns, which > > would be more portable and nicer than decorating all queries. > > I like this idea, but also think that if we have a syntax to allow > hints, it would be nice to have a simple way to ignore all hints (yes, I > suppose I'm suggesting yet another GUC). That way after sprinkling your > SQL with hints, you could easily periodically (e.g. after a Postgres > upgrade) test what would happen if the hints were removed. > > > Or a three-way choice: Allow, ignore, or generate an error. That would allow > developers to identify where hints are being used. > > Craig +1 I think, we HAVE a smart planner, but hints in this direction are okay, and we need a simple way to make such hints obsolete - for/in the future. Andreas -- Really, I'm not out to destroy Microsoft. That will just be a completely unintentional side effect. (Linus Torvalds) "If I was god, I would recompile penguin with --enable-fly." (unknown) Kaufbach, Saxony, Germany, Europe. N 51.05082°, E 13.56889°
>
> Rather than telling the planner what to do or not to do, I'd much rather
> have hints that give the planner more information about the tables and
> quals involved in the query. A typical source of bad plans is when the
> planner gets its cost estimates wrong. So rather than telling the
> planner to use a nested loop join for "a INNER JOIN b ON a.id = b.id",
> the user could tell the planner that there are only 10 rows that match
> the "a.id = b.id" qual.
Instead of gathering statistics for all possible joins ( and join orders) , in Oracle there is a functionality that can be switched on where the optimizer is given cardinality feedback for the chosen plans, so it can choose another plan if the same statement comes around.
Secondly, there is functionality to insert a hint into an SQL statement. That's very good for COTS apps where the statement can't be altered. Now I know that there's relatively not much COTS for the Postgresql, ( hence arguments like 'we should not implement hints so we're forcing people to solve the underlying problem' ), but as Postgresql will replace oracle in the lower end of the market, this functionality is usefull.
> Rather than telling the planner what to do or not to do, I'd much rather
> have hints that give the planner more information about the tables and
> quals involved in the query. A typical source of bad plans is when the
> planner gets its cost estimates wrong. So rather than telling the
> planner to use a nested loop join for "a INNER JOIN b ON a.id = b.id",
> the user could tell the planner that there are only 10 rows that match
> the "a.id = b.id" qual.
Instead of gathering statistics for all possible joins ( and join orders) , in Oracle there is a functionality that can be switched on where the optimizer is given cardinality feedback for the chosen plans, so it can choose another plan if the same statement comes around.
Secondly, there is functionality to insert a hint into an SQL statement. That's very good for COTS apps where the statement can't be altered. Now I know that there's relatively not much COTS for the Postgresql, ( hence arguments like 'we should not implement hints so we're forcing people to solve the underlying problem' ), but as Postgresql will replace oracle in the lower end of the market, this functionality is usefull.
On Wed, Nov 21, 2012 at 8:05 AM, Heikki Linnakangas <hlinnakangas@vmware.com> wrote: > On 21.11.2012 15:42, Kevin Grittner wrote: >> >> Better, IMV, would be to identify what sorts of hints people actually >> find useful, and use that as the basis for TODO items for optimizer >> improvement as well as inventing clear ways to specify the desired >> coercion. I liked the suggestion that a CTE which didn't need to be >> materialized because of side-effects or multiple references have a >> keyword. Personally, I think that AS MATERIALIZED x (SELECT ...) >> would be preferable to AS x (SELECT ... OFFSET 0) as the syntax to >> specify that. > > > Rather than telling the planner what to do or not to do, I'd much rather > have hints that give the planner more information about the tables and quals > involved in the query. A typical source of bad plans is when the planner > gets its cost estimates wrong. So rather than telling the planner to use a > nested loop join for "a INNER JOIN b ON a.id = b.id", the user could tell > the planner that there are only 10 rows that match the "a.id = b.id" qual. For each a.id there are 10 b.id, or for each b.id there are 10 a.id? > That gives the planner the information it needs to choose the right plan on > its own. That kind of hints would be much less implementation specific and > much more likely to still be useful, or at least not outright > counter-productive, in a future version with a smarter planner. When I run into unexpectedly poor performance, I have an intuitive enough feel for my own data that I know what plan it ought to be using. Figuring out why it is not using it is very hard. For one thing, EXPLAIN tells you about the "winning" plan, but there is no visibility into what ought to be the winning plan but isn't, so no way to see why it isn't. So you first have to use our existing non-hint hints (enable_*, doing weird things with cost_*, CTE stuff) to trick it into using the plan I want it to use, before I can figure out why it isn't using it, before I could figure out what hints of the style you are suggesting to supply to get it to use it. So I think the type of hints you are suggesting would be about as hard for the user to use as debugging the planner for the particular case would be. While the more traditional type of hint is easy to use, because the end user understands their data more than they understand the guts of the planner. Cheers, Jeff
On 22.11.2012 02:53, Jeff Janes wrote: >> That gives the planner the information it needs to choose the right plan on >> its own. That kind of hints would be much less implementation specific and >> much more likely to still be useful, or at least not outright >> counter-productive, in a future version with a smarter planner. > > When I run into unexpectedly poor performance, I have an intuitive > enough feel for my own data that I know what plan it ought to be > using. Figuring out why it is not using it is very hard. For one > thing, EXPLAIN tells you about the "winning" plan, but there is no > visibility into what ought to be the winning plan but isn't, so no way > to see why it isn't. So you first have to use our existing non-hint > hints (enable_*, doing weird things with cost_*, CTE stuff) to trick > it into using the plan I want it to use, before I can figure out why > it isn't using it, before I could figure out what hints of the style > you are suggesting to supply to get it to use it. I'm sure that happens too, but my gut feeling is that more often the EXPLAIN ANALYZE output reveals a bad estimate somewhere in the plan, and the planner chooses a bad plan based on the bad estimate. If you hint the planner by giving a better estimate for where the estimator got it wrong, the planner will choose the desired plan. - Heikki
On 22/11/12 06:28, Craig James wrote:
Or perhaps hints should have the pg version attached, so that they are automatically ignored when the pg version changed? Problem may then become people reluctant to upgrade because their hints relate to a previous version! Sigh...On Wed, Nov 21, 2012 at 9:25 AM, Joe Conway <mail@joeconway.com> wrote:On 11/21/2012 08:05 AM, Heikki Linnakangas wrote:
> Rather than telling the planner what to do or not to do, I'd much rather
> have hints that give the planner more information about the tables and
> quals involved in the query. A typical source of bad plans is when the
> planner gets its cost estimates wrong. So rather than telling the
> planner to use a nested loop join for "a INNER JOIN b ON a.id = b.id",
> the user could tell the planner that there are only 10 rows that match
> the "a.id = b.id" qual. That gives the planner the information it needs
> to choose the right plan on its own. That kind of hints would be much
> less implementation specific and much more likely to still be useful, or
> at least not outright counter-productive, in a future version with a
> smarter planner.
>
> You could also attach that kind of hints to tables and columns, which
> would be more portable and nicer than decorating all queries.
I like this idea, but also think that if we have a syntax to allow
hints, it would be nice to have a simple way to ignore all hints (yes, I
suppose I'm suggesting yet another GUC). That way after sprinkling your
SQL with hints, you could easily periodically (e.g. after a Postgres
upgrade) test what would happen if the hints were removed.
Or a three-way choice: Allow, ignore, or generate an error. That would allow developers to identify where hints are being used.
Craig
Joe
--
Joe Conway
credativ LLC: http://www.credativ.us
Linux, PostgreSQL, and general Open Source
Training, Service, Consulting, & 24x7 Support
Even requiring registration of hints and expiring them after a limited time period would not work - as people would simply automate the process of registration & application...
Cheers,
Gavin
Le mercredi 21 novembre 2012 17:34:02, Craig James a écrit : > On Wed, Nov 21, 2012 at 5:42 AM, Kevin Grittner <kgrittn@mail.com> wrote: > > It's a tough problem. Disguising and not documenting the available > > optimizer hints leads to more reports on where the optimizer should > > be smarter, and has spurred optimizer improvements. ... > > Regarding the above-mentioned benefits we would stand to lose by > > having clear and documented hints, perhaps we could occasionally > > solicit input on where people are finding hints useful to get ideas > > on where we might want to improve the optimizer. As far as worrying > > about people using hints to force a plan which is sub-optimal -- > > isn't that getting into nanny mode a bit too much? > > Toward that end, the hint documentation (which is almost always viewed as > HTML) could be prefaced by a strong suggestion to post performance > questions in this group first, with links to the "subscribe" page and the > "how to report performance problems" FAQ. The hint documentation could even > be minimalistic; suggest to developers that they should post their > problematic queries here before resorting to hints. That would give the > experts an opportunity to provide the normal advice. The correct hint > syntax would be suggested only when all other avenues failed. We have hooks in PostgreSQL. We already have at least one extension which is using that to change the planner behavior. We can have a bit more hooks and try to improve the cost estimate, this part of the code is known to be built by reports and human estimations, also the 9.2 version got heavy modifications in this area. Let the 'Hints' be inside an extension thus we are able to track them and fix the planner/costestimate issues. I don't see why PostgreSQL needs 'Hints' *in-core*. -- Cédric Villemain +33 (0)6 20 30 22 52 http://2ndQuadrant.fr/ PostgreSQL: Support 24x7 - Développement, Expertise et Formation
Attachment
On Fri, Nov 23, 2012 at 3:05 AM, Cédric Villemain <cedric@2ndquadrant.com> wrote: > Le mercredi 21 novembre 2012 17:34:02, Craig James a écrit : >> On Wed, Nov 21, 2012 at 5:42 AM, Kevin Grittner <kgrittn@mail.com> wrote: >> > It's a tough problem. Disguising and not documenting the available >> > optimizer hints leads to more reports on where the optimizer should >> > be smarter, and has spurred optimizer improvements. ... >> > Regarding the above-mentioned benefits we would stand to lose by >> > having clear and documented hints, perhaps we could occasionally >> > solicit input on where people are finding hints useful to get ideas >> > on where we might want to improve the optimizer. As far as worrying >> > about people using hints to force a plan which is sub-optimal -- >> > isn't that getting into nanny mode a bit too much? >> >> Toward that end, the hint documentation (which is almost always viewed as >> HTML) could be prefaced by a strong suggestion to post performance >> questions in this group first, with links to the "subscribe" page and the >> "how to report performance problems" FAQ. The hint documentation could even >> be minimalistic; suggest to developers that they should post their >> problematic queries here before resorting to hints. That would give the >> experts an opportunity to provide the normal advice. The correct hint >> syntax would be suggested only when all other avenues failed. > > We have hooks in PostgreSQL. We already have at least one extension which is > using that to change the planner behavior. > > We can have a bit more hooks and try to improve the cost estimate, this part > of the code is known to be built by reports and human estimations, also the > 9.2 version got heavy modifications in this area. > > Let the 'Hints' be inside an extension thus we are able to track them and fix > the planner/costestimate issues. > > I don't see why PostgreSQL needs 'Hints' *in-core*. Here here! PostgreSQL is well known for its extensibility and this is the perfect place for hints. That way they can get worked on without becoming a crutch for every user and forcing the backend developers to support what may or may not be a good idea syntax wise. After a few different people have banged some code out to make workable hint syntaxes for their own use maybe then it will be time to revisit adding hints to core.
On 27/11/2012 3:42 PM, Scott Marlowe wrote: > Here here! PostgreSQL is well known for its extensibility and this is > the perfect place for hints. I agree with the sentiment and your concerns. However, this doesn't solve the CTE problem. Some people are relying on the planner's inability to push conditions into / pull conditions out of CTEs, and otherwise re-arrange them. If support for optimising into eligible CTEs (ie CTE terms that contain only SELECT or VALUES and call no VOLATILE functions) then these applications will potentially encounter serious performance regressions. Should this feature never be added to Pg, making it different and incompatible with other DBs that implement CTE optimisation, just because some people are using it for a hacky hint like OFFSET 0? Should these applications just be broken by the update, with people told to add `OFFSET 0` or load some not-yet-existing hints module after reporting the performance issue to the list? I don't think either of those are acceptable. Sooner or later somebody's going to want to add CTE optimisation, and I don't think that "you can't" or "great, we'll do it and break everything" are acceptable responses to any proposed patch someone might come up with to add that. A GUC might be OK, as apps can always SET it before problem queries or not-yet-ported code. It'd probably reduce the rate at which people fixed their code considerably, though, going by past experience with standard_conforming_strings, etc, but it'd work. -- Craig Ringer
On Tue, Nov 27, 2012 at 7:17 PM, Craig Ringer <craig@2ndquadrant.com> wrote: > On 27/11/2012 3:42 PM, Scott Marlowe wrote: > >> Here here! PostgreSQL is well known for its extensibility and this is >> the perfect place for hints. > > I agree with the sentiment and your concerns. However, this doesn't solve > the CTE problem. > > Some people are relying on the planner's inability to push conditions into / > pull conditions out of CTEs, and otherwise re-arrange them. If support for > optimising into eligible CTEs (ie CTE terms that contain only SELECT or > VALUES and call no VOLATILE functions) then these applications will > potentially encounter serious performance regressions. > > Should this feature never be added to Pg, making it different and > incompatible with other DBs that implement CTE optimisation, just because > some people are using it for a hacky hint like OFFSET 0? I'm strictly talking about any hinting mechanism being added being an extension. Fixing the planner so that optimizations can get cross the CTE boundary seems the domain of back end hackers, not extensions.
On 28/11/12 15:17, Craig Ringer wrote:
On 27/11/2012 3:42 PM, Scott Marlowe wrote:I think it would be best to be something in the SQL for SELECT, as:Here here! PostgreSQL is well known for its extensibility and this isI agree with the sentiment and your concerns. However, this doesn't solve the CTE problem.
the perfect place for hints.
Some people are relying on the planner's inability to push conditions into / pull conditions out of CTEs, and otherwise re-arrange them. If support for optimising into eligible CTEs (ie CTE terms that contain only SELECT or VALUES and call no VOLATILE functions) then these applications will potentially encounter serious performance regressions.
Should this feature never be added to Pg, making it different and incompatible with other DBs that implement CTE optimisation, just because some people are using it for a hacky hint like OFFSET 0?
Should these applications just be broken by the update, with people told to add `OFFSET 0` or load some not-yet-existing hints module after reporting the performance issue to the list?
I don't think either of those are acceptable. Sooner or later somebody's going to want to add CTE optimisation, and I don't think that "you can't" or "great, we'll do it and break everything" are acceptable responses to any proposed patch someone might come up with to add that.
A GUC might be OK, as apps can always SET it before problem queries or not-yet-ported code. It'd probably reduce the rate at which people fixed their code considerably, though, going by past experience with standard_conforming_strings, etc, but it'd work.
--
Craig Ringer
- One is more likely to find it by looking up the documentation for SELECT
- It could allow selective application within a SELECT: one could have several queries within the WITH clause: where all except one might benefit for optimisation, and the exception might cause problems
Cheers,
Gavin