Re: Hints (was Poor performance using CTE) - Mailing list pgsql-performance
From | Kevin Grittner |
---|---|
Subject | Re: Hints (was Poor performance using CTE) |
Date | |
Msg-id | 20121121134232.68590@gmx.com Whole thread Raw |
In response to | Hints (was Poor performance using CTE) (Craig James <cjames@emolecules.com>) |
Responses |
Re: Hints (was Poor performance using CTE)
Re: Hints (was Poor performance using CTE) |
List | pgsql-performance |
Craig Ringer wrote: > On 11/21/2012 09:35 AM, Craig James wrote: >> Why not make an explicit hint syntax and document it? I've still >> don't understand why "hint" is a dirty word in Postgres. There are >> a half-dozen or so ways in common use to circumvent or correct >> sub-optimal plans. > > The reason usually given is that hints provide easy workarounds for > planner and stats issues, so people don't report problems or fix > the underlying problem. > > Of course, if that's all there was to it, `OFFSET 0` would be made > into an error or warning, or ignored and not fenced. > > The reality is, as you say, that there's a need, because the > planner can never be perfect - or rather, if it were nearly > perfect, it'd take so long to read the stats and calculate plans > that everything would be glacially slow anyway. The planner has to > compromise, and so cases will always arise where it needs a little > help. > > I think it's time to admit that and get the syntax in place for > CTEs so there's room to optimize them later, rather than cementing > CTEs-as-fences in forever as a Pg quirk. It's a tough problem. Disguising and not documenting the available optimizer hints leads to more reports on where the optimizer should be smarter, and has spurred optimizer improvements. And many type of hints would undoubtedly cause people to force what they *think* would be the best plan in many cases where they are wrong, or become wrong as data scales up. But it does seem odd every time I hear people saying that they don't want to eliminate some optimization fence because "they find it useful" while simultaneously arguing that we don't have or want hints. Having a way to coerce the optimizer from the plan it would take with straightforward coding *is* a hint, and one down-side of hiding the hints inside syntax mostly supported for other reasons is that people who don't know about these clever devices can't do reasonable refactoring of queries for readability without risking performance regressions. Another down-side is that perfectly reasonable queries ported from other databases that use hint syntax for hints run afoul of the secret hints when trying to run queries on PostgreSQL, and get performance potentially orders of magnitude worse than they expect. I'm not sure what the best answer is, but as long as we have hints, but only through OFFSET 0 or CTE usage, that should be documented. Better, IMV, would be to identify what sorts of hints people actually find useful, and use that as the basis for TODO items for optimizer improvement as well as inventing clear ways to specify the desired coercion. I liked the suggestion that a CTE which didn't need to be materialized because of side-effects or multiple references have a keyword. Personally, I think that AS MATERIALIZED x (SELECT ...) would be preferable to AS x (SELECT ... OFFSET 0) as the syntax to specify that. Regarding the above-mentioned benefits we would stand to lose by having clear and documented hints, perhaps we could occasionally solicit input on where people are finding hints useful to get ideas on where we might want to improve the optimizer. As far as worrying about people using hints to force a plan which is sub-optimal -- isn't that getting into nanny mode a bit too much? -Kevin
pgsql-performance by date: