Re: How to retain lesser paths at add_path()? - Mailing list pgsql-hackers
From | Tomas Vondra |
---|---|
Subject | Re: How to retain lesser paths at add_path()? |
Date | |
Msg-id | 20190801102454.qwvfkloue4w67pyk@development Whole thread Raw |
In response to | Re: How to retain lesser paths at add_path()? (Kohei KaiGai <kaigai@heterodb.com>) |
Responses |
Re: How to retain lesser paths at add_path()?
|
List | pgsql-hackers |
On Thu, Aug 01, 2019 at 06:28:08PM +0900, Kohei KaiGai wrote: >2019年8月1日(木) 16:19 Richard Guo <riguo@pivotal.io>: >> >> On Thu, Aug 1, 2019 at 2:12 PM Kohei KaiGai <kaigai@heterodb.com> wrote: >>> >>> 2019年8月1日(木) 1:41 Tom Lane <tgl@sss.pgh.pa.us>: >>> > >>> > Robert Haas <robertmhaas@gmail.com> writes: >>> > > Yeah, but I have to admit that this whole design makes me kinda >>> > > uncomfortable. Every time somebody comes up with a new figure of >>> > > merit, it increases not only the number of paths retained but also the >>> > > cost of comparing two paths to possibly reject one of them. A few >>> > > years ago, you came up with the (good) idea of rejecting some join >>> > > paths before actually creating the paths, and I wonder if we ought to >>> > > try to go further with that somehow. Or maybe, as Peter Geoghegan, has >>> > > been saying, we ought to think about planning top-down with >>> > > memoization instead of bottom up (yeah, I know that's a huge change). >>> > > It just feels like the whole idea of a list of paths ordered by cost >>> > > breaks down when there are so many ways that a not-cheapest path can >>> > > still be worth keeping. Not sure exactly what would be better, though. >>> > >>> > Yeah, I agree that add_path is starting to feel creaky. I don't >>> > know what to do instead though. Changing to a top-down design >>> > sounds like it would solve some problems while introducing others >>> > (not to mention the amount of work and breakage involved). >>> > >>> Hmm... It looks the problem we ought to revise about path construction >>> is much larger than my expectation, and uncertain for me how much works >>> are needed. >>> >>> Although it might be a workaround until fundamental reconstruction, >>> how about to have a margin of estimated cost to reject paths? >>> Current add_path() immediately rejects lesser paths if its cost is >>> even a little more expensive than the compared one. One the other hands, >> >> >> Hmm.. I don't think so. Currently add_path() uses fuzzy comparisons on >> costs of two paths, although the fuzz factor (1%) is hard coded and not >> user-controllable. >> >Ah, sorry, I oversight this logic... > FWIW I doubt adding larger "fuzz factor" is unlikely to be a reliable solution, because how would you know what value is the right one? Why ould 10% be the right threshold, for example? In my experience these these hard-coded coefficients imply behavior that's difficult to predict and explain to users. >>> I understand it is not an essential re-design of path-construction logic, and >>> may have limitation. However, amount of works are reasonable and no side- >>> effect. (current behavior = 0% threshold). >>> How about your opinions? >>> >> >> How's about Tom's suggestion on adding another dimension in add_path() >> to be considered, just like how it considers paths of better sort order >> or parallel-safe? >> >Robert also mentioned it makes comparison operation more complicated. >If we try to have another dimension here, a callback function in Path node >may be able to tell the core optimizer whether "dominated path" shall be >dropped or not, without further complexity. It is just an idea. > I think adding a hook to add_path() allowing to override the decidion should be OK. The chance of getting that committed in the near future seems much higher than for a patch that completely reworks add_path(). There's one caveat, though - AFAICS various places in the planner use things like cheapest_total_path, cheapest_startup_path and even get_cheapest_path_for_pathkeys() which kinda assumes add_path() only considers startup/total cost. It might happen that even after keeping additional paths, the planner still won't use them :-( regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
pgsql-hackers by date: