Thread: disfavoring unparameterized nested loops
From time to time, someone tells me that they've configured enable_nestloop=false in postgresql.conf, which is a pretty bad idea since there are a significant number of cases where such plans are the only reasonable way of executing some query. However, it's no great secret that PostgreSQL's optimizer sometimes produces nested loops that are very, very, very slow, generally because it has grossly underestimated the cardinality of the inner side of the nested loop. The frustration which users experience as a result of such events is understandable.

I read https://15721.courses.cs.cmu.edu/spring2020/papers/22-costmodels/p204-leis.pdf today and found out that the authors of that paper did something a bit more nuanced which, in their experiments, was very successful. It sounds like what they basically did was disable unparameterized nested loops. They argue that such nested loops figure to gain very little as compared with a hash join, but that they might turn out to lose a lot if the cardinality estimation is inaccurate, and they present experimental results to back up those claims.

One observation that the paper makes along the way is that every system they tested is more likely to underestimate the cardinality of joins than to overestimate it, and that this tendency becomes more pronounced as the size of the join planning problem increases. On reflection, I believe this matches my experience, and it makes sense that it should be so, since it occasionally happens that the join selectivity estimate is essentially zero, and a bigger join problem is more likely to have at least one such case. On the other hand, the join selectivity estimate can never be +infinity. Hence, it's more important in general for a database system to be resilient against underestimates than to be resilient against overestimates. Being less willing to choose unparameterized nested loops is one way to move in that direction.

How to do that is not very clear. One very simple thing we could do would be to introduce enable_nestloop=parameterized or enable_parameterized_nestloop=false. That is a pretty blunt instrument but the authors of the paper seem to have done that with positive results, so maybe it's actually not a dumb idea.

Another approach would be to introduce a large fuzz factor for such nested loops, e.g. keep them only if the cost estimate is better than the comparable hash join plan by at least a factor of N (or if no such plan is being generated for some reason). I'm not very confident that this would actually do what we want, though. In the problematic cases, a nested loop is likely to look extremely cheap, so just imagining that the cost might be higher is not very protective.

Perhaps a better approach would be something like: if the estimated number of inner rows is less than 100, then re-estimate the cost of this approach and of the best available hash join on the assumption that there are 100 inner rows. If the nested loop still wins, keep it; if it loses under that assumption, throw it out. I think it's likely that this approach would eliminate a large number of highly risky nested loop plans, probably even with s/100/10/g, without causing many other problems (except perhaps increased planner CPU consumption ... but maybe that's not too bad).

Just to be clear, I do understand that there are cases where no Hash Join is possible, but anything we do here could be made to apply only when a hash join is in fact possible.
We could also think about making the point of comparison the best other plans of any sort rather than a hash join specifically, which feels a little more principled but might actually be worse. When a Nested Loop is a stupid idea, it's stupid precisely because the inner side is big and we could've avoided recomputing it over and over by using a Hash Join instead, not because some Merge Join based plan turns out to be better. I mean, it is possible that some Merge Join plan does turn out to be better, but that's not rage-inducing in the same way. Nobody looks at a complicated join plan that happened to use a Nested Loop and says "obviously, this is inferior to a merge join," or if they do, they're probably full of hot air. But people look at complicated join plans that happen to use a Nested Loop and say "obviously, this is inferior to a hash join" *all the time* and, assuming the inner path is unparameterized, they are very often correct.

Thoughts? I'd be particularly curious to hear about any cases anyone knows about where an unparameterized nested loop and a hash join with batches=1 are both possible and where the unparameterized nested loop is *much* cheaper than the hash join.

Thanks,

--
Robert Haas
EDB: http://www.enterprisedb.com
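As a concrete illustration of the "assume at least 100 inner rows and re-cost both plans" idea a little further up, here is a minimal stand-alone sketch. Everything in it -- the cost formulas, the function names, the numbers -- is invented for the example and is not planner code:

#include <stdbool.h>
#include <stdio.h>

/* stand-in cost of an unparameterized nestloop: rescan the inner side
 * once per outer row, paying per-inner-row CPU cost each time */
static double
nestloop_cost(double outer_rows, double inner_rows, double cpu_per_row)
{
    return outer_rows * inner_rows * cpu_per_row;
}

/* stand-in cost of a hash join: build the hash table once, then one
 * cheap probe per outer row */
static double
hashjoin_cost(double outer_rows, double inner_rows, double cpu_per_row)
{
    return inner_rows * cpu_per_row * 2.0 + outer_rows * cpu_per_row;
}

/*
 * Keep the unparameterized nestloop only if it still wins after inflating
 * a small inner-row estimate to a pessimistic floor (100 in the mail,
 * perhaps 10).
 */
static bool
keep_unparam_nestloop(double outer_rows, double est_inner_rows,
                      double cpu_per_row, double floor_rows)
{
    double      assumed = est_inner_rows < floor_rows ? floor_rows
                                                      : est_inner_rows;

    return nestloop_cost(outer_rows, assumed, cpu_per_row) <=
           hashjoin_cost(outer_rows, assumed, cpu_per_row);
}

int
main(void)
{
    /* inner side estimated at 1 row: at that estimate the nestloop looks
     * marginally cheaper, but re-checked at 100 rows it loses badly */
    printf("keep nestloop: %s\n",
           keep_unparam_nestloop(1000.0, 1.0, 0.01, 100.0) ? "yes" : "no");
    return 0;
}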
On Tue, Jun 15, 2021 at 10:09 AM Robert Haas <robertmhaas@gmail.com> wrote:
> How to do that is not very clear. One very simple thing we could do
> would be to introduce enable_nestloop=parameterized or
> enable_parameterized_nestloop=false. That is a pretty blunt instrument
> but the authors of the paper seem to have done that with positive
> results, so maybe it's actually not a dumb idea.

I think that it's probably a good idea as-is. Simple heuristics that are very frequently wrong when considered in a naive way can work very well in practice. This seems to happen when they capture some kind of extreme naturally occurring cost/benefit asymmetry -- especially one with fixed, well understood costs and unlimited benefits (this business with unparameterized nestloop joins is about *avoiding* the inverse asymmetry, but that seems very similar).

My go-to example of such an asymmetry is the rightmost page split heuristic of applying leaf fillfactor regardless of any of the other specifics; we effectively assume that all indexes are on columns with ever-increasing values. Which is obviously wrong.

We're choosing between two alternatives (unparameterized nested loop vs hash join) that are really very similar when things go as expected, but diverge sharply when there is a misestimation -- who wouldn't take the "conservative" choice here?

I guess that there is a hesitation to not introduce heuristics like this because it doesn't fit into some larger framework that captures risk -- it might be seen as an ugly special case. But isn't this already actually kind of special, whether or not we officially think so?

--
Peter Geoghegan
On Tue, Jun 15, 2021 at 2:04 PM Peter Geoghegan <pg@bowt.ie> wrote:
> I guess that there is a hesitation to not introduce heuristics like
> this because it doesn't fit into some larger framework that captures
> risk -- it might be seen as an ugly special case. But isn't this
> already actually kind of special, whether or not we officially think
> so?

Yes, I think it is. Reading the paper really helped me crystallize my thoughts about this, because when I've studied the problems myself, I came, as you postulate here, to the conclusion that there's a lot of stuff the planner does where there is risk and uncertainty, and thus that a general framework would be necessary to deal with it. But the fact that an academic researcher called this problem out as the only one worth treating specially makes me think that perhaps it deserves special handling.

In defense of that approach, note that this is a case where we know both that the Nested Loop is risky and that Hash Join is a similar alternative with probably similar cost. I am not sure there are any other cases where we can say quite so generally both that a certain thing is risky and what we could do instead.

--
Robert Haas
EDB: http://www.enterprisedb.com
On Tue, Jun 15, 2021 at 12:31 PM Robert Haas <robertmhaas@gmail.com> wrote:
> Yes, I think it is. Reading the paper really helped me crystallize my
> thoughts about this, because when I've studied the problems myself, I
> came, as you postulate here, to the conclusion that there's a lot of
> stuff the planner does where there is risk and uncertainty, and thus
> that a general framework would be necessary to deal with it.

It is an example (perhaps the only example in the optimizer) of an oasis of certainty in an ocean of uncertainty. As uncertain as everything is, we seemingly can make strong robust statements about the relative merits of each strategy *in general*, just in this particular instance. It's just not reasonable to make such a reckless choice, no matter what your general risk tolerance is.

Goetz Graefe is interviewed here, and goes into his philosophy on robustness -- it seems really interesting to me:

https://sigmodrecord.org/publications/sigmodRecord/2009/pdfs/05_Profiles_Graefe.pdf

> In defense of that approach, note that this is a
> case where we know both that the Nested Loop is risky and that Hash
> Join is a similar alternative with probably similar cost. I am not
> sure there are any other cases where we can say quite so generally
> both that a certain thing is risky and what we could do instead.

I tend to think of a hash join as like a nested loop join with an inner index scan where you build the index yourself, dynamically. That might be why I find it easy to make this mental leap. In theory you could do this by giving the nestloop join runtime smarts -- make it turn into a hash join adaptively. Like Graefe's G-Join design. That way you could do this in a theoretically pure way. I don't think that that's actually necessary just to deal with this case -- it probably really is as simple as it seems. I point this out because perhaps it's useful to have that theoretical anchoring.

--
Peter Geoghegan
On Wed, 16 Jun 2021 at 05:09, Robert Haas <robertmhaas@gmail.com> wrote:
> How to do that is not very clear. One very simple thing we could do
> would be to introduce enable_nestloop=parameterized or
> enable_parameterized_nestloop=false. That is a pretty blunt instrument
> but the authors of the paper seem to have done that with positive
> results, so maybe it's actually not a dumb idea.

It's not great that people are having to use such blunt instruments to get the planner to behave. It might not be a terrible idea to provide them with something a bit sharper as you suggest. The GUC idea is certainly something that could be done without too much effort. There was some talk of doing that in [1].

> Another approach
> would be to introduce a large fuzz factor for such nested loops e.g.
> keep them only if the cost estimate is better than the comparable hash
> join plan by at least a factor of N (or if no such plan is being
> generated for some reason).

In my experience, the most common reason that the planner chooses non-parameterized nested loops wrongly is when there's row underestimation that says there's just going to be 1 row returned by some set of joins. The problem often comes when some subsequent join is planned and the planner sees the given join rel only produces one row. The cheapest join method we have to join 1 row is Nested Loop. So the planner just sticks the 1-row join rel on the outer side thinking the executor will only need to scan the inner side of the join once. When the outer row count blows up, then we end up scanning that inner side many more times. The problem is compounded when you nest it a few joins deep.

Most of the time when I see that happen it's down to either the selectivity of some correlated base-quals being multiplied down to a number low enough that we clamp the estimate to be 1 row. The other case is similar, but with join quals.

It seems to me that each time we multiply 2 selectivities together that the risk of the resulting selectivity being out increases. The risk is likely lower when we have some extended statistics which allows us to do fewer selectivity multiplications.

For that 1-row case, doing an unparameterized nested loop is only the cheapest join method by a tiny amount. It really wouldn't be much more expensive to just put that single row into a hash table. If that 1 estimated row turns out to be 10 actual rows then it's likely not too big a deal for the hash join code to accommodate the 9 additional rows. This makes me think that it's insanely risky for the planner to be picking Nested Loop joins in this case.

And that puts me down the path of thinking that this problem should be solved by having the planner take into account risk as well as costs when generating plans. I don't really have any concrete ideas on that, but a basic idea that I have been considering is that a Path has a risk_factor field that is decided somewhere like clauselist_selectivity(). Maybe the risk can go up by 1 each time we multiply an individual selectivity. (As of master, estimate_num_groups() allows the estimation to pass back some further information to the caller. I added that for Result Cache so I could allow the caller to get visibility about when the estimate fell back on DEFAULT_NUM_DISTINCT. clauselist_selectivity() maybe could get similar treatment to allow the risk_factor or number of nstats_used to be passed back.) We'd then add a GUC, something like planner_risk_adversion, which could be set to anything from 0.0 to some positive number.
During add_path() we could do the cost comparison like: path1.cost * path1.risk_factor * (1.0 + planner_risk_adversion) < path2.cost * path2.risk_factor * (1.0 + planner_risk_adversion). That way, if you set planner_risk_adversion to 0.0, then the planner does as it does today, i.e. takes big risks.

The problem I have with this idea is that I really don't know how to properly calculate what the risk_factor should be set to. It seems easy at first to set it to something that has the planner avoid these silly 1-row estimate nested loop mistakes, but I think what we'd set the risk_factor to would become much more important when more and more Path types start using it. So if we did this and just guessed the risk_factor, that might be fine when only 1 of the paths being compared had a non-zero risk_factor, but as soon as both paths have one set, unless they're set to something sensible, then we just end up comparing garbage costs to garbage costs.

Another common mistake the planner makes is around WHERE a = <value> ORDER BY b LIMIT n; where there are separate indexes on (a) and (b). Scanning the (b) index is pretty risky. All the "a" values you need might be right at the end of the index. It seems safer to scan the (a) index as we'd likely have statistics that tell us how many rows exist with <value>. We don't have any stats that tell us where in the (b) index are all the rows with a = <value>.

I don't really think we should solve this by having nodeNestloop.c fall back on hashing when the going gets tough. Overloading our nodes that way is not a sustainable thing to do. I'd rather see the executor throw the plan back at the planner along with some hints about what was wrong with it. We could do that providing we've not sent anything back to the client yet.

David

[1] https://www.postgresql.org/message-id/CAKJS1f8nsm-T0KMvGJz_bskUjQ%3DyGmGUUtUdAcFoEaZ_tuTXjA%40mail.gmail.com
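To make the add_path() comparison above concrete, here is a stand-alone sketch. The exact formula is one possible reading of the idea -- here the aversion knob scales how strongly risk_factor penalizes a path's cost, so that 0.0 reproduces today's plain cost comparison -- and every name below is a hypothetical stand-in, not an existing planner field or GUC:

#include <stdbool.h>
#include <stdio.h>

/* hypothetical stand-in for the relevant parts of a Path */
typedef struct SketchPath
{
    double      total_cost;
    int         risk_factor;    /* e.g. bumped once per selectivity multiplication */
} SketchPath;

/* hypothetical GUC: 0.0 = behave as today, larger = more risk-averse */
static double planner_risk_aversion = 1.0;

/* effective cost used only when comparing paths against each other */
static double
risk_adjusted_cost(const SketchPath *p)
{
    return p->total_cost * (1.0 + p->risk_factor * planner_risk_aversion);
}

/* true if path1 should be preferred over path2 */
static bool
risk_aware_cheaper(const SketchPath *p1, const SketchPath *p2)
{
    return risk_adjusted_cost(p1) < risk_adjusted_cost(p2);
}

int
main(void)
{
    /* the nestloop looks slightly cheaper but carries more estimation risk */
    SketchPath  nestloop = {.total_cost = 90.0, .risk_factor = 3};
    SketchPath  hashjoin = {.total_cost = 100.0, .risk_factor = 1};

    printf("prefer nestloop: %s\n",
           risk_aware_cheaper(&nestloop, &hashjoin) ? "yes" : "no");
    return 0;
}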
On Tue, Jun 15, 2021 at 5:00 PM David Rowley <dgrowleyml@gmail.com> wrote:
> I don't really think we should solve this by having nodeNestloop.c
> fall back on hashing when the going gets tough. Overloading our nodes
> that way is not a sustainable thing to do. I'd rather see the
> executor throw the plan back at the planner along with some hints
> about what was wrong with it. We could do that providing we've not
> sent anything back to the client yet.

It wasn't a serious suggestion -- it was just a way of framing the issue at hand that I thought might be useful.

If we did have something like that (which FWIW I think makes sense but is hard to do right in a general way) then it might be expected to preemptively refuse to even start down the road of using an unparameterized nestloop join very early, or even before execution time. Such an adaptive join algorithm/node might be expected to have a huge bias against this particular plan shape, that can be reduced to a simple heuristic. But you can have the simple heuristic without needing to build everything else.

Whether or not we throw the plan back at the planner or "really change our minds at execution time" seems like a distinction without a difference. Either way we're changing our minds about the plan based on information that is fundamentally execution time information, not plan time information. Have I missed something?

--
Peter Geoghegan
On Wed, 16 Jun 2021 at 12:11, Peter Geoghegan <pg@bowt.ie> wrote:
> Whether or not we throw the plan back at the planner or "really change
> our minds at execution time" seems like a distinction without a
> difference.

What is "really change our minds at execution time"? Is that switching to another plan without consulting the planner? If so, what decides what that new plan should be? The planner is meant to be the expert in that department. The new information might cause the join order to completely change. It might not be as simple as swapping a Nested Loop for a Hash Join.

> Either way we're changing our minds about the plan based
> on information that is fundamentally execution time information, not
> plan time information. Have I missed something?

I don't really see why you think the number of rows that a given join might produce is execution information. It's exactly the sort of information the planner needs to make a good plan. It's just that today we get that information from statistics. Plenty of other DBMSs make decisions from sampling. FWIW, we do already have some minimalist sampling in get_actual_variable_range().

I'm just trying to highlight that I don't think overloading nodes is a good path to go down. It's not a sustainable practice. It's a path towards just having a single node that does everything. If your suggestion was not serious then there's no point in discussing it further.

David
On Tue, Jun 15, 2021 at 6:15 PM David Rowley <dgrowleyml@gmail.com> wrote:
> On Wed, 16 Jun 2021 at 12:11, Peter Geoghegan <pg@bowt.ie> wrote:
> > Whether or not we throw the plan back at the planner or "really change
> > our minds at execution time" seems like a distinction without a
> > difference.
>
> What is "really change our minds at execution time"? Is that switch
> to another plan without consulting the planner?

I don't know what it means. That was my point -- it all seems like semantics to me.

The strong separation between plan time and execution time isn't necessarily a good thing, at least as far as solving some of the thorniest problems goes. It seems obvious to me that cardinality estimation is the main problem, and that the most promising solutions are all fundamentally about using execution time information to change course. Some problems with planning just can't be solved at plan time -- no model can ever be smart enough. Better to focus on making query execution more robust, perhaps by totally changing the plan when it is clearly wrong. But also by using more techniques that we've traditionally thought of as execution time techniques (e.g. role reversal in hash join). The distinction is blurry to me.

There are no doubt practical software engineering issues with this -- separation of concerns and whatnot. But it seems premature to go into that now.

> The new information might cause the join order to
> completely change. It might not be as simple as swapping a Nested Loop
> for a Hash Join.

I agree that it might not be that simple at all. I think that Robert is saying that this is one case where it really does appear to be that simple, and so we really can expect to benefit from a simple plan-time heuristic that works within the confines of the current model. Why wouldn't we just take that easy win, once the general idea has been validated some more? Why let the perfect be the enemy of the good?

I have perhaps muddied the waters by wading into the more general question of robust execution, the inherent uncertainty with cardinality estimation, and so on. Robert really didn't seem to be talking about that at all (though it is clearly related).

> > Either way we're changing our minds about the plan based
> > on information that is fundamentally execution time information, not
> > plan time information. Have I missed something?
>
> I don't really see why you think the number of rows that a given join
> might produce is execution information.

If we're 100% sure a join will produce at least n rows because we executed it (with the express intention of actually doing real query processing that returns rows to the client), and it already produced n rows, then what else could it be called? Why isn't it that simple?

> It's exactly the sort of
> information the planner needs to make a good plan. It's just that
> today we get that information from statistics. Plenty of other DBMSs
> make decisions from sampling.
> FWIW, we do already have a minimalist
> sampling already in get_actual_variable_range().

I know, but that doesn't seem all that related -- it almost seems like the opposite idea. It isn't the executor balking when it notices that the plan is visibly wrong during execution, in some important way. It's more like the planner using the executor to get information about an index that is well within the scope of what we think of as plan time.
To some degree the distinction gets really blurred due to nodes like hash join, where some important individual decisions are delayed until execution time already. It's really unclear when precisely it stops being that, and starts being more of a case of either partially or wholly replanning. I don't know how to talk about it without it being confusing.

> I'm just trying to highlight that I don't think overloading nodes is a
> good path to go down. It's not a sustainable practice. It's a path
> towards just having a single node that does everything. If your
> suggestion was not serious then there's no point in discussing it
> further.

As I said, it was a way of framing one particular issue that Robert is concerned about.

--
Peter Geoghegan
On Tue, Jun 15, 2021 at 5:00 PM David Rowley <dgrowleyml@gmail.com> wrote:
> Most of the time when I see that happen it's down to either the
> selectivity of some correlated base-quals being multiplied down to a
> number low enough that we clamp the estimate to be 1 row. The other
> case is similar, but with join quals.
>
> It seems to me that each time we multiply 2 selectivities together
> that the risk of the resulting selectivity being out increases. The
> risk is likely lower when we have some extended statistics which
> allows us to do fewer selectivity multiplications.

It seems important to distinguish between risk and uncertainty -- they're rather different things. The short version is that you can model risk but you cannot model uncertainty. This seems like a problem of uncertainty to me.

The example from the paper that Robert cited isn't interesting to me because it hints at a way of managing the uncertainty, exactly. It's interesting because it seems to emphasize the user's exposure to the problem, which is what really matters. Even if it was extremely unlikely that the user would have a problem, the downside of being wrong is still absurdly high, and the upside of being right is low (possibly even indistinguishable from zero). It's just not worth thinking about.

Besides, we all know that selectivity estimates are very often quite wrong without the user ever noticing. It's amazing that database optimizers work as well as they do, really.

--
Peter Geoghegan
>
> The problem I have with this idea is that I really don't know how to
> properly calculate what the risk_factor should be set to. It seems
> easy at first to set it to something that has the planner avoid these
> silly 1-row estimate nested loop mistakes, but I think what we'd set
> the risk_factor to would become much more important when more and more
> Path types start using it. So if we did this and just guessed the
> risk_factor, that might be fine when only 1 of the paths being
> compared had a non-zero risk_factor, but as soon as both paths have
> one set, unless they're set to something sensible, then we just end up
> comparing garbage costs to garbage costs.

Risk factor is the inverse of confidence in the estimate: the lower the confidence, the higher the risk. If we associate confidence with the selectivity estimate, or compute a confidence interval for the estimate instead of a single number, we can associate a risk factor with each estimate. When we combine estimates to calculate new estimates, we also combine their confidences/confidence intervals. If my memory serves well, confidence intervals/confidences are calculated based on the sample size and method used for estimation, so we should be able to compute those during ANALYZE.

I have not come across many papers which leverage this idea. Googling "selectivity estimation confidence interval" does not yield many papers. Although I found [1] to be using a similar idea. So maybe there's no merit in this idea, though theoretically it sounds fine to me.

[1] https://pi3.informatik.uni-mannheim.de/~moer/Publications/vldb18_smpl_synop.pdf

--
Best Wishes,
Ashutosh Bapat
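For what it's worth, one standard way to make the "combine confidences when estimates are combined" step concrete is first-order error propagation -- this is textbook statistics, offered here only as an illustration, not something proposed in the message above. If two selectivity estimates with independent errors are multiplied, their relative variances approximately add:

\[
\hat{s} = \hat{s}_1 \hat{s}_2
\qquad\Longrightarrow\qquad
\left(\frac{\sigma_{\hat{s}}}{\hat{s}}\right)^{2} \;\approx\;
\left(\frac{\sigma_{\hat{s}_1}}{\hat{s}_1}\right)^{2} +
\left(\frac{\sigma_{\hat{s}_2}}{\hat{s}_2}\right)^{2}
\]

so every additional multiplication widens the relative uncertainty, which matches the intuition upthread that each extra selectivity multiplication adds risk.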
On Fri, Jun 18, 2021 at 6:20 AM Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> wrote:
> "selectivity estimation confidence interval", does not yield many
> papers. Although I found [1] to be using a similar idea. So may be
> there's not merit in this idea, thought theoretically it sounds fine
> to me.
>
>
> [1] https://pi3.informatik.uni-mannheim.de/~moer/Publications/vldb18_smpl_synop.pdf
Well, that paper's title shows it's a bit too far forward for us, since we don't use samples during plan time (although that's a separate topic worth considering). From the references, however, this one gives some mathematical framing of the problem that led to the thread subject, although I haven't read enough to see if we can get practical advice from it:
On Wed, 16 Jun 2021 at 15:08, Peter Geoghegan <pg@bowt.ie> wrote:
> It seems important to distinguish between risk and uncertainty --
> they're rather different things. The short version is that you can
> model risk but you cannot model uncertainty. This seems like a problem
> of uncertainty to me.

You might be right there. "Uncertainty" or "Certainty" seems more like a value that clauselist_selectivity() would be able to calculate itself. It would just be up to the planner to determine what to do with it.

One thing I thought about is that if the costing model was able to separate out the cost of additional (unexpected) rows, then it would be easier for add_path() to take into account how bad things might go if we underestimate. For example, an unparameterized Nested Loop that estimates the outer Path to have 1 row will incur the cost of an entire additional inner scan if there are 2 rows. With Hash Join the cost is just an additional hashtable lookup, which is dead cheap. I don't know exactly how add_path() would weigh all that up, but it seems to me that I wouldn't take the risk unless I was 100% certain that the Nested Loop's outer Path would only return 1 row exactly. If there was any chance at all it could return more, I'd be picking some other join method.

David
> In my experience, the most common reason that the planner chooses
> non-parameterized nested loops wrongly is when there's row
> underestimation that says there's just going to be 1 row returned by
> some set of joins. The problem often comes when some subsequent join
> is planned and the planner sees the given join rel only produces one
> row. The cheapest join method we have to join 1 row is Nested Loop.
> So the planner just sticks the 1-row join rel on the outer side
> thinking the executor will only need to scan the inner side of the
> join once. When the outer row count blows up, then we end up scanning
> that inner side many more times. The problem is compounded when you
> nest it a few joins deep
>
> Most of the time when I see that happen it's down to either the
> selectivity of some correlated base-quals being multiplied down to a
> number low enough that we clamp the estimate to be 1 row. The other
> case is similar, but with join quals.
If an estimate is lower than 1, that should be a red flag that Something Is Wrong. This is kind of a crazy idea, but what if we threw it back the other way by computing 1 / est, and clamping that result to 2 <= res < 10 (or 100 or something)? The theory is, the more impossibly low it is, the more wrong it is. I'm attracted to the idea of dealing with it as an estimation problem and not needing to know about join types. Might have unintended consequences, though.
Long term, it would be great to calculate something about the distribution of cardinality estimates, so we can model risk in the estimates.
--
John Naylor
EDB: http://www.enterprisedb.com
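A tiny stand-alone sketch of the "reflect an impossibly low estimate back above 1" idea from the message above; the function name and the exact clamp bounds are just placeholders for illustration:

#include <stdio.h>

/* raw_rows is the unclamped row estimate, which may well be below 1 */
static double
reflect_low_estimate(double raw_rows, double max_clamp)
{
    if (raw_rows >= 1.0)
        return raw_rows;        /* nothing suspicious; use it as-is */

    /* the smaller the impossible estimate, the larger the corrected guess */
    double      reflected = 1.0 / raw_rows;

    if (reflected < 2.0)
        reflected = 2.0;
    if (reflected > max_clamp)
        reflected = max_clamp;
    return reflected;
}

int
main(void)
{
    printf("%.1f\n", reflect_low_estimate(0.001, 100.0));  /* 100.0 */
    printf("%.1f\n", reflect_low_estimate(0.7, 100.0));    /* 2.0: 1/0.7 is about 1.4, clamped up */
    printf("%.1f\n", reflect_low_estimate(5.0, 100.0));    /* 5.0: left alone */
    return 0;
}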
> >
> > Most of the time when I see that happen it's down to either the
> > selectivity of some correlated base-quals being multiplied down to a
> > number low enough that we clamp the estimate to be 1 row. The other
> > case is similar, but with join quals.
>
> If an estimate is lower than 1, that should be a red flag that Something Is
> Wrong. This is kind of a crazy idea, but what if we threw it back the other
> way by computing 1 / est , and clamping that result to 2 <= res < 10 (or
> 100 or something)? The theory is, the more impossibly low it is, the more
> wrong it is. I'm attracted to the idea of dealing with it as an estimation
> problem and not needing to know about join types. Might have unintended
> consequences, though.
>
> Long term, it would be great to calculate something about the distribution
> of cardinality estimates, so we can model risk in the estimates.
>

Hi,

Laurenz suggested clamping to 2 in this thread in 2017:
https://www.postgresql.org/message-id/1509611428.3268.5.camel%40cybertec.at

Having been the victim of this problem in the past, I like the risk based approach to this. If the cost of being wrong in the estimate is high, use a merge join instead. In every case that I have encountered, that heuristic would have given the correct performant plan.

Regards,
Ken
On Mon, Jun 21, 2021 at 6:41 AM David Rowley <dgrowleyml@gmail.com> wrote:
> For example, in an unparameterized Nested Loop that estimates the
> outer Path to have 1 row will cost an entire additional inner cost if
> there are 2 rows. With Hash Join the cost is just an additional
> hashtable lookup, which is dead cheap. I don't know exactly how
> add_path() would weigh all that up, but it seems to me that I wouldn't
> take the risk unless I was 100% certain that the Nested Loop's outer
> Path would only return 1 row exactly, if there was any chance at all
> it could return more, I'd be picking some other join method.

It seems like everyone agrees that it would be good to do something about this problem, but the question is whether it's best to do something that tries to be general, or whether we should instead do something about this specific case. I favor the latter approach. Risk and uncertainty exist all over the place, but dealing with that in a general way seems difficult, and maybe unnecessary. Addressing the case of unparameterized nested loops specifically seems simpler, because it's easier to reason about what the alternatives are.

Your last sentence here seems right on point to me. Basically, what you argue for there is disabling unparameterized nested loops entirely except when we can prove that the inner side will never generate more than one row. But, that's almost never going to be something that we can prove. If the inner side is coming from a table or sub-join, it can turn out to be big. As far as I can see, the only way that this doesn't happen is if it's something like a subquery that aggregates everything down to one row, or has LIMIT 1, but those are such special cases that I don't even think we should be worrying about them.

So my counter-proposal is: how about if we split match_unsorted_outer() into two functions, one of which generates nested loops only based on parameterized paths and the other of which generates nested loops based only on unparameterized paths, and then rejigger add_paths_to_joinrel() so that we do the latter between the steps that are currently number 5 and 6 and only if we haven't got any other paths yet? If somebody later comes up with logic for proving that the inner side can never have more than 1 row, we can let this be run in those cases as well. In the meantime, if somebody does something like a JOIN b ON a.x < b.x, we will still generate these paths because there's no other approach, or similarly if it's a.x = b.x but for some strange type that doesn't have a hash-joinable or merge-joinable equality operator. But otherwise we omit those paths entirely.

--
Robert Haas
EDB: http://www.enterprisedb.com
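A rough, stand-alone sketch of the control flow being proposed. Every type and helper below is a simplified stand-in (the real add_paths_to_joinrel() works on RelOptInfo/JoinPathExtraData and has more steps); the point is only the ordering and the "no other paths yet" test:

#include <stdbool.h>
#include <stdio.h>

/* stand-in for a join relation accumulating candidate paths */
typedef struct SketchJoinRel
{
    int         npaths;
} SketchJoinRel;

static void
add_path_stub(SketchJoinRel *rel, const char *what)
{
    rel->npaths++;
    printf("added: %s\n", what);
}

/*
 * Simplified add_paths_to_joinrel(): unparameterized nestloops are
 * considered last, and only when nothing else could be generated.
 */
static void
sketch_add_paths_to_joinrel(SketchJoinRel *rel,
                            bool has_mergeable_clause,
                            bool has_hashable_clause,
                            bool has_usable_inner_index)
{
    if (has_mergeable_clause)
        add_path_stub(rel, "merge join");
    if (has_usable_inner_index)
        add_path_stub(rel, "parameterized nestloop");
    if (has_hashable_clause)
        add_path_stub(rel, "hash join");

    /* proposed last-resort step: only when the join has no other path */
    if (rel->npaths == 0)
        add_path_stub(rel, "unparameterized nestloop");
}

int
main(void)
{
    SketchJoinRel eq_join = {0};
    SketchJoinRel ineq_join = {0};

    /* a.x = b.x with a hashable operator: no unparameterized nestloop */
    sketch_add_paths_to_joinrel(&eq_join, true, true, false);

    /* a.x < b.x: nothing else works, so the nestloop is still generated */
    sketch_add_paths_to_joinrel(&ineq_join, false, false, false);
    return 0;
}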
On Mon, Jun 21, 2021 at 7:45 AM Robert Haas <robertmhaas@gmail.com> wrote:
> On Mon, Jun 21, 2021 at 6:41 AM David Rowley <dgrowleyml@gmail.com> wrote:
> > For example, in an unparameterized Nested Loop that estimates the
> > outer Path to have 1 row will cost an entire additional inner cost if
> > there are 2 rows. With Hash Join the cost is just an additional
> > hashtable lookup, which is dead cheap. I don't know exactly how
> > add_path() would weigh all that up, but it seems to me that I wouldn't
> > take the risk unless I was 100% certain that the Nested Loop's outer
> > Path would only return 1 row exactly, if there was any chance at all
> > it could return more, I'd be picking some other join method.
>
> It seems like everyone agrees that it would be good to do something
> about this problem, but the question is whether it's best to do
> something that tries to be general, or whether we should instead do
> something about this specific case. I favor the latter approach.

I agree with your conclusion, but FWIW I am sympathetic to David's view too. I certainly understand why he'd naturally want to define the class of problems that are like this one, to understand what the boundaries are.

The heuristic that has the optimizer flat out avoid an unparameterized nested loop join is justified by the belief that that's fundamentally reckless. Even though we all agree on that much, I don't know when it stops being reckless and starts being "too risky for me, but not fundamentally reckless". I think that that's worth living with, but it isn't very satisfying.

> Risk
> and uncertainty exist all over the place, but dealing with that in a
> general way seems difficult, and maybe unnecessary. Addressing the
> case of unparameterized nest loops specifically seems simpler, because
> it's easier to reason about what the alternatives are. Your last
> sentence here seems right on point to me.

Right. Part of why this is a good idea is that the user is exposed to so many individual risks and uncertainties. We cannot see any one risk as existing in a vacuum. It is not the only risk the user will ever take in the planner -- if it was then it might actually be okay to allow unparameterized nested loop joins.

A bad unparameterized nested loop join plan has, in a sense, unknown and unbounded cost/downside. But it is only very slightly faster than a hash join, by a fixed, well understood amount. With enough "trials" and on a long enough timeline, it will inevitably blow up and cause the application to grind to a halt. It seems like no amount of fixed, bounded benefit from "fast unparameterized nested loop joins" could possibly make up for that. The life of Postgres users would be a lot better if bad plans were at least "survivable events".

--
Peter Geoghegan
Peter Geoghegan <pg@bowt.ie> writes:
> The heuristic that has the optimizer flat out avoids an
> unparameterized nested loop join is justified by the belief that
> that's fundamentally reckless. Even though we all agree on that much,
> I don't know when it stops being reckless and starts being "too risky
> for me, but not fundamentally reckless". I think that that's worth
> living with, but it isn't very satisfying.

There are certainly cases where the optimizer can prove (in principle; it doesn't do so today) that a plan node will produce at most one row. They're hardly uncommon either: an equality comparison on a unique key, or a subquery with a simple aggregate function, come to mind.

In such cases, not only is this choice not reckless, but it's provably superior to a hash join. So in the end this gets back to the planning risk factor that we keep circling around but nobody quite wants to tackle.

I'd be a lot happier if this proposal were couched around some sort of estimate of the risk of the outer side producing more than the expected number of rows. The arguments so far seem like fairly lame rationalizations for not putting forth the effort to do that.

			regards, tom lane
On Mon, Jun 21, 2021 at 8:55 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> There are certainly cases where the optimizer can prove (in principle;
> it doesn't do so today) that a plan node will produce at most one row.
> They're hardly uncommon either: an equality comparison on a unique
> key, or a subquery with a simple aggregate function, come to mind.

That sounds like it might be useful in general.

> In such cases, not only is this choice not reckless, but it's provably
> superior to a hash join. So in the end this gets back to the planning
> risk factor that we keep circling around but nobody quite wants to
> tackle.

Let's assume for the sake of argument that we really have to have that additional infrastructure to move forward with the idea. (I'm not sure if it's possible in principle to use infrastructure like that for some of the cases that Robert has in mind, but for now I'll assume that it is both possible and a practical necessity.)

Even when I make this working assumption I don't see what it changes at a fundamental level. You've merely come up with a slightly more specific definition of the class of plans that are "reckless". You've only refined the original provisional definition of "reckless" to exclude specific "clearly not reckless" cases (I think). But the definition of "reckless" is no less squishy than what we started out with.

> I'd be a lot happier if this proposal were couched around some sort
> of estimate of the risk of the outer side producing more than the
> expected number of rows. The arguments so far seem like fairly lame
> rationalizations for not putting forth the effort to do that.

I'm not so sure that it is. The point isn't the risk, even if it could be calculated. The point is the downsides of being wrong are huge and pretty much unbounded, whereas the benefits of being right are tiny and bounded. It almost doesn't matter what the underlying probabilities are.

To be clear I'm not arguing against modelling risk. I'm just not sure that it's useful to think of this problem as truly a problem of risk.

--
Peter Geoghegan
On 6/21/21 5:55 PM, Tom Lane wrote:
> Peter Geoghegan <pg@bowt.ie> writes:
>> The heuristic that has the optimizer flat out avoids an
>> unparameterized nested loop join is justified by the belief that
>> that's fundamentally reckless. Even though we all agree on that much,
>> I don't know when it stops being reckless and starts being "too risky
>> for me, but not fundamentally reckless". I think that that's worth
>> living with, but it isn't very satisfying.
>
> There are certainly cases where the optimizer can prove (in principle;
> it doesn't do so today) that a plan node will produce at most one row.
> They're hardly uncommon either: an equality comparison on a unique
> key, or a subquery with a simple aggregate function, come to mind.
>
> In such cases, not only is this choice not reckless, but it's provably
> superior to a hash join. So in the end this gets back to the planning
> risk factor that we keep circling around but nobody quite wants to
> tackle.
>

Agreed.

> I'd be a lot happier if this proposal were couched around some sort
> of estimate of the risk of the outer side producing more than the
> expected number of rows. The arguments so far seem like fairly lame
> rationalizations for not putting forth the effort to do that.
>

I agree having such a measure would be helpful, but do you have an idea how it could be implemented?

I've been thinking about this a bit recently and searching for papers talking about this, but it's not clear to me how to calculate the risk (and propagate it through the plan) without making the whole cost evaluation way more complicated / expensive :-(

The "traditional approach" to quantifying risk would be confidence intervals, i.e. for each cardinality estimate "e" we'd determine a range [a,b] so that P(a <= e <= b) = p. So we could pick "p" depending on how "robust" the plan choice should be (say 0.9 for "risky" and 0.999 for "safe" plans) and calculate a,b. Or maybe we could calculate where the plan changes, and then we'd see if those "break points" are within the confidence interval. If not, great - we can assume the plan is stable, otherwise we'd need to consider the other plans too, somehow.

But what I'm not sure about is:

1) Now we're dealing with three cardinality estimates (the original "e" and the boundaries "a", "b"). So which one do we use to calculate cost and pass to upper parts of the plan?

2) The outer relation may be a complex join, so we'd need to combine the confidence intervals for the two input relations, somehow.

3) We'd need to know how to calculate the confidence intervals for most plan nodes, which I'm not sure we know how to do.

So it's not clear to me how to do this, which seems rather problematic because we need to propagate and combine those confidence intervals through the plan. But maybe you have thought about some much simpler approach, addressing this sufficiently well?

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Peter Geoghegan <pg@bowt.ie> writes:
> On Mon, Jun 21, 2021 at 8:55 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I'd be a lot happier if this proposal were couched around some sort
>> of estimate of the risk of the outer side producing more than the
>> expected number of rows. The arguments so far seem like fairly lame
>> rationalizations for not putting forth the effort to do that.

> I'm not so sure that it is. The point isn't the risk, even if it could
> be calculated. The point is the downsides of being wrong are huge and
> pretty much unbounded, whereas the benefits of being right are tiny
> and bounded. It almost doesn't matter what the underlying
> probabilities are.

You're throwing around a lot of pejorative adjectives there without having bothered to quantify any of them. This sounds less like a sound argument to me than like a witch trial.

Reflecting for a bit on the ancient principle that "the only good numbers in computer science are 0, 1, and N", it seems to me that we could make an effort to classify RelOptInfos as provably empty, provably at most one row, and others. (This would be a property of relations, not individual paths, so it needn't bloat struct Path.) We already understand about provably-empty rels, so this is just generalizing that idea a little. Once we know about that, at least for the really-common cases like unique keys, I'd be okay with a hard rule that we don't consider unparameterized nestloop joins with an outer side that falls in the "N" category. Unless there's no alternative, of course.

Another thought that occurs to me here is that maybe we could get rid of the enable_material knob in favor of forcing (or at least encouraging) materialization when the outer side isn't provably one row.

			regards, tom lane
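A small sketch of how such a three-way classification plus the hard rule might look; the enum and the function below are hypothetical stand-ins, not anything that exists in the planner today:

#include <stdbool.h>
#include <stdio.h>

/* hypothetical "0, 1, N" classification of a relation's row count */
typedef enum RelRowsClass
{
    REL_ROWS_PROVABLY_EMPTY,    /* provably returns no rows */
    REL_ROWS_AT_MOST_ONE,       /* e.g. equality on a unique key, simple aggregate */
    REL_ROWS_MANY               /* everything else: "N" */
} RelRowsClass;

/*
 * The hard rule under discussion: refuse an unparameterized nestloop whose
 * outer side is in the "N" class, unless no alternative join path exists.
 */
static bool
allow_unparameterized_nestloop(RelRowsClass outer_class, bool have_alternative)
{
    if (outer_class != REL_ROWS_MANY)
        return true;
    return !have_alternative;
}

int
main(void)
{
    printf("%d\n", allow_unparameterized_nestloop(REL_ROWS_AT_MOST_ONE, true));  /* 1 */
    printf("%d\n", allow_unparameterized_nestloop(REL_ROWS_MANY, true));         /* 0 */
    printf("%d\n", allow_unparameterized_nestloop(REL_ROWS_MANY, false));        /* 1 */
    return 0;
}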
Tomas Vondra <tomas.vondra@enterprisedb.com> writes:
> I've been thinking about this a bit recently and searching for papers
> talking about this, and but it's not clear to me how to calculate the
> risk (and propagate it through the plan) without making the whole cost
> evaluation way more complicated / expensive :-(

Yeah, a truly complete approach using confidence intervals or the like seems frighteningly complicated.

> But maybe you have thought about some much simpler approach, addressing
> this sufficiently well?

See my nearby response to Peter. The main case that's distressing me is the possibility of forcing a hash join even when the outer side is obviously only one row. If we could avoid that, at least for large values of "obvious", I'd be far more comfortable.

			regards, tom lane
On Mon, Jun 21, 2021 at 9:52 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > I'm not so sure that it is. The point isn't the risk, even if it could
> > be calculated. The point is the downsides of being wrong are huge and
> > pretty much unbounded, whereas the benefits of being right are tiny
> > and bounded. It almost doesn't matter what the underlying
> > probabilities are.
>
> You're throwing around a lot of pejorative adjectives there without
> having bothered to quantify any of them. This sounds less like a sound
> argument to me than like a witch trial.

I'm not sure what you mean by pejorative. Isn't what I said about downsides and upsides pretty accurate?

> Reflecting for a bit on the ancient principle that "the only good numbers
> in computer science are 0, 1, and N", it seems to me that we could make
> an effort to classify RelOptInfos as provably empty, provably at most one
> row, and others. (This would be a property of relations, not individual
> paths, so it needn't bloat struct Path.) We already understand about
> provably-empty rels, so this is just generalizing that idea a little.

It sounds like you're concerned about properly distinguishing between:

1. Cases where the only non-reckless choice is a hash join over an unparameterized nested loop join

2. Cases that look like that at first, that don't really have that quality on closer examination.

This seems like a reasonable concern.

> Once we know about that, at least for the really-common cases like unique
> keys, I'd be okay with a hard rule that we don't consider unparameterized
> nestloop joins with an outer side that falls in the "N" category.
> Unless there's no alternative, of course.

I thought that you were arguing against the premise itself. It's now clear that you weren't, though.

I don't have an opinion for or against bringing the provably-empty rels stuff into the picture. At least not right now.

--
Peter Geoghegan
On Mon, Jun 21, 2021 at 11:55 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> There are certainly cases where the optimizer can prove (in principle;
> it doesn't do so today) that a plan node will produce at most one row.
> They're hardly uncommon either: an equality comparison on a unique
> key, or a subquery with a simple aggregate function, come to mind.

Hmm, maybe I need to see an example of the sort of plan shape that you have in mind. To me it feels like a comparison on a unique key ought to use a *parameterized* nested loop. And it's also not clear to me why a nested loop is actually better in a case like this. If the nested loop iterates more than once because there are more rows on the outer side, then you don't want to have something on the inner side that might be expensive, and either an aggregate or an unparameterized search for a unique value are potentially quite expensive. Now if you put a materialize node on top of the inner side, then you don't have to worry about that problem, but how much are you saving at that point vs. just doing a hash join?

> I'd be a lot happier if this proposal were couched around some sort
> of estimate of the risk of the outer side producing more than the
> expected number of rows. The arguments so far seem like fairly lame
> rationalizations for not putting forth the effort to do that.

I don't understand how to generate a risk assessment or what we ought to do with it if we had one. I don't even understand what units we would use. We measure costs using abstract cost units, but those abstract cost units are intended to be a proxy for runtime. If it's not the case that a plan that runs for longer has a higher cost, then something's wrong with the costing model or the settings.

In the case of risk, the whole thing seems totally subjective. We're talking about the risk that our estimate is bogus, but how do we estimate the risk that we don't know how to estimate? Given quals (x + 0) = x, x = some MCV, and x = some non-MCV, we can say that we're most likely to be wrong about the first one and least likely to be wrong about the second one, but how much more likely? I don't know how you can decide that, even in principle. We can also say that an unparameterized nested loop is more risky than some other plan because it could turn out to be crazy expensive, but is that more or less risky than scanning the index on b as a way to implement SELECT * FROM foo WHERE a = 1 ORDER BY b LIMIT 1? How much more risky, and why?

And then, even supposing we had a risk metric for every path, what exactly would we do with it? Suppose path A is cheaper than path B, but also more risky. Which should we keep? We could keep both, but that seems to be just kicking the can down the road. If plan B is likely to blow up in our face, we should probably just get rid of it, or not even generate it in the first place. Even if we do keep both, at some point we're going to have to make a cost-vs-risk tradeoff, and I don't see how to do that intelligently, because the point is precisely that if the risk is high, the cost number might be totally wrong. If we know that plan A is cheaper than plan B, we should choose plan A. But if all we know is that plan A would be cheaper than plan B if our estimate of the cost were correct, but also that it probably isn't, then what we actually know is nothing. We have no principled basis for deciding anything based on cost unless we're reasonably confident that the cost estimate is pretty good.
So AFAICT the only principled strategy here is to throw away high-risk paths as early as we possibly can. What am I missing?

The other thing is - the risk of a particular path doesn't matter in an absolute sense, only a relative one. In the particular case I'm on about here, we *know* there's a less-risky alternative. We don't need to quantify the risk to know which of the two options has more. In many other cases, the risk is irreducible, e.g. a default estimate could be totally bogus, but switching paths is of no help in getting rid of it.

--
Robert Haas
EDB: http://www.enterprisedb.com
On Mon, Jun 21, 2021 at 1:11 PM Peter Geoghegan <pg@bowt.ie> wrote:
> On Mon, Jun 21, 2021 at 9:52 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > > I'm not so sure that it is. The point isn't the risk, even if it could
> > > be calculated. The point is the downsides of being wrong are huge and
> > > pretty much unbounded, whereas the benefits of being right are tiny
> > > and bounded. It almost doesn't matter what the underlying
> > > probabilities are.
> >
> > You're throwing around a lot of pejorative adjectives there without
> > having bothered to quantify any of them. This sounds less like a sound
> > argument to me than like a witch trial.
>
> I'm not sure what you mean by pejorative. Isn't what I said about
> downsides and upsides pretty accurate?

Yeah, I don't see why Peter's characterization deserves to be labelled as pejorative here. A hash join or merge join or parameterized nested loop can turn out to be slower than some other algorithm, but all of those approaches have some component that tends to make the asymptotic cost less than the product of the sizes of the inputs. I don't think that's true in absolutely every case; for example, if a merge join has every row duplicated on both sides, it will have to scan every inner tuple once per outer tuple, just like a nested loop, and the other algorithms also are going to degrade toward O(NM) performance in the face of many duplicates. Also, a hash join can be pretty close to that if it needs a shazillion batches. But in normal cases, any algorithm other than an unparameterized nested loop tends to read each input tuple on each side ONCE, so the cost is more like the sum of the input sizes than the product. And there's nothing pejorative in saying that N + M can be less than N * M by an unbounded amount. That's just the facts.

--
Robert Haas
EDB: http://www.enterprisedb.com
Robert Haas <robertmhaas@gmail.com> writes:
> On Mon, Jun 21, 2021 at 11:55 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> There are certainly cases where the optimizer can prove (in principle;
>> it doesn't do so today) that a plan node will produce at most one row.
>> They're hardly uncommon either: an equality comparison on a unique
>> key, or a subquery with a simple aggregate function, come to mind.

> Hmm, maybe I need to see an example of the sort of plan shape that you
> have in mind. To me it feels like a comparison on a unique key ought
> to use a *parameterized* nested loop.

The unique-key comparison would be involved in the outer scan in the cases I'm thinking of. As an example,

	select * from t1, t2 where t1.id = constant and t1.x op t2.y;

where I'm not assuming much about the properties of "op". This could be amenable to a plan like

	NestLoop Join
	  Join Filter: t1.x op t2.y
	  -> Index Scan on t1_pkey
	       Index Cond: t1.id = constant
	  -> Seq Scan on t2

and if we can detect that the pkey indexscan produces just one row, this is very possibly the best available plan. Nor do I think this is an unusual situation that we can just ignore.

BTW, it strikes me that there might be an additional consideration here: did parameterization actually help anything? That is, the proposed rule wants to reject the above but allow

	NestLoop Join
	  -> Index Scan on t1_pkey
	       Index Cond: t1.id = constant
	  -> Seq Scan on t2
	       Filter: t1.x op t2.y

even though the latter isn't meaningfully better. It's possible this won't arise because we don't consider parameterized paths except where the parameter is used in an indexqual or the like, but I'm not confident of that. See in particular reparameterize_path and friends before you assert there's no such issue. So we might need to distinguish essential from incidental parameterization, or something like that.

			regards, tom lane
I wrote:
> ... As an example,
>	select * from t1, t2 where t1.id = constant and t1.x op t2.y;
> where I'm not assuming much about the properties of "op".
> This could be amenable to a plan like
>	NestLoop Join
>	  Join Filter: t1.x op t2.y
>	  -> Index Scan on t1_pkey
>	       Index Cond: t1.id = constant
>	  -> Seq Scan on t2

Also, to enlarge on that example: if "op" isn't hashable then the original argument is moot. However, it'd still be useful to know if the outer scan is sure to return no more than one row, as that could inform the choice whether to plaster a Materialize node on the inner scan.

			regards, tom lane
On Mon, Jun 21, 2021 at 1:38 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > Hmm, maybe I need to see an example of the sort of plan shape that you
> > have in mind. To me it feels like a comparison on a unique key ought
> > to use a *parameterized* nested loop.
>
> The unique-key comparison would be involved in the outer scan in
> the cases I'm thinking of. As an example,
>
>	select * from t1, t2 where t1.id = constant and t1.x op t2.y;
>
> where I'm not assuming much about the properties of "op".
> This could be amenable to a plan like
>
>	NestLoop Join
>	  Join Filter: t1.x op t2.y
>	  -> Index Scan on t1_pkey
>	       Index Cond: t1.id = constant
>	  -> Seq Scan on t2
>
> and if we can detect that the pkey indexscan produces just one row,
> this is very possibly the best available plan.

Hmm, yeah, I guess that's possible. How much do you think this loses as compared with:

	Hash Join
	  Hash Cond: t1.x op t2.y
	  -> Seq Scan on t2
	  -> Hash
	       -> Index Scan on t1_pkey

(If the operator is not hashable then this plan is impractical, but in such a case the question of preferring the hash join over the nested loop doesn't arise anyway.)

> BTW, it strikes me that there might be an additional consideration
> here: did parameterization actually help anything? That is, the
> proposed rule wants to reject the above but allow
>
>	NestLoop Join
>	  -> Index Scan on t1_pkey
>	       Index Cond: t1.id = constant
>	  -> Seq Scan on t2
>	       Filter: t1.x op t2.y
>
> even though the latter isn't meaningfully better. It's possible
> this won't arise because we don't consider parameterized paths
> except where the parameter is used in an indexqual or the like,
> but I'm not confident of that. See in particular reparameterize_path
> and friends before you assert there's no such issue. So we might
> need to distinguish essential from incidental parameterization,
> or something like that.

Hmm, perhaps. I think it won't happen in the normal cases, but I can't completely rule out the possibility that there are corner cases where it does.

--
Robert Haas
EDB: http://www.enterprisedb.com
Robert Haas <robertmhaas@gmail.com> writes:
> On Mon, Jun 21, 2021 at 1:38 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>	NestLoop Join
>>	  Join Filter: t1.x op t2.y
>>	  -> Index Scan on t1_pkey
>>	       Index Cond: t1.id = constant
>>	  -> Seq Scan on t2

> Hmm, yeah, I guess that's possible. How much do you think this loses
> as compared with:

>	Hash Join
>	  Hash Cond: t1.x op t2.y
>	  -> Seq Scan on t2
>	  -> Hash
>	       -> Index Scan on t1_pkey

Hard to say. The hash overhead might or might not pay for itself. If the equality operator proper is expensive and we get to avoid applying it to most t2 rows, then this might be an OK alternative; otherwise not so much.

In any case, the former looks like plans that we generate now, the second not. Do you really want to field a lot of questions about why we suddenly changed to a not-clearly-superior plan?

			regards, tom lane
On Mon, Jun 21, 2021 at 10:14 AM Robert Haas <robertmhaas@gmail.com> wrote: > Hmm, maybe I need to see an example of the sort of plan shape that you > have in mind. To me it feels like a comparison on a unique key ought > to use a *parameterized* nested loop. And it's also not clear to me > why a nested loop is actually better in a case like this. If the > nested loop iterates more than once because there are more rows on the > outer side, then you don't want to have something on the inner side > that might be expensive, and either an aggregate or an unparameterized > search for a unique value are potentially quite expensive. Now if you > put a materialize node on top of the inner side, then you don't have > to worry about that problem, but how much are you saving at that point > vs. just doing a hash join? I suspected that that was true, but even that doesn't seem like the really important thing. While it may be true that the simple heuristic can't be quite as simple as we'd hoped at first, ISTM that this is ultimately not much of a problem. The basic fact remains: some more or less simple heuristic makes perfect sense, and should be adapted. This conclusion is counterintuitive because it's addressing a very complicated problem with a very simple solution. However, if we lived in a world where things that sound too good to be true always turned out to be false, we'd also live in a world where optimizers were completely impractical and useless. Optimizers have that quality already, whether or not we officially acknowledge it. > I don't understand how to generate a risk assessment or what we ought > to do with it if we had one. I don't even understand what units we > would use. We measure costs using abstract cost units, but those > abstract cost units are intended to be a proxy for runtime. If it's > not the case that a plan that runs for longer has a higher cost, then > something's wrong with the costing model or the settings. In the case > of risk, the whole thing seems totally subjective. We're talking about > the risk that our estimate is bogus, but how do we estimate the risk > that we don't know how to estimate? Clearly we need a risk estimate for our risk estimate! > The other thing is - the risk of a particular path doesn't matter in > an absolute sense, only a relative one. In the particular case I'm on > about here, we *know* there's a less-risky alternative. Exactly! This, a thousand times. This reminds me of how people behave in the real world. In the real world people deal with this without too much difficulty. Everything is situational and based on immediate trade-offs, with plenty of uncertainty at every step. For example, if you think that there is even a tiny chance of a piece of fruit being poisonous, you don't eat the piece of fruit -- better to wait until lunchtime. This is one of the *easiest* decisions I can think of, despite the uncertainty. (Except perhaps if you happen to be in danger of dying of starvation, in which case it might be a very different story. And so on.) -- Peter Geoghegan
Peter Geoghegan <pg@bowt.ie> writes: > On Mon, Jun 21, 2021 at 10:14 AM Robert Haas <robertmhaas@gmail.com> wrote: >> The other thing is - the risk of a particular path doesn't matter in >> an absolute sense, only a relative one. In the particular case I'm on >> about here, we *know* there's a less-risky alternative. > Exactly! This, a thousand times. This is a striking oversimplification. You're ignoring the fact that the plan shape we generate now is in fact *optimal*, and easily proven to be so, in some very common cases. I don't think the people whose use-cases get worse are going to be mollified by the argument that you reduced their risk, when there is provably no risk. Obviously the people whose use-cases are currently hitting the wrong end of the risk will be happy with any change whatever, but those may not be the same people. I'm willing to take some flak if there's not an easy proof that the outer scan is single-row, but I don't think we should just up and change it for cases where there is. regards, tom lane
On Mon, Jun 21, 2021 at 1:52 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > You're ignoring the fact that the plan shape we generate now is in fact > *optimal*, and easily proven to be so, in some very common cases. As I've said I don't reject the idea that there is room for disagreement on the specifics. For example perhaps it'll turn out that only a restricted subset of the cases that Robert originally had in mind will truly turn out to work as well as hoped. But that just seems like a case of Robert refining a very preliminary proposal. I absolutely expect there to be some need to iron out the wrinkles. > I don't > think the people whose use-cases get worse are going to be mollified by > the argument that you reduced their risk, when there is provably no risk. By definition what we're doing here is throwing away slightly cheaper plans when the potential downside is much higher than the potential upside of choosing a reasonable alternative. I don't think that the downside is particularly likely. In fact I believe that it's fairly unlikely in general. This is an imperfect trade-off, at least in theory. I fully own that. > I'm willing to take some flak if there's not an easy proof that the outer > scan is single-row, but I don't think we should just up and change it > for cases where there is. Seems reasonable to me. -- Peter Geoghegan
On Mon, Jun 21, 2021 at 12:52:39PM -0400, Tom Lane wrote: > You're throwing around a lot of pejorative adjectives there without > having bothered to quantify any of them. This sounds less like a sound > argument to me than like a witch trial. > > Reflecting for a bit on the ancient principle that "the only good numbers > in computer science are 0, 1, and N", it seems to me that we could make > an effort to classify RelOptInfos as provably empty, provably at most one > row, and others. (This would be a property of relations, not individual > paths, so it needn't bloat struct Path.) We already understand about > provably-empty rels, so this is just generalizing that idea a little. > Once we know about that, at least for the really-common cases like unique > keys, I'd be okay with a hard rule that we don't consider unparameterized > nestloop joins with an outer side that falls in the "N" category. > Unless there's no alternative, of course. > > Another thought that occurs to me here is that maybe we could get rid of > the enable_material knob in favor of forcing (or at least encouraging) > materialization when the outer side isn't provably one row. There were a lot of interesting ideas in this thread and I want to analyze some of them. First, there is the common assumption (not stated) that over-estimating by 5% and underestimating by 5% cause the same harm, which is clearly false. If I go to a restaurant and estimate the bill to be 5% higher or 5% lower, assuming I have sufficient funds, under or over estimating is probably fine. If I am driving and under-estimate the traction of my tires, I am probably fine, but if I over-estimate their traction by 5%, I might crash. Closer to home, Postgres is more tolerant of memory usage under-estimation than over-estimation: https://momjian.us/main/blogs/pgblog/2018.html#December_7_2018 What I hear Robert saying is that unparameterized nested loops are much more sensitive to misestimation than hash joins, and only slightly faster than hash joins when they use accurate row counts, so we might want to avoid them by default. Tom is saying that if we know the outer side will have zero or one row, we should still do unparameterized nested loops because those are not more sensitive to misestimation than hash joins, and slightly faster. If that is accurate, I think the big question is how common are cases where the outer side can't be proven to have zero or one row and nested loops are enough of a win to risk its greater sensitivity to misestimation. If it is uncommon, seems we could just code the optimizer to use hash joins in those cases without a user-visible knob, beyond the knob that already turns off nested loop joins. Peter's comment about having nodes in the executor that adjust to the row counts it finds is interesting, and eventually might be necessary once we are sure we have gone as far as we can without it. -- Bruce Momjian <bruce@momjian.us> https://momjian.us EDB https://enterprisedb.com If only the physical world exists, free will is an illusion.
On Mon, Jun 21, 2021 at 4:51 PM Bruce Momjian <bruce@momjian.us> wrote: > There were a lot of interesting ideas in this thread and I want to > analyze some of them. First, there is the common assumption (not > stated) that over-estimating by 5% and underestimating by 5% cause the > same harm, which is clearly false. If I go to a restaurant and estimate > the bill to be 5% higher or %5 lower, assuming I have sufficient funds, > under or over estimating is probably fine. If I am driving and > under-estimate the traction of my tires, I am probably fine, but if I > over-estimate their traction by 5%, I might crash. My favorite analogy is the home insurance one: It might make sense to buy home insurance because losing one's home (say through fire) is a loss that usually just cannot be tolerated -- you are literally ruined. We can debate how likely it is to happen, but in the end it's not so unlikely that it can't be ruled out. At the same time I may be completely unwilling to buy insurance for personal electronic devices. I can afford to replace all of them if I truly have to. And the chances of all of them breaking or being stolen on the same day is remote (unless my home burns down!). If I drop my cell phone and crack the screen, I'll be annoyed, but it's certainly not the end of the world. This behavior will make perfect sense to most people. But it doesn't scale up or down. I have quite a few electronic devices, but only a single home, so technically I'm taking risks way more often than I am playing it safe here. Am I risk tolerant when it comes to insurance? Conservative? I myself don't think that it is sensible to apply either term here. It's easier to just look at the specifics. A home is a pretty important thing to almost everybody; we can afford to treat it as a special case. > If that is accurate, I think the big question is how common are cases > where the outer side can't be proven to have zero or one row and nested > loops are enough of a win to risk its greater sensitivity to > misestimation. If it is uncommon, seems we could just code the > optimizer to use hash joins in those cases without a user-visible knob, > beyond the knob that already turns off nested loop joins. I think it's possible that Robert's proposal will lead to very slightly slower plans in the vast majority of cases that are affected, while still being a very good idea. Why should insurance be 100% free, though? Maybe it can be in some cases where we get lucky, but why should that be the starting point? It just has to be very cheap relative to what we do today for us to come out ahead, certainly, but that seems quite possible in at least this case. -- Peter Geoghegan
On 6/22/21 2:25 AM, Peter Geoghegan wrote: > On Mon, Jun 21, 2021 at 4:51 PM Bruce Momjian <bruce@momjian.us> wrote: >> There were a lot of interesting ideas in this thread and I want to >> analyze some of them. First, there is the common assumption (not >> stated) that over-estimating by 5% and underestimating by 5% cause the >> same harm, which is clearly false. If I go to a restaurant and estimate >> the bill to be 5% higher or 5% lower, assuming I have sufficient funds, >> under or over estimating is probably fine. If I am driving and >> under-estimate the traction of my tires, I am probably fine, but if I >> over-estimate their traction by 5%, I might crash. > > My favorite analogy is the home insurance one: > > It might make sense to buy home insurance because losing one's home > (say through fire) is a loss that usually just cannot be tolerated -- > you are literally ruined. We can debate how likely it is to happen, > but in the end it's not so unlikely that it can't be ruled out. At the > same time I may be completely unwilling to buy insurance for personal > electronic devices. I can afford to replace all of them if I truly > have to. And the chances of all of them breaking or being stolen on > the same day is remote (unless my home burns down!). If I drop my cell > phone and crack the screen, I'll be annoyed, but it's certainly not > the end of the world. > > This behavior will make perfect sense to most people. But it doesn't > scale up or down. I have quite a few electronic devices, but only a > single home, so technically I'm taking risks way more often than I am > playing it safe here. Am I risk tolerant when it comes to insurance? > Conservative? > > I myself don't think that it is sensible to apply either term here. > It's easier to just look at the specifics. A home is a pretty > important thing to almost everybody; we can afford to treat it as a > special case. > >> If that is accurate, I think the big question is how common are cases >> where the outer side can't be proven to have zero or one row and nested >> loops are enough of a win to risk its greater sensitivity to >> misestimation. If it is uncommon, seems we could just code the >> optimizer to use hash joins in those cases without a user-visible knob, >> beyond the knob that already turns off nested loop joins. > > I think it's possible that Robert's proposal will lead to very > slightly slower plans in the vast majority of cases that are affected, > while still being a very good idea. Why should insurance be 100% free, > though? Maybe it can be in some cases where we get lucky, but why > should that be the starting point? It just has to be very cheap > relative to what we do today for us to come out ahead, certainly, but > that seems quite possible in at least this case. > Yeah, I like the insurance analogy - it gets to the crux of the problem, because insurance is pretty much exactly about managing risk. But making everything slower will be a hard sell, because the vast majority of workloads already running on Postgres don't have this issue at all, so for them it's not worth the expense. Following the insurance analogy, selling tornado insurance in Europe is mostly pointless. Insurance is also about personal preference / risk tolerance. Maybe I'm fine with accepting risk that my house burns down, or whatever ... And the lack of data also plays a role - the insurance company will ask for higher rates when it does not have enough accurate data about the phenomenon, or when there's a lot of unknowns. 
Maybe this would allow some basic measure of uncertainty, based on the number and type of restrictions, joins, etc. The more restrictions we have, the less certain the estimates are. Some conditions are estimated less accurately, and using default estimates makes things much less accurate. So maybe some fairly rough measure of uncertainty might work, and the user might specify how much risk they're willing to tolerate. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
> But making everything slower will be a hard sell, because vast majority of > workloads already running on Postgres don't have this issue at all, so > for them it's not worth the expense. Following the insurance analogy, > selling tornado insurance in Europe is mostly pointless. > Agree. I've been surprised about NOT hearing complaints from PostgreSQL customers about a particular "bad" plan choice that was common in other rdbms products where large, complex queries were the norm. The situation occurs late in a plan with many joins where a hash join can be used and where either side is estimated to fit in memory. On one side is a base table with cardinality that we have statistics for, while the other side has an estimated cardinality that is the result of many estimates each of which has error that can compound, and that in some cases amounts to a wild guess. (e.g. what is the selectivity of SUM(x) < 12 ?). If the planner's point estimate of cardinality is such that both sides could fit in memory, then a bad plan can easily be made. As Peter said, [ most ] humans have no trouble dealing with these kind of situations. They take the risk of being wrong into account. So in our world, the useful numbers are 0, 1, measured N, and estimated N, but we don't distinguish between measured N and estimated N. But that doesn't mean that OLTP customers would be willing to accept slightly suboptimal plans to mitigate a risk they don't experience. > Insurance is also about personal preference / risk tolerance. Maybe I'm > fine with accepting risk that my house burns down, or whatever ... Right, and that's why the problem mentioned above is still out there annoying customers who have complex plans. To them it looks like an obviously bad plan choice. Something that might help is to have the planner cost be a structure instead of a number. Costs of plans that are deemed "risky" are accumulated separately from plans that make no risky choices, and for a given set of join items you keep the minimum cost plan of both types. It may happen that all plans eventually make a risky choice, in which case you take the plan with the minimum cost, but if there exists a plan with no risky choices, then the minimum cost plan with no risky choices is chosen, with a GUC that enables a customer to ignore risk when making this choice. This is not in the spirit of the hoped for simple heuristic, and it would be heuristic in its classification of plans that are risky, but in the NLJ case the cost of an unparameterized NLJ could be deemed risky if the cardinality of the inner relation is not 0, 1, or measured N.
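A minimal sketch of the "cost as a structure" idea from the message above, in plain C with invented names. This is not a patch against the PostgreSQL planner (which would need to track the risky and risk-free minima throughout join search rather than in a single comparator); it only shows the shape of the proposed choice rule.

/*
 * Sketch of the "cost as a structure" idea: carry a "risky" flag alongside
 * the cost, and prefer risk-free paths when one exists.
 */
#include <stdbool.h>
#include <stdio.h>

typedef struct PlanCost
{
    double cost;   /* the usual estimated total cost */
    bool   risky;  /* true if any node made a "risky" choice, e.g. an
                    * unparameterized nestloop whose inner cardinality is
                    * neither 0, 1, nor measured N */
} PlanCost;

static bool ignore_risk = false;    /* the GUC suggested above */

/* Does candidate "a" beat current best "b"? */
static bool
plancost_is_better(PlanCost a, PlanCost b)
{
    if (!ignore_risk && a.risky != b.risky)
        return !a.risky;            /* a risk-free plan wins regardless of cost */
    return a.cost < b.cost;
}

int main(void)
{
    PlanCost risky_nestloop = { 900.0, true };
    PlanCost safe_hashjoin  = { 1000.0, false };

    printf("prefer nestloop? %s\n",
           plancost_is_better(risky_nestloop, safe_hashjoin) ? "yes" : "no");
    return 0;
}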
On Mon, Jun 21, 2021 at 4:52 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > I'm willing to take some flak if there's not an easy proof that the outer > scan is single-row, but I don't think we should just up and change it > for cases where there is. I think that's a reasonable request. I'm not sure that I believe it's 100% necessary, but it's certainly an improvement on a technical level, and given that the proposed change could impact quite a lot of plans, it's fair to want to see some effort being put into mitigating the possible downsides. Now, I'm not sure when I might have time to actually try to do the work, which kind of sucks, but that's how it goes sometimes. -- Robert Haas EDB: http://www.enterprisedb.com
On Tue, Jun 22, 2021 at 2:53 AM Tomas Vondra <tomas.vondra@enterprisedb.com> wrote: > Yeah, I like the insurance analogy - it gets to the crux of the problem, > because insurance is pretty much exactly about managing risk. The user's exposure to harm is what truly matters. I admit that that's very hard to quantify, but we should at least try to do so. We sometimes think about a plan that is 10x slower as if it's infinitely slow, or might as well be. But it's usually not like that -- it is generally meaningfully much better than the plan being 100x slower, which is itself sometimes appreciably better than 1000x slower. And besides, users often don't get anything like the optimal plan, even on what they would consider to be a good day (which is most days). So maybe 10x slower is actually the baseline good case already, without anybody knowing it. Most individual queries are not executed very often, even on the busiest databases. The extremes really do matter a lot. If a web app or OLTP query is ~10x slower than optimal then it might be the practical equivalent of an outage that affects the query alone (i.e. "infinitely slow") -- but probably not. I think that it is worth paying more than nothing to avoid plans that are so far from optimal that they might as well take forever to execute. This is not meaningfully different from a database outage affecting one particular query. It kind of is in a category of its own that surpasses "slow plan", albeit one that is very difficult or impossible to define formally. There may be a huge amount of variation in risk tolerance among basically reasonable people. For example, if somebody chooses to engage in some kind of extreme sport, to me it seems understandable. It's just not my cup of tea. Whereas if somebody chooses to never wear a seatbelt while driving, then to me they're simply behaving foolishly. They're not willing to incur the tiniest inconvenience in order to get a huge reduction in potential harm -- including a very real risk of approximately the worst thing that can happen to you. Sure, refusing to wear a seatbelt can theoretically be classified as just another point on the risk tolerance spectrum, but that seems utterly contrived to me. Some things (maybe not that many) really are like that, or can at least be assumed to work that way as a practical matter. > But making > everything slower will be a hard sell, because wast majority of > workloads already running on Postgres don't have this issue at all, so > for them it's not worth the expense. I think that we're accepting too much risk here. But I bet it's also true that we're not taking enough risk in other areas. That was really my point with the insurance analogy -- we can afford to take lots of individual risks as long as they don't increase our exposure to truly disastrous outcomes -- by which I mean queries that might as well take forever to execute as far as the user is concerned. (Easier said than done, of course.) A simple trade-off between fast and robust doesn't seem like a universally helpful thing. Sometimes it's a very unhelpful way of looking at the situation. If you make something more robust to extreme bad outcomes, then you may have simultaneously made it *faster* (not slower) for all practical purposes. This can happen when the increase in robustness allows the user to tune the system aggressively, and only take on new risks that they can truly live with (which wouldn't have been possible without the increase in robustness). 
For example, I imagine that the failsafe mechanism added to VACUUM will actually make it possible to tune VACUUM much more aggressively -- it might actually end up significantly improving performance for all practical purposes, even though technically it has nothing to do with performance. Having your indexes a little more bloated because the failsafe kicked-in is a survivable event -- the DBA lives to fight another day, and *learns* to tune vacuum/the app so it doesn't happen again and again. An anti-wraparound failure is perhaps not a survivable event -- the DBA gets fired. This really does seem like a fundamental difference to me. > Following the insurance analogy, > selling tornado insurance in Europe is mostly pointless. Principled skepticism of this kind of thing is of course necessary and welcome. It *could* be taken too far. > And the lack of data also plays role - the insurance company will ask > for higher rates when it does not have enough accurate data about the > phenomenon, or when there's a lot of unknowns. Maybe this would allow > some basic measure of uncertainty, based on the number and type of > restrictions, joins, etc. I don't think that you can really model uncertainty. But you can have true certainty (or close to it) about a trade-off that makes the system fundamentally more robust over time. You can largely be certain about both the cost of the insurance, as well as how it ameliorates the problem in at least some cases. > So maybe some fairly rough measure of uncertainty might work, and the > user might specify how much risk it's willing to tolerate. I think that most or all of the interesting stuff is where you have this extreme asymmetry -- places where it's much more likely to be true that basically everybody wants that. Kind of like wearing seatbelts -- things that we really can claim are a universal good without too much controversy. There might be as few as one or two things in the optimizer that this could be said of. But they matter. -- Peter Geoghegan
On Tue, Jun 22, 2021 at 4:13 AM Robert Haas <robertmhaas@gmail.com> wrote: > I think that's a reasonable request. I'm not sure that I believe it's > 100% necessary, but it's certainly an improvement on a technical > level, and given that the proposed change could impact quite a lot of > plans, it's fair to want to see some effort being put into mitigating > the possible downsides. Now, I'm not sure when I might have time to > actually try to do the work, which kind of sucks, but that's how it > goes sometimes. Should I take it that you've dropped this project? I was rather hoping that you'd get back to it, FWIW. -- Peter Geoghegan
On Wed, Jan 12, 2022 at 7:20 PM Peter Geoghegan <pg@bowt.ie> wrote: > Should I take it that you've dropped this project? I was rather hoping > that you'd get back to it, FWIW. Honestly, I'd forgotten all about it. In theory I'd like to do something about this, but I just have too many other things going on. -- Robert Haas EDB: http://www.enterprisedb.com
I'd like to revamp this important discussion. As is well described in this fairly recent paper here https://www.vldb.org/pvldb/vol9/p204-leis.pdf (which also looks at Postgres) "estimation errors quickly grow as the number of joins increases, and that these errors are usually the reason for bad plans" - I think we can all get behind that statement. While nested loop joins work great when cardinality estimates are correct, they are notoriously bad when the optimizer underestimates and they degrade very fast in such cases - the good vs. bad here is very asymmetric. On the other hand hash joins degrade much more gracefully - they are considered very robust against underestimation. The above mentioned paper illustrates that all major DBMS (including Postgres) tend to underestimate and usually that underestimation increases drastically with the number of joins (see Figures 3+4 of the paper). Now, a simple approach to guarding against bad plans that arise from underestimation could be to use what I would call a nested-loop-conviction-multiplier based on the current depth of the join tree, e.g. for a base table that multiplier would obviously be 1, but would then grow (e.g.) quadratically. That conviction-multiplier would *NOT* be used to skew the cardinality estimates themselves, but rather be applied to the overall nested loop join cost at each particular stage of the plan when comparing it to other more robust join strategies like hash or sort-merge joins. That way when we can be sure to have a good estimate at the bottom of the join tree we treat all things equal, but favor nested loops less and less as we move up the join tree for the sake of robustness. Also, we can expand the multiplier whenever we fall back to using the default cardinality constant as surely all bets are off at that point - we should definitely treat nested loop joins as out of favor in this instance and that could easily be incorporated by simply increasing the conviction-multiplier. What are your thoughts on this simple idea - is it perhaps too simple? Cheers, Ben
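A rough sketch of what the conviction-multiplier heuristic described above could look like. The function names, quadratic growth rate, and constants are all invented for illustration: the cardinality estimates are left alone, and only the unparameterized nested loop's cost is inflated more and more with join depth before being compared against the alternative.

/*
 * Sketch of the conviction-multiplier idea: penalize an unparameterized
 * nested loop by a factor that grows with join depth, and jumps when a
 * default selectivity constant was used somewhere below.
 */
#include <stdbool.h>
#include <stdio.h>

static double
nestloop_conviction_multiplier(int join_depth, bool used_default_selectivity)
{
    double m = 1.0 + 0.5 * join_depth * join_depth;     /* quadratic in depth */

    if (used_default_selectivity)
        m *= 10.0;      /* all bets are off: strongly disfavor the nestloop */
    return m;
}

int main(void)
{
    double nestloop_cost = 900.0;   /* raw estimate, slightly cheaper ... */
    double hashjoin_cost = 1000.0;  /* ... than the hash join alternative */

    for (int depth = 0; depth <= 3; depth++)
    {
        double penalized = nestloop_cost *
            nestloop_conviction_multiplier(depth, false);

        printf("depth %d: penalized nestloop cost %.0f -> %s\n",
               depth, penalized,
               penalized < hashjoin_cost ? "keep nestloop" : "prefer hashjoin");
    }
    return 0;
}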
On Thu, Sep 29, 2022 at 7:32 AM Benjamin Coutu <ben.coutu@zeyos.com> wrote: > Also, we can expand the multiplier whenever we fall back to using the default cardinality constant as surely all bets are off at that point - we should definitely treat nested loop joins as out of favor in this instance and that could easily be incorporated by simply increasing the conviction-multiplier. > > What are your thoughts on this simple idea - is it perhaps too simple? Offhand I'd say it's more likely to be too complicated. Without meaning to sound glib, the first question that occurs to me is "will we also need a conviction multiplier conviction multiplier?". Anything like that is going to have unintended consequences that might very well be much worse than the problem that you set out to solve. Personally I still like the idea of just avoiding unparameterized nested loop joins altogether when an "equivalent" hash join plan is available. I think of it as preferring the hash join plan because it will have virtually the same performance characteristics when you have a good cardinality estimate (probably very often), but far better performance characteristics when you don't. We can perhaps be approximately 100% sure that something like that will be true in all cases, no matter the details. That seems like a very different concept to what you've proposed. -- Peter Geoghegan
On Thu, Sep 29, 2022 at 04:12:06PM -0700, Peter Geoghegan wrote: > Offhand I'd say it's more likely to be too complicated. Without > meaning to sound glib, the first question that occurs to me is "will > we also need a conviction multiplier conviction multiplier?". Anything > like that is going to have unintended consequences that might very > well be much worse than the problem that you set out to solve. > > Personally I still like the idea of just avoiding unparameterized > nested loop joins altogether when an "equivalent" hash join plan is > available. I think of it as preferring the hash join plan because it > will have virtually the same performance characteristics when you have > a good cardinality estimate (probably very often), but far better > performance characteristics when you don't. We can perhaps be > approximately 100% sure that something like that will be true in all > cases, no matter the details. That seems like a very different concept > to what you've proposed. I think the point the original poster was making, and I have made in the past, is that even if two optimizer costs are the same, one might be more penalized by misestimation than the other, and we don't have a good way of figuring that into our plan choices. -- Bruce Momjian <bruce@momjian.us> https://momjian.us EDB https://enterprisedb.com Indecision is a decision. Inaction is an action. Mark Batterson
On Thu, Sep 29, 2022 at 4:27 PM Bruce Momjian <bruce@momjian.us> wrote: > I think the point the original poster as making, and I have made in the > past, is that even of two optimizer costs are the same, one might be > more penalized by misestimation than the other, and we don't have a good > way of figuring that into our plan choices. Right. But that seems fraught with difficulty. I suspect that the costs that the planner attributes to each plan often aren't very reliable in any absolute sense, even when everything is working very well by every available metric. Even a very noisy cost model with somewhat inaccurate selectivity estimates will often pick the cheapest plan, or close enough. Having a cost-based optimizer that determines the cheapest plan quite reliably is one thing. Taking the same underlying information and adding the dimension of risk to it and expecting a useful result is quite another -- that seems far harder. -- Peter Geoghegan
Bruce Momjian <bruce@momjian.us> writes: > I think the point the original poster as making, and I have made in the > past, is that even of two optimizer costs are the same, one might be > more penalized by misestimation than the other, and we don't have a good > way of figuring that into our plan choices. Agreed, but dealing with uncertainty in those numbers is an enormous task if you want to do it right. "Doing it right", IMV, would start out by extending all the selectivity estimation functions to include error bars; then we could have error bars on rowcount estimates and then costs; then we could start adding policies about avoiding plans with too large a possible upper-bound cost. Trying to add such policy with no data to go on is not going to work well. I think Peter's point is that a quick-n-dirty patch is likely to make as many cases worse as it makes better. That's certainly my opinion about the topic. regards, tom lane
On Thu, Sep 29, 2022 at 07:46:18PM -0400, Tom Lane wrote: > Bruce Momjian <bruce@momjian.us> writes: > > I think the point the original poster as making, and I have made in the > > past, is that even of two optimizer costs are the same, one might be > > more penalized by misestimation than the other, and we don't have a good > > way of figuring that into our plan choices. > > Agreed, but dealing with uncertainty in those numbers is an enormous > task if you want to do it right. "Doing it right", IMV, would start > out by extending all the selectivity estimation functions to include > error bars; then we could have error bars on rowcount estimates and > then costs; then we could start adding policies about avoiding plans > with too large a possible upper-bound cost. Trying to add such > policy with no data to go on is not going to work well. > > I think Peter's point is that a quick-n-dirty patch is likely to make > as many cases worse as it makes better. That's certainly my opinion > about the topic. Agreed on all points --- I was thinking error bars too. -- Bruce Momjian <bruce@momjian.us> https://momjian.us EDB https://enterprisedb.com Indecision is a decision. Inaction is an action. Mark Batterson
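For concreteness, a minimal sketch of the error-bars idea being agreed on here: carry a [lo, hi] interval next to each selectivity estimate and combine the intervals as clauses are multiplied up the join tree. The interval arithmetic assumes independent clauses, the numbers are made up, and nothing like this exists in the selectivity code today.

/*
 * Minimal sketch of selectivity estimates with error bars.  Illustrative
 * only; not existing PostgreSQL infrastructure.
 */
#include <stdio.h>

typedef struct Selectivity
{
    double est;     /* point estimate, as today */
    double lo;      /* plausible lower bound */
    double hi;      /* plausible upper bound */
} Selectivity;

static Selectivity
sel_and(Selectivity a, Selectivity b)
{
    /* Independence assumption: multiply the point estimates and both bounds. */
    Selectivity r = { a.est * b.est, a.lo * b.lo, a.hi * b.hi };
    return r;
}

int main(void)
{
    /* Two join clauses that are each individually well estimated. */
    Selectivity s1 = { 0.001, 0.0002, 0.005 };
    Selectivity s2 = { 0.002, 0.0004, 0.010 };
    Selectivity joint = sel_and(s1, s2);
    double cross_product = 1e6 * 1e6;   /* two 1M-row inputs */

    /*
     * The bounds widen multiplicatively, which is where a policy like
     * "avoid plans whose upper-bound cost is too large" could hook in.
     */
    printf("join rows: est %.0f, bounds %.0f .. %.0f\n",
           joint.est * cross_product,
           joint.lo * cross_product,
           joint.hi * cross_product);
    return 0;
}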
On Thu, Sep 29, 2022 at 4:46 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > Agreed, but dealing with uncertainty in those numbers is an enormous > task if you want to do it right. "Doing it right", IMV, would start > out by extending all the selectivity estimation functions to include > error bars; then we could have error bars on rowcount estimates and > then costs; then we could start adding policies about avoiding plans > with too large a possible upper-bound cost. Trying to add such > policy with no data to go on is not going to work well. In general I suspect that we'd be better off focussing on mitigating the impact at execution time. There are at least a few things that we could do there, at least in theory. Mostly very ambitious, long term things. I like the idea of just avoiding unparameterized nested loop joins altogether when an "equivalent" hash join plan is available because it's akin to an execution-time mitigation, despite the fact that it happens during planning. While it doesn't actually change anything in the executor, it is built on the observation that we have virtually everything to gain and nothing to lose during execution, no matter what happens. It seems like a very small oasis of certainty in a desert of uncertainty -- which seems nice, as far as it goes. > I think Peter's point is that a quick-n-dirty patch is likely to make > as many cases worse as it makes better. That's certainly my opinion > about the topic. Right. Though I am actually sympathetic to the idea that users might gladly pay a cost for performance stability -- even a fairly large cost. That part doesn't seem like the problem. -- Peter Geoghegan
On Thu, Sep 29, 2022 at 07:51:47PM -0400, Bruce Momjian wrote: > On Thu, Sep 29, 2022 at 07:46:18PM -0400, Tom Lane wrote: > > Agreed, but dealing with uncertainty in those numbers is an enormous > > task if you want to do it right. "Doing it right", IMV, would start > > out by extending all the selectivity estimation functions to include > > error bars; then we could have error bars on rowcount estimates and > > then costs; then we could start adding policies about avoiding plans > > with too large a possible upper-bound cost. Trying to add such > > policy with no data to go on is not going to work well. > > > > I think Peter's point is that a quick-n-dirty patch is likely to make > > as many cases worse as it makes better. That's certainly my opinion > > about the topic. > > Agreed on all points --- I was thinking error bars too. Actually, if we wanted to improve things in this area, we should have a set of queries that don't choose optimal plans we can test with. We used to see them a lot before we had extended statistics, but I don't remember seeing many recently, let alone a collection of them. I guess that is good. -- Bruce Momjian <bruce@momjian.us> https://momjian.us EDB https://enterprisedb.com Indecision is a decision. Inaction is an action. Mark Batterson
On Fri, 30 Sept 2022 at 13:06, Peter Geoghegan <pg@bowt.ie> wrote: > I like the idea of just avoiding unparameterized nested loop joins > altogether when an "equivalent" hash join plan is available because > it's akin to an execution-time mitigation, despite the fact that it > happens during planning. While it doesn't actually change anything in > the executor, it is built on the observation that we have virtually > everything to gain and nothing to lose during execution, no matter > what happens. I'm not sure if it's a good idea to assume that performing non-parameterised Nested Loops when we shouldn't is the only shape of plan that causes us problems. We also have the case where we assume early start-up plans are favourable. For example: SELECT * FROM t WHERE a = 1 ORDER BY b LIMIT 10; where we have two indexes, one on t(a) and another on t(b). Should we use the t(b) index and filter out the rows that don't match a = 1 and hope we get 10 a=1 rows soon in the t(b) index? Or do we use t(a) and then perform a sort? Best case for using the t(b) index is that we find 10 a=1 rows in the first 10 rows of the index scan; the worst case is that there are no rows with a=1. Having something coded into the cost model is a more generic way of addressing this issue. Providing we design the cost model correctly, we'd be able to address future issues we discover using whichever cost model infrastructure we design for this. I understand that what you propose would be a fast way to fix this issue. However, if we went and changed the join path creation code to not add non-parameterised nested loop paths when other paths exist, then how could we ever dare to put that code back again when we come up with a better solution? David
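A back-of-the-envelope model of David's LIMIT example, with illustrative row counts and formulas: the t(b)-index plan expects to visit roughly limit / selectivity(a = 1) index rows before stopping, which degrades to the whole index when there are no matching rows at all, while the t(a)-index plan always fetches the matching rows and sorts them.

/*
 * Toy model of the early start-up plan risk in the LIMIT example above.
 * Row counts and formulas are illustrative only.
 */
#include <stdio.h>

int main(void)
{
    double table_rows = 10e6;
    double limit = 10.0;
    double selectivities[] = { 0.01, 0.0001, 0.0 };   /* fraction of a = 1 rows */

    for (int i = 0; i < 3; i++)
    {
        double sel = selectivities[i];

        /* Plan 1: walk the t(b) index in order, filter a = 1, stop at LIMIT. */
        double b_plan_rows = (sel > 0.0) ? limit / sel : table_rows;
        if (b_plan_rows > table_rows)
            b_plan_rows = table_rows;

        /* Plan 2: fetch all a = 1 rows via the t(a) index, then sort them. */
        double a_plan_rows = sel * table_rows;

        printf("sel=%.4f: t(b) plan visits ~%.0f rows, t(a) plan sorts ~%.0f rows\n",
               sel, b_plan_rows, a_plan_rows);
    }
    return 0;
}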
> Right. But that seems fraught with difficulty. I suspect that the > costs that the planner attributes to each plan often aren't very > reliable in any absolute sense, even when everything is working very > well by every available metric. Even a very noisy cost model with > somewhat inaccurate selectivity estimates will often pick the cheapest > plan, or close enough. Sure, the absolute cost of a complex plan will always be inaccurate at best. My point is that we can be very confident in the cardinalities of base tables. As the paper states in "3.1. Estimates for Base Tables": "The median q-error is close to the optimal value of 1 for all systems, indicating that the majority of all selections are estimated correctly." Thanks to the statistics we will practically never be off by an order of magnitude when estimating base table cardinalities. The paper also clearly shows (and that certainly coincides with my experience) that those cardinality underestimations grow exponentially as they propagate up the join tree. Given the research I'd stipulate that at any given level of the join tree, the current depth is a reasonable indicator of underestimation. Taking that into account (even if only to mitigate nested loops on higher levels) is IMV a principled approach, and not necessarily a hack. Obviously having something like error bars as proposed by Tom would be even better and perhaps more general, but that is on a whole different level in terms of complexity and I certainly have no idea how we would easily get there.
> In general I suspect that we'd be better off focussing on mitigating > the impact at execution time. There are at least a few things that we > could do there, at least in theory. Mostly very ambitious, long term > things. I think these things are orthogonal. No matter how good the cost model ever gets, we will always have degenerate cases. Having some smarts about that in the executor is surely a good thing, but it shouldn't distract us from improving on the planner front. > > I like the idea of just avoiding unparameterized nested loop joins > altogether when an "equivalent" hash join plan is available because > it's akin to an execution-time mitigation, despite the fact that it > happens during planning. While it doesn't actually change anything in > the executor, it is built on the observation that we have virtually > everything to gain and nothing to lose during execution, no matter > what happens. I agree with you that those plans are too risky. But let's maybe find a more general way of dealing with this. > Right. Though I am actually sympathetic to the idea that users might > gladly pay a cost for performance stability -- even a fairly large > cost. That part doesn't seem like the problem.
> Agreed, but dealing with uncertainty in those numbers is an enormous > task if you want to do it right. "Doing it right", IMV, would start > out by extending all the selectivity estimation functions to include > error bars; then we could have error bars on rowcount estimates and > then costs; then we could start adding policies about avoiding plans > with too large a possible upper-bound cost. Trying to add such > policy with no data to go on is not going to work well. Error bars would be fantastic, no question. But that would make things very complex. A lot of judgment calls would be necessary for the policy behind upper-bound pruning, picking up on Peter's comment about "conviction multiplier of conviction multiplier" ;) Also, the math in deriving those bounds based on the stats and how they propagate up the join tree doesn't seem trivial either. > I think Peter's point is that a quick-n-dirty patch is likely to make > as many cases worse as it makes better. That's certainly my opinion > about the topic. As in my reply to Peter, I think the join level/depth metric is a simple but principled way of dealing with it, given the referenced research. In the first step, we'd use this merely to be more risk-averse towards nested loop joins as we climb up the join tree - we are not fiddling with the cost model itself, nor the join ordering, just when it comes to considering that particular join algorithm. Later this could be expanded to be more broadly scoped. Please do not give up on a simple way to reap most of the fruits just yet.
> Actually, if we wanted to improve things in this area, we should have a > set of queries that don't choose optimal plans we can test with. We used > to see them a lot before we had extended statistics, but I don't > remember seeing many recently, let alone a collection of them. I guess > that is good. In the VLDB paper they actually created their own "Join Order Benchmark", which is publicly available under https://github.com/gregrahn/join-order-benchmark It would probably be more suited for this kind of testing than, e.g. the TPC benchmarks. If there is interest, I could also compile a set of relevant cases based on the message history of the performance mailing list.
On Thu, Sep 29, 2022 at 7:46 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > Agreed, but dealing with uncertainty in those numbers is an enormous > task if you want to do it right. "Doing it right", IMV, would start > out by extending all the selectivity estimation functions to include > error bars; then we could have error bars on rowcount estimates and > then costs; then we could start adding policies about avoiding plans > with too large a possible upper-bound cost. Trying to add such > policy with no data to go on is not going to work well. I think that the point of the paper which started the thread that this discussion branched from was essentially that trying to add such a policy with no data to go on worked extremely well in practice, and other database systems are already doing it, and we're hurting ourselves by not doing it. And specifically what they did was to disfavor unparameterized nested loops. And I think that actually makes a lot of sense. If you think through parameterized nested loops, unparameterized nested loops, hash joins, and merge joins, which is basically all the strategies, and you imagine having many more or fewer rows on one side of the join or the other than you thought, unparameterized nested loops are the standout case. It's the only kind of join where the expense grows as the product of the sizes of the two inputs. The other cases tend to be more like O(n+m) rather than O(nm), and even if there are some lg n or lg m terms in there too they don't tend to make a whole lot of difference in practice. So I think what you would find if you did all of this analysis is that, basically, every time you did the cost computation for a parameterized nested loop, a hash join, or a merge join, the error bars would be whatever they were, and then when you did a cost computation for an unparameterized nested loop, the error bars would be way worse. Like, if we assume that the estimates for each side of a hash join are off by 100x, then the cost will be off by roughly 100x if it still fits in work_mem and by several hundred x if it now spills to disk. But if we assume the same thing for an unparameterized nested loop, the cost is now off by 10000x. And that is massively, massively more, so clearly we should avoid the unparameterized nested loop. But it's unnecessary to do this computation at runtime for every separate unparameterized nested loop: we can do it right here, in a generic way, for every such loop. Now, it's true that the actual risk depends on how certain we are of the estimates for the input rels, but that's difficult to quantify and I'm not convinced it really matters at all. Like, if the input is a base relation with no filter condition, then the error should be small, unless the table size changes a lot between planning and execution. If it's the output of a user-defined aggregate, the error could be really, really large. But that has no impact on the *relative* dangers of the unparameterized nested loop vs. some other join method. If join A is between two input rels whose sizes are probably known fairly precisely, and join B is between two input rels whose sizes we might be drastically wrong about, then a hash join is riskier for join B than it is for join A. But that really does not matter. What *does* matter is that an unparameterized nested loop is riskier for join A than a hash join is for join A; and likewise for join B. 
I think we're kind of just making life complicated for ourselves here by pretending that unparameterized nested loops are part of some general class of uncertainty problems that we need to worry about. In some sense, they are, and David Rowley is right to mention the other one that comes up pretty frequently. But like that list of two is pretty much the whole list. I think we've talked ourselves into believing that this problem is much harder than it really is. Maybe a blanket ban on unparameterized nested loops is too strong (or maybe it's exactly the right thing) but it can't possibly be wrong to think about that case in particular as something we need to solve. It's the only join method that can go quadratic in easy, realistic scenarios -- and it often does. -- Robert Haas EDB: http://www.enterprisedb.com
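To spell out the arithmetic in Robert's message with made-up numbers: when both input cardinalities are underestimated by 100x, a join whose cost grows as n + m is off by about 100x, while an unparameterized nested loop, whose cost grows as n * m, is off by 10000x.

/*
 * The misestimation asymmetry described above, with invented row counts.
 * This is plain arithmetic, not the planner's actual cost model.
 */
#include <stdio.h>

int main(void)
{
    double n_est = 1000.0, m_est = 1000.0;  /* estimated input rows */
    double err = 100.0;                     /* both sides are really 100x bigger */
    double n_act = n_est * err, m_act = m_est * err;

    /* Hash or merge join: roughly linear in the two input sizes. */
    printf("O(n+m): est %.0f, actual %.0f -> off by %.0fx\n",
           n_est + m_est, n_act + m_act, (n_act + m_act) / (n_est + m_est));

    /* Unparameterized nested loop: product of the two input sizes. */
    printf("O(n*m): est %.0f, actual %.0f -> off by %.0fx\n",
           n_est * m_est, n_act * m_act, (n_act * m_act) / (n_est * m_est));
    return 0;
}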
On Fri, Sep 30, 2022 at 10:43 AM Robert Haas <robertmhaas@gmail.com> wrote: > But it's unnecessary to do this computation at runtime for every separate > unparameterized nested loop: we can do it right here, in a generic > way, for every such loop. It's not just that the risks are ludicrously high, of course. The potential benefits must *also* be very low. It's both factors, together. > Now, it's true that the actual risk depends on how certain we are of > the estimates for the input rels, but that's difficult to quantify and > I'm not convinced it really matters at all. We're talking about a problem that is fairly unlikely to occur in general, I think -- let's not forget that. These are presumably rare events that nevertheless cause many real practical problems. If we're going to add error bars, why wouldn't we also need error bars for our error bars? > I think we're kind of just making life complicated for ourselves here > by pretending that unparameterized nested loops are part of some > general class of uncertainty problems that we need to worry about. In > some sense, they are, and David Rowley is right to mention the other > one that comes up pretty frequently. But like that list of two is > pretty much the whole list. I think we've talked ourselves into > believing that this problem is much harder than it really is. +1 -- Peter Geoghegan
On Thu, Sep 29, 2022 at 9:00 PM David Rowley <dgrowleyml@gmail.com> wrote: > I understand that what you propose would be a fast way to fix this > issue. However, if we went and changed the join path creation code to > not add non-parameterised nested loop paths when other paths exist, > then how could we ever dare to put that code back again when we come > up with a better solution? But why would it matter, even then? I don't deny that something like that could make sense, but I don't see why it should be in tension with this proposal. We're talking about a plan shape that is (in practical terms) inherently unreasonable, given the availability of an alternative plan shape. Why wouldn't that continue to be true in every such case, forever? To put it another way, the proposal seems like taking away something that we don't want to have, ever. It seems like a subtractive thing to me. The potential upside of allowing unparameterized nestloop joins seems infinitesimal; zero for all practical purposes. So even with a far more sophisticated framework for "plan riskiness" in place, it would still make sense to treat unparameterized nestloop joins as inherently undesirable. There is perhaps a theoretical sense in which that isn't quite true, but it's true for all practical purposes, which should be enough. -- Peter Geoghegan
On Thu, Sep 29, 2022 at 11:44 PM Benjamin Coutu <ben.coutu@zeyos.com> wrote: > I think these things are orthogonal. I agree that they're orthogonal. I just meant that execution time strategies seem underexplored in general. > No matter how good the cost model ever gets, we will always have degenerate cases. Sure, but the model isn't the problem here, really -- not to me. The problem is that the planner can in some cases choose a plan that is inherently unreasonable, at least in practical terms. You're talking about uncertainties. But I'm actually talking about the opposite thing -- certainty (albeit a limited kind of certainty that applies only to one narrow set of conditions). It's theoretically possible that bogosort will be faster than quicksort in some individual cases. After all, bogosort is O(n) in the best case, which is impossible to beat! But there is no practical sense in which bogosort could ever be better than quicksort. Having fewer choices by just not offering inherently bad choices seems quite unrelated to what you're talking about. For all I know you might be onto something. But it really seems independent to me. -- Peter Geoghegan
> Sure, but the model isn't the problem here, really -- not to me. The > problem is that the planner can in some cases choose a plan that is > inherently unreasonable, at least in practical terms. You're talking > about uncertainties. But I'm actually talking about the opposite thing > -- certainty (albeit a limited kind of certainty that applies only to > one narrow set of conditions). I absolutely agree and support your proposal to simply not generate those paths at all unless necessary. > For all I know you might be onto something. But it really seems > independent to me. Yeah, I'm sorry if I hijacked this thread for something related but technically different. I just wanted to expand on your proposal by taking into account the join depth and also not just talking about unparameterized nested loop joins. The research is very clear that the uncertainty is proportional to the join level, and that is the point I am trying to focus the discussion on. I really encourage everyone to read the VLDB paper. BTW, the unnamed proprietary DBMSs in that paper are the 3 big ones from Washington, California and NY, in that order.
On Fri, Sep 30, 2022 at 2:24 PM Peter Geoghegan <pg@bowt.ie> wrote:
> It's not just that the risks are ludicrously high, of course. The potential benefits must *also* be very low. It's both factors, together.

Hmm, maybe. But it also wouldn't surprise me very much if someone can come up with a test case where a nested loop with a single row (or maybe no rows) on one side or the other is significantly faster than any alternative plan. I believe, though, that even if such cases exist, they are probably relatively few in number compared to the cases where unparameterized nested loops hurt, and the savings are probably small compared to the multiple-orders-of-magnitude slowdowns that you can get when a nested loop goes bad. But they might still be fairly large -- 2x, 3x? -- in relative terms.

In the prior discussion, the only person who showed a case in which he thought that an unparameterized nested loop might be a clear winner was Tom, but it was just sort of a general "this kind of case might be a problem" thing rather than a fully worked out example with real timings. Perhaps someone ought to try to characterize the kinds of cases he mentioned, to help us get a clearer feeling about what, if anything, we're gaining from the current scheme.

--
Robert Haas
EDB: http://www.enterprisedb.com
On Sun, Oct 2, 2022 at 3:43 AM Robert Haas <robertmhaas@gmail.com> wrote:
> On Fri, Sep 30, 2022 at 2:24 PM Peter Geoghegan <pg@bowt.ie> wrote:
> > It's not just that the risks are ludicrously high, of course. The potential benefits must *also* be very low. It's both factors, together.
>
> Hmm, maybe. But it also wouldn't surprise me very much if someone can come up with a test case where a nested loop with a single row (or maybe no rows) on one side or the other is significantly faster than any alternative plan.

That's certainly possible, but wouldn't the difference all come from fixed startup costs? If we're talking about a single row, with a minimal test case, then the potential downside of this more conservative strategy might indeed amount to something like a 2x or 3x slowdown, if we look at it in isolation.

But why measure it that way? I think that absolute differences like milliseconds of execution time are much more relevant. Real production databases have many queries with very diverse characteristics -- there is a lot going on at any given moment. The proportion of queries that will be affected either way by avoiding unparameterized nested loop joins is probably going to be very small. Nobody is going to notice if only a small subset of all queries is maybe 1 ms or 2 ms slower. As long as it's well within the margin of noise in 100% of all cases, it really shouldn't matter.

AFAICT the choice is one of "bounded, low upside versus unbounded, high downside".

> I believe, though, that even if such cases exist, they are probably relatively few in number compared to the cases where unparameterized nested loops hurt, and the savings are probably small compared to the multiple-orders-of-magnitude slowdowns that you can get when a nested loop goes bad. But they might still be fairly large -- 2x, 3x? -- in relative terms.

I suspect it won't even matter if disallowing unparameterized nested loop joins loses on average. I am reminded of this:

https://en.wikipedia.org/wiki/St._Petersburg_paradox#Expected_utility_theory

--
Peter Geoghegan
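To put toy numbers on the bounded-upside point, consider the following back-of-the-envelope calculation. The formulas and constants are invented for illustration (this is not the planner's cost model), but they show why the gap in the single-row case comes out as a small constant factor of per-tuple overhead rather than orders of magnitude:

#include <stdio.h>

/*
 * Toy illustration of why the upside of an unparameterized nested loop is
 * bounded when the inner side really is tiny.  All constants and formulas
 * are simplified stand-ins, not PostgreSQL's costing code.
 */
int
main(void)
{
    double outer_rows = 1000.0;
    double inner_rows = 1.0;            /* the "single row" case */
    double scan_cost_per_row = 0.01;    /* assumed per-row scan cost */
    double hash_build_per_row = 0.02;   /* assumed hashing overhead */
    double hash_probe_per_row = 0.015;  /* assumed probe overhead */

    /* Nested loop: rescan the (tiny) inner once per outer row. */
    double nl = outer_rows * inner_rows * scan_cost_per_row;

    /* Single-batch hash join: build once, probe once per outer row. */
    double hj = inner_rows * hash_build_per_row +
                outer_rows * hash_probe_per_row;

    printf("nested loop: %.2f   hash join: %.2f   ratio: %.2fx\n",
           nl, hj, hj / nl);
    return 0;
}

With these made-up numbers the hash join is about 1.5x more expensive, which is the shape of the argument above: a bounded, modest relative penalty in the best case for the nested loop, versus an unbounded penalty when its cardinality estimate is wrong.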
On Fri, Sep 30, 2022 at 3:19 PM Benjamin Coutu <ben.coutu@zeyos.com> wrote:
> > For all I know you might be onto something. But it really seems independent to me.
>
> Yeah, I'm sorry if I hijacked this thread for something related but technically different.

I certainly wouldn't say that you hijacked the thread. I'm glad that you revived the discussion, in fact.

The way that I've framed the problem is probably at least somewhat controversial. In fact I'm almost certain that at least one or two people will flat out disagree with me. But even if everybody else already thought about unparameterized nested loop joins in the same terms, it might still be useful to make the arguments that you've made.

What I'm saying is that the probability of "getting it right" is virtually irrelevant in the case of these unparameterized nested loop join plans specifically. Any probability that's less than 1.0 is already unacceptable, more or less. A probability of 1.0 is never attainable in the real world, no matter what, so why should the true probability (whatever that means) matter at all? The appropriate course of action will still be "just don't do that, ever".

To me this dynamic seems qualitatively different to other cases, where we might want to give some weight to uncertainty. Understanding where the boundaries lie between those trickier cases and this simpler case seems important and relevant to me.

--
Peter Geoghegan
On 29/9/2022 21:32, Benjamin Coutu wrote:
> I'd like to revamp this important discussion.
>
> As is well described in this fairly recent paper here https://www.vldb.org/pvldb/vol9/p204-leis.pdf (which also looks at Postgres), "estimation errors quickly grow as the number of joins increases, and that these errors are usually the reason for bad plans" - I think we can all get behind that statement.
>
> While nested loop joins work great when cardinality estimates are correct, they are notoriously bad when the optimizer underestimates, and they degrade very fast in such cases - the good vs. bad here is very asymmetric. On the other hand, hash joins degrade much more gracefully - they are considered very robust against underestimation. The above-mentioned paper illustrates that all major DBMS (including Postgres) tend to underestimate, and usually that underestimation increases drastically with the number of joins (see Figures 3+4 of the paper).
>
> Now, a simple approach to guarding against bad plans that arise from underestimation could be to use what I would call a nested-loop-conviction-multiplier based on the current depth of the join tree, e.g. for a base table that multiplier would obviously be 1, but would then grow (e.g.) quadratically. That conviction-multiplier would *NOT* be used to skew the cardinality estimates themselves, but rather be applied to the overall nested loop join cost at each particular stage of the plan when comparing it to other more robust join strategies like hash or sort-merge joins. That way, when we can be sure to have a good estimate at the bottom of the join tree we treat all things equal, but favor nested loops less and less as we move up the join tree, for the sake of robustness.
> Also, we can expand the multiplier whenever we fall back to using the default cardinality constant, as surely all bets are off at that point - we should definitely treat nested loop joins as out of favor in this instance, and that could easily be incorporated by simply increasing the conviction-multiplier.
>
> What are your thoughts on this simple idea - is it perhaps too simple?

In my practice, a parameterized nested loop reduces execution time, sometimes drastically. If your query touches a lot of tables but extracts only a tiny part of the data, and you have good index coverage, PNL works great. Moreover, I have pondered extending parameterization through subqueries and groupings.

What could you say about a different way: hybrid join? In MS SQL Server, they have such a feature [1], and, according to their description, it requires low overhead. They start from HashJoin and switch to NestLoop if the inner input contains too few tuples. It solves the issue, doesn't it?

[1] https://techcommunity.microsoft.com/t5/sql-server-blog/introducing-batch-mode-adaptive-joins/ba-p/385411

--
regards, Andrey Lepikhov
Postgres Professional
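As a thought experiment, the conviction-multiplier heuristic quoted above might look roughly like the sketch below. The quadratic growth rate, the extra penalty when a default selectivity was used, and all of the names are assumptions made purely for illustration; this is not planner code, and the cardinality estimates themselves are deliberately left untouched:

#include <stdio.h>

/*
 * Toy sketch of the "conviction multiplier": the nested loop's cost is
 * inflated by a factor that grows with the join depth before it is
 * compared against more robust strategies.  Illustrative values only.
 */
static double
nestloop_conviction_multiplier(int join_depth, int used_default_selectivity)
{
    double mult = 1.0 + 0.5 * (double) join_depth * (double) join_depth;

    /* All bets are off once we fell back to a default selectivity. */
    if (used_default_selectivity)
        mult *= 10.0;
    return mult;
}

static const char *
choose_join(double nestloop_cost, double hashjoin_cost,
            int join_depth, int used_default_selectivity)
{
    double penalized = nestloop_cost *
        nestloop_conviction_multiplier(join_depth, used_default_selectivity);

    return (penalized < hashjoin_cost) ? "nestloop" : "hashjoin";
}

int
main(void)
{
    /* Same raw costs, compared at increasing join depth. */
    for (int depth = 0; depth <= 4; depth++)
        printf("depth %d: %s\n", depth,
               choose_join(900.0, 1000.0, depth, 0));
    return 0;
}

With these particular constants, the same raw costs flip from favoring the nested loop at the bottom of the join tree to favoring the hash join one level up, which is the asymmetry Benjamin is after.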
On Wed, 20 Sept 2023 at 19:56, Andrey Lepikhov <a.lepikhov@postgrespro.ru> wrote:
> What could you say about a different way: hybrid join? In MS SQL Server, they have such a feature [1], and, according to their description, it requires low overhead. They start from HashJoin and switch to NestLoop if the inner input contains too few tuples. It solves the issue, doesn't it?

A complexity which you may not be considering here is that Nested Loop joins always preserve the tuple order from the outer side of the join, whereas hash joins will not do this when multi-batching. I've no idea how the SQL Server engineers solved that.

David

> [1] https://techcommunity.microsoft.com/t5/sql-server-blog/introducing-batch-mode-adaptive-joins/ba-p/385411
On Wed, Sep 20, 2023, at 4:49 PM, David Rowley wrote:
> On Wed, 20 Sept 2023 at 19:56, Andrey Lepikhov <a.lepikhov@postgrespro.ru> wrote:
>> What could you say about a different way: hybrid join? In MS SQL Server, they have such a feature [1], and, according to their description, it requires low overhead. They start from HashJoin and switch to NestLoop if the inner input contains too few tuples. It solves the issue, doesn't it?
>
> A complexity which you may not be considering here is that Nested Loop joins always preserve the tuple order from the outer side of the join, whereas hash joins will not do this when multi-batching.

My idea here is the same as what the MS SQL guys did: prefetch some predefined number of tuples from the HashJoin inner and, if the planner has made a mistake and overestimated it, move the hash join's inner to the NestLoop as its outer.

The opposite strategy for underestimation -- starting with a NestLoop and switching to a HashJoin -- looks more difficult, but the main question is: is it worth researching?

> I've no idea how the SQL Server engineers solved that.
>> [1] https://techcommunity.microsoft.com/t5/sql-server-blog/introducing-batch-mode-adaptive-joins/ba-p/385411

--
Regards, Andrei Lepikhov
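A minimal sketch of that prefetch-and-switch idea, loosely modeled on the adaptive join described in [1]: buffer up to a threshold of inner tuples, and only commit to building the hash table once the threshold is exceeded. Everything here (Tuple, the stub row source, the threshold, the join entry points) is hypothetical and not PostgreSQL executor code:

#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

#define ADAPTIVE_THRESHOLD 4   /* switch point; an assumption, not a tuned value */

typedef struct Tuple { int id; } Tuple;

/* Stub row source for the join's inner (would-be hash build) side. */
static const Tuple inner_rel[] = {{1}, {2}};
static size_t inner_pos = 0;

static bool
fetch_inner_tuple(Tuple *out)
{
    if (inner_pos >= sizeof(inner_rel) / sizeof(inner_rel[0]))
        return false;
    *out = inner_rel[inner_pos++];
    return true;
}

/* Stub join strategies: real code would join against the outer side here. */
static void
run_nested_loop(const Tuple *outer_rows, size_t nrows)
{
    printf("nested loop, using %zu buffered inner rows as its outer side\n", nrows);
}

static void
run_hash_join(const Tuple *buffered, size_t nbuffered)
{
    printf("hash join, hash table seeded with %zu buffered rows\n", nbuffered);
}

static void
adaptive_join(void)
{
    Tuple   buffer[ADAPTIVE_THRESHOLD];
    size_t  n = 0;

    /* Prefetch up to the threshold from the would-be hash build side. */
    while (n < ADAPTIVE_THRESHOLD && fetch_inner_tuple(&buffer[n]))
        n++;

    if (n < ADAPTIVE_THRESHOLD)
        /* Inner turned out tiny: reuse the buffered rows as the outer side
         * of a nested loop, as in the proposal above. */
        run_nested_loop(buffer, n);
    else
        /* Inner is at least threshold-sized: fall through to the hash join,
         * seeding the build with the rows already buffered. */
        run_hash_join(buffer, n);
}

int
main(void)
{
    adaptive_join();
    return 0;
}

As David points out above, a real implementation would also have to ensure the switch cannot change properties the upper plan nodes rely on, such as the tuple ordering that a multi-batch hash join would not preserve.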