Re: a path towards replacing GEQO with something better - Mailing list pgsql-hackers

From:           Tomas Vondra
Subject:        Re: a path towards replacing GEQO with something better
Msg-id:         2e930f7d-a3ea-793b-830b-54247f1a39e5@enterprisedb.com
In response to: a path towards replacing GEQO with something better
                (John Naylor <john.naylor@enterprisedb.com>)
Responses:      Re: a path towards replacing GEQO with something better
List:           pgsql-hackers
Hi,

On 6/10/21 3:21 AM, John Naylor wrote:
> Hi,
>
> On occasion it comes up that the genetic query optimizer (GEQO)
> doesn't produce particularly great plans, and is slow ([1] for
> example). The only alternative that has gotten as far as a prototype
> patch (as far as I know) is simulated annealing some years ago, which
> didn't seem to get far.
>
> The join problem as it pertains to Postgres has been described within
> the community in [Gustaffson, 2017] and [Stroffek & Kovarik, 2007].
>
> The fact that there is much more interest than code in this area
> indicates that this is a hard problem. I hadn't given it much thought
> myself until by chance I came across [Neumann, 2018], which describes
> a number of interesting ideas. The key takeaway is that you want a
> graceful transition between exhaustive search and heuristic search.
> In other words, if the space of possible join orderings is just
> slightly larger than the maximum allowed exhaustive search, then the
> search should be *almost* exhaustive.

Yeah, I think this is one of the places with the annoying "cliff edge"
behavior in our code base, so an alternative that would degrade a bit
more gracefully would be welcome.

I only quickly read the [Neumann, 2018] paper over the weekend, and
overall it seems like a very interesting/promising approach. Of course,
the question is how well it can be combined with the rest of our code,
and various other details from real-world queries (papers often ignore
some of those bits for simplicity).

> This not only increases the chances of finding a good plan, but also
> has three engineering advantages I can think of:
>
> 1) It's natural to re-use data structures etc. already used by the
> existing dynamic programming (DP) algorithm. Framing the problem as
> extending DP greatly lowers the barrier to making realistic progress.
> If the problem is framed as "find an algorithm as a complete drop-in
> replacement for GEQO", it's a riskier project in my view.

True.

> 2) We can still keep GEQO around (with some huge limit by default)
> for a few years as an escape hatch, while we refine the replacement.
> If there is some bug that prevents finding a plan, we can emit a
> WARNING and fall back to GEQO. Or if a user encounters a regression
> in a big query, they can lower the limit to restore the plan they had
> in an earlier version.

Not sure. Keeping significant amounts of code may not be free - both
for maintenance and new features. It'd be a bit sad if someone proposed
improvements to join planning, but had to do 2x the work to support it
in both the DP and GEQO branches, or risk incompatibility.

OTOH maybe this concern is unfounded in practice - I don't think we've
done very many big changes to geqo in the last few years.

> 3) It actually improves the existing exhaustive search, because the
> complexity of the join order problem depends on the query shape: a
> "chain" shape (linear) vs. a "star" shape (as in star schema), for
> the most common examples. The size of the DP table grows like this
> (for n >= 4):
>
> Chain: (n^3 - n) / 6 (including bushy plans)
> Star:  (n - 1) * 2^(n - 2)
>
>  n   chain    star
> --------------------
>  4      10      12
>  5      20      32
>  6      35      80
>  7      56     192
>  8      84     448
>  9     120    1024
> 10     165    2304
> 11     220    5120
> 12     286   11264
> 13     364   24576
> 14     455   53248
> 15     560  114688
> ...
> 64   43680  290536219160925437952
>
> The math behind this is described in detail in [Ono & Lohman, 1990].
> I verified this in Postgres by instrumenting the planner to count how
> many times it calls make_join_rel().
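As a quick sanity check of the arithmetic - purely illustrative, nothing
to do with the planner or the patch - the two formulas are easy to
evaluate with a few lines of throwaway Python:

    # evaluates the two growth formulas quoted above ([Ono & Lohman, 1990]);
    # this is not planner code
    def chain(n):
        # linear join graph, bushy plans included
        return (n**3 - n) // 6

    def star(n):
        # star-schema join graph
        return (n - 1) * 2**(n - 2)

    for n in list(range(4, 16)) + [64]:
        print(f"{n:3d} {chain(n):8d} {star(n):24d}")

This reproduces the table above, including the 64-relation star case.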
So, did you verify it for a star query with 64 relations? ;-)

> Imagine having a "join enumeration budget" that, if exceeded,
> prevents advancing to the next join level. Given the above numbers
> and a query with some combination of chain and star shapes, a budget
> of 400 can guarantee an exhaustive search when there are up to 8
> relations. For a pure chain join, we can do an exhaustive search on
> up to 13 relations, for a similar cost of time and space. Out of
> curiosity I tested HEAD with a chain query having 64 tables found in
> the SQLite tests [2] and found exhaustive search to take only twice
> as long as GEQO. If we have some 30-way join, and some (> 400)
> budget, it's actually possible that we will complete the exhaustive
> search and get the optimal plan. This is a bottom-up way of
> determining the complexity. Rather than configuring a
> number-of-relations threshold and possibly have exponential behavior
> blow up in their faces, users can configure something that somewhat
> resembles the runtime cost.

Sounds reasonable in principle, I think. This reminds me of the
proposals to have a GUC that'd determine how much effort the planner
should invest into various optimizations. For OLTP it might be quite
low, for large OLAP queries it'd be economical to spend more time
trying some more expensive optimizations.

The challenge of course is how / in what units / to define the budget,
so that it's meaningful and understandable for users. Not sure if
"number of join rels generated" will be clear enough for users. But it
seems good enough for PoC / development, and hopefully people won't
have to tweak it very often.

For JIT we used the query cost, which is a term users are familiar
with, but that's possible because we make the decision after the plan
is built/costed. That doesn't work for join order search :-(

> Now, I'll walk through one way that a greedy heuristic can integrate
> with DP. In our 30-way join example, if we use up our budget and
> don't have a valid plan, we'll break out of DP at the last join level
> we completed. Since we already have built a number of joinrels, we
> build upon that work as we proceed. The approach I have in mind is
> described in [Kossmann & Stocker, 2000], which the authors call
> "iterative dynamic programming" (IDP). I'll describe one of the
> possible variants here. Let's say we only got as far as join level 8,
> so we've created up to 8-way joinrels. We pick the best few (maybe
> 5%) of these 8-way joinrels by some measure (doesn't have to be the
> full cost model) and on top of each of them create a full plan
> quickly: At each join level, we only pick one base relation (again by
> some measure) to create one new joinrel and then move to the next
> join level. This is very fast, even with hundreds of relations.
>
> Once we have one valid, complete plan, we can technically stop at any
> time (Coding this much is also a good proof-of-concept). How much
> additional effort we expend to find a good plan could be another
> budget we have. With a complete plan obtained quickly, we also have
> an upper bound on the measure of the cheapest overall plan, so with
> that we can prune any more expensive plans as we iterate through the
> 8-way joinrels. Once we have a set of complete plans, we pick some of
> them to improve the part of the plan picked during the greedy step.
> For some size k (typically between 4 and 7), we divide the
> greedy-step part of the join into k-sized sections. So with our
> 30-table join where we started with an 8-way joinrel, we have 22
> tables. If k=5, we run standard dynamic programming (including the
> standard cost model) on four 5-table sections and then on the last
> 2-table section.
>
> You can also think of it like this: We quickly find 5 tables that
> likely would be good to join next, find the optimal join order among
> the 5, then add that to our joinrel. We keep doing that until we get
> a valid plan. The only difference is, performing the greedy step to
> completion allows us to prune subsequent bad intermediate steps.
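To make sure I'm reading the budget + greedy-completion part correctly,
here is a toy sketch of how I understand it. Frozensets of relation ids
stand in for joinrels, cost() is a placeholder measure, the 5% cutoff
is the "best few" guess from above, it treats the join graph as a
clique (no join-clause restrictions), and it stops short of the k-sized
re-optimization step - so none of this corresponds to actual planner
structures or to the patch:

    # toy sketch only: frozensets of relation ids stand in for joinrels,
    # cost() is a placeholder measure, the cutoff fraction is made up
    def plan_join_order(rels, cost, budget=400, keep_frac=0.05):
        rels = frozenset(rels)
        # level 1: base relations; 'table' maps a relation set to the
        # cheapest cost found for it so far
        table = {frozenset([r]): cost(frozenset([r])) for r in rels}
        levels = {1: list(table)}
        used = 0

        # bottom-up DP, counting joinrel-creation attempts against the budget
        for lvl in range(2, len(rels) + 1):
            levels[lvl] = []
            for k in range(1, lvl // 2 + 1):
                for left in levels[k]:
                    for right in levels[lvl - k]:
                        if left & right:
                            continue
                        used += 1           # one make_join_rel()-ish call
                        joined = left | right
                        c = table[left] + table[right] + cost(joined)
                        if joined not in table:
                            levels[lvl].append(joined)
                            table[joined] = c
                        elif c < table[joined]:
                            table[joined] = c
            if used > budget:
                break
        else:
            return table[rels]              # exhaustive search finished

        # budget exceeded after level 'lvl': greedily complete the best
        # few partial joinrels, adding one base relation at a time
        best = None
        partial = sorted(levels[lvl], key=table.get)
        for joinrel in partial[:max(1, int(len(partial) * keep_frac))]:
            cur, c = joinrel, table[joinrel]
            while cur != rels:
                nxt = min((r for r in rels if r not in cur),
                          key=lambda r: cost(cur | {r}))
                cur = cur | {nxt}
                c += cost(cur)
            best = c if best is None else min(best, c)
        return best

    # toy usage: 10 relations, "cost" proportional to intermediate size
    sizes = {name: 10 * (i + 1) for i, name in enumerate("abcdefghij")}
    print(plan_join_order(sizes, lambda s: sum(sizes[r] for r in s)))

The only parts that matter for the discussion are the budget check
after each join level and the greedy completion at the end; the rest is
scaffolding.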
I haven't read the [Kossmann & Stocker, 2000] paper yet, but the
[Neumann, 2018] paper seems to build on it, and it seems to work with
much larger subtrees of the join tree than k=5.

> By "some measure" above I'm being a bit hand-wavy, but at least in
> the literature I've read, fast heuristic algorithms seem to use
> simpler and cheaper-to-compute metrics like intermediate result size
> or selectivity, rather than a full cost function. That's a detail to
> be worked out. Also, it must be said that in choosing among
> intermediate steps we need to be careful to include things like:
>
> - interesting sort orders
> - partition-wise joins
> - parameterized paths

Yeah. I think this is going to be somewhat tricky - the paper seems to
have very few details about dealing with criteria like this. OTOH the
query plans for TPC-H/TPC-DS etc. seem to be quite good.

What I find fairly interesting is the section in [Neumann, 2018] about
cardinality estimates, and the quality of query plans when the
estimates are off. The last paragraph of section 6.5 essentially says
that despite poor estimates, the proposed algorithm performs better
than the simple (and cheap) heuristics. I'm not sure what to think
about that, because my "intuitive" understanding is that the more
elaborate the planning is, the more errors it can make when the
estimates are off.


regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company