Re: should we have a fast-path planning for OLTP starjoins? - Mailing list pgsql-hackers
From | Tom Lane |
---|---|
Subject | Re: should we have a fast-path planning for OLTP starjoins? |
Date | |
Msg-id | 3132376.1758656765@sss.pgh.pa.us Whole thread Raw |
In response to | Re: should we have a fast-path planning for OLTP starjoins? (Tomas Vondra <tomas@vondra.me>) |
List | pgsql-hackers |
[ sorry for ridiculously slow response to this ] Tomas Vondra <tomas@vondra.me> writes: > Here's a patch trying to do it more like this - by manipulating the > lists describing the join problems, before it's passed the the actual > join search algorithm (which is where the PoC patch did this). > I wonder if this is roughly the place where you imagined this would be > done, or if you envision some other issue with this approach. Cool. This is proof-of-concept that manipulating the joinlist can do what we need done. So we can move on to what heuristics we need to use. > I initially tried to manipulate the joinlist much earlier - pretty much > right at the end of deconstruct_jointree. But that turned out to be way > too early. To identify dimensions etc. we need to check stuff about > foreign keys, join clauses, ... and that's not available that early. > So I think this needs to happen much later in query_planner(), and the > patch does it right before the make_one_rel() call. Maybe that's too > late, but it needs to happen after match_foreign_keys_to_quals(), as it > relies on some of the FK info built by that call. Maybe we could call > match_foreign_keys_to_quals() earlier, but I don't quite see any > benefits of doing that ... I don't have a problem with doing it where you did it, but the comment should explain the placement. What you do have in the comment mostly belongs with the code, too; it's not the business of the caller. So in planmain.c something like + /* + * Try to simplify the join search problem for starjoin-like joins. + * This step relies on info about FK relationships, so we can't do it + * till after match_foreign_keys_to_quals(). + */ would be more appropriate IMO. I'd be slightly inclined to put the GUC test there, too: + if (enable_starjoin_join_search) + joinlist = starjoin_adjust_joins(root, joinlist); I agree that you need to worry about join order restrictions, and that it's not immediately clear how to do that. join_is_legal would work if we could call it, but the problem is that at this stage we'll only have RelOptInfos for base rels not join rels. If we have a joinlist (A B C D) and we are considering treating C as a dimension table, then the questions we have to ask are: (a) is it okay to build the (A B D) join without C? (b) is it okay to join (A B D) to C? In this simple case, I think (b) must be true if (a) is, but I'm not quite sure that that's so in more complex cases with multiple candidates for dimension tables. In any case, join_is_legal won't help us if we don't have join RelOptInfos. I'm inclined to start by using has_join_restriction: if that says "false" for a candidate dimension table, it should be safe to postpone the join to the dimension table. We might be able to refine that later. > The second example (create-2/select-2) is quite different, in that it's > nor a starjoin schema. It still joins one "main" table with multiple > "dimensions", but the FKs go in the other direction (to a single column > in the main table). But it has a very similar bottleneck - the order of > the joins is expensive, but it often does not matter very much, because > the query matches just one row anyway. And even if it returns more rows, > isn't the join order determined just by the selectivity of each join? > Maybe the starjoin optimization could be made to work for this type too? Yeah, I'm slightly itchy about relying on FKs in this heuristic at all; it doesn't seem like quite the right thing. I think we do want one side of the join to be joining to a PK or at least a unique index, but I'm not sure we need to insist on there being an FK relationship. A couple of minor coding notes: * There's no point in doing anything (except recursing) if the joinlist contains fewer than 3 items, and maybe as a further heuristic this shouldn't kick in till later yet, like 5 or so items. When there are just a few items, the possibility of missing the best plan seems to outweigh the minimal savings in plan time. * Joinlists never contain anything but RangeTblRefs and sub-lists. See make_rel_from_joinlist. * Your reconstructed joinlist is overly complicated. Instead of + newlist = list_make2(newlist, list_make1(lfirst(lc))); you could just do + newlist = list_make2(newlist, lfirst(lc)); because a single-element subproblem is useless. I notice that the patch doesn't apply cleanly anymore because of the introduction of guc_parameters.dat. So here's a v3 that rebases over that, and I took the liberty of fixing the joinlist construction as above, but I didn't do anything else. regards, tom lane diff --git a/src/backend/optimizer/plan/analyzejoins.c b/src/backend/optimizer/plan/analyzejoins.c index 2a3dea88a94..5417c10fbf8 100644 --- a/src/backend/optimizer/plan/analyzejoins.c +++ b/src/backend/optimizer/plan/analyzejoins.c @@ -51,6 +51,7 @@ typedef struct } SelfJoinCandidate; bool enable_self_join_elimination; +bool enable_starjoin_join_search; /* local functions */ static bool join_is_removable(PlannerInfo *root, SpecialJoinInfo *sjinfo); @@ -2514,3 +2515,427 @@ remove_useless_self_joins(PlannerInfo *root, List *joinlist) return joinlist; } + +/* + * starjoin_match_to_foreign_key + * Try to match a join to a FK constraint. + * + * For a relation to be a dimension (for the starjoin heuristics), it needs + * to be joined through a FK constraint. The dimension is expected to be + * on the PK side of the join. The relation must not have any additional + * join clauses, beyond those matching the foreign key. + * + * We already have a list of relevant foreign keys, and we use that info + * for selectivity estimation in get_foreign_key_join_selectivity(). And + * we're actually doing something quite similar here. + * + * XXX Do we need to worry about the join type, e.g. inner/outer joins, + * semi/anti? get_foreign_key_join_selectivity() does care about it, and + * ignores some of those cases. Maybe we should too? + * + * XXX Check there are no other join clauses, beyond those matching the + * foreign key. But do we already have the joininfo at this point? Some + * of this stuff gets build only during the join order search later. + * The match_foreign_keys_to_quals() probably needs to be aware of all + * this, so how does it do that? + */ +static bool +starjoin_match_to_foreign_key(PlannerInfo *root, RelOptInfo *rel) +{ + ListCell *lc; + + /* Consider each FK constraint that is known to match the query */ + foreach(lc, root->fkey_list) + { + ForeignKeyOptInfo *fkinfo = (ForeignKeyOptInfo *) lfirst(lc); + int nmatches = 0; + + /* rel is not the referenced table of the FK */ + if (fkinfo->ref_relid != rel->relid) + continue; + + /* + * Do we have a match for each key of the FK? + * + * XXX get_foreign_key_join_selectivity checks EquivalenceClasses, + * we should probably (definitely) do that here too. + * + * XXX We should check that all the clauses have the same relation + * on the other side (for multi-column keys). And that there are + * no other join clauses other than those matching the FK. + * + * XXX Do we need to check that the FK side of the join (i.e. the fact + * table) has the columns referenced as NOT NULL? Otherwise we could + * have a FK join that reduces the cardinality, which is one of + * the arguments why it's fine to move the join (that it doesn't + * change the cardinality). But if the join is LEFT JOIN, this + * should be fine too - but do we get here with LEFT JOINs? + * + * XXX Do we need to check if the other side of the FK is in the + * current join list? Maybe it's in some later one? + */ + for (int i = 0; i < fkinfo->nkeys; i++) + { + bool has_matching_clause = false; + + /* + * Is there a clause matching this FK key? + * + * XXX We need to make sure it's a valid match, e.g. that the + * same referencing table matches all keys in a composite FK, + * and so on. + * + * XXX Do we need to check some relationship to other rels in + * the same jointree item? E.g. the referencing table should + * not be a dimensions we already removed. + */ + if ((fkinfo->rinfos[i] != NULL) || (fkinfo->eclass[i] != NULL)) + { + has_matching_clause = true; + nmatches++; + continue; + } + + /* found a FK key without a matching join clause, ignore the FK */ + if (has_matching_clause) + break; + } + + /* matched all FK keys */ + if (nmatches == fkinfo->nkeys) + { + return true; + } + } + + return false; +} + + +/* + * starjoin_is_dimension + * Determine if a range table entry is a dimension in a starjoin. + * + * To be considered a dimension for the simplified join order search, the + * join must not affect the cardinality of the join. We ensure that by + * requiring a couple things: + * + * 1) the join clause has to match a FK (that is, the fact does have at + * most one matching row in the dimension) + * + * 2) the FK side (the fact table) should be marked as NOT NULL, so that + * we know there's exactly one dimension row for each fact table row + * + * 3) there must be no additional restrictions on the relation (that + * might eliminate some of the rows, reducing the cardinality) + * + * XXX The Implementation is incomplete. It probably needs more thought, + * considering some join types would allow relaxing some of the checks + * (e.g. outer joins may not require checking (2) or possibly even (3), + * depending on where the condition is, what varnullingrels it has). + * + * XXX I wonder if we could handle (3) by ordering the dimensions by the + * selectivity of the restriction. There are no join clauses between the + * dimensions (ignoring the snowflake joins, but even there the clauses + * don't go between branches), so the selectivity could be treated as + * a measure of how much it shrinks the join result. So we could just + * sort the dimensions by this, starting with the lowest selectivity + * (close to 0.0), and ending with dimensions without restrictions (in + * which case the selectivity is 1.0). + * + * XXX If the join in INNER, and the fact side has NULL values in the + * join key, we might consider nullfrac as restriction. + * + * XXX I'm not sure how careful this needs to be about join order + * restrictions. Maybe it should call have_relevant_joinclause and + * have_join_order_restriction, to ensure the join order is OK? + * + * The optimizer/README is not very clear about this, but maybe it's + * a too specific question. It seems to say the relations in those + * lists can be joined in any order (lines 94 and 106). Maybe that's + * not what it means, or I'm misunderstanding it. + * + * It however seems has_join_restrictions() in join_search_one_level() + * forces the code to look only at "earlier" rels in the list + * + * first_rel = foreach_current_index(r) + 1 + * + * So maybe we just need to stop once we find a rel with a restriction, + * as determined byhas_join_restrictions()? + * + * But there's also join_is_legal() to check legality of joins, with + * LEFT/RIGHT joins, and IN/EXISTS clauses. See README line 188. And it + * also looks-up the SpecialJoinInfo for the join. So maybe we should + * lookup RelOptInfo for both sides of the join, and call join_is_legal + * on that? Might be too expensive, though. Maybe do that only when + * has_join_restrictions already says yes? + * + * Maybe we should use has_join_restrictions(), but in a different way. + * We could still treat rels with restrictions as dimensions, and move + * that to the separate list (that doesn't change the join order), but + * stop once we hit the first non-dimension with a restriction? Because + * if any relation after that was a dimention, we wouldn't be able to + * move it to the separate list. It'd change the join order in a way + * that might violate the restriction. I believe that's the idea behind + * first_rel in join_search_one_level(), but maybe not. + * + * Perhaps have_join_order_restriction and have_relevant_joinclause are + * useful for this, rather than has_join_restrictions? We might look at + * actual pairs of relations, and/or check there's no join order + * restriction with respect to the relations we skipped/moved to the + * list of dimension? + * + * AFAICS it's just the skipping that can break the order restrictions? + * Adding something to the list of dimensions keeps the order (at least + * with respect to the rels after it). + */ +static bool +starjoin_is_dimension(PlannerInfo *root, RangeTblRef *rtr) +{ + Index rti = rtr->rtindex; + RangeTblEntry *rte = root->simple_rte_array[rti]; + RelOptInfo *rel = root->simple_rel_array[rti]; + + /* only plain relations can be dimensions (we need FK/PK join) */ + if ((rte->rtekind != RTE_RELATION) || + (rel->reloptkind != RELOPT_BASEREL)) + return false; + + /* + * Does it have any conditions/restrictions that may affect the number + * of rows matched? If yes, don't treat as dimension. + * + * Dimensions in a starjoin may have restrictions, but that means it'll + * change cardinality of the joins (reduce it), so it may be better to + * join it early. We leave it to the regular join order planning. The + * expectation is that most dimensions won't have extra restrictions. + * + * XXX Should we check some other fields, like lateral references, and + * so on? Or is that unnecessary? What about eclasses? + */ + if (rel->baserestrictinfo != NIL) + return false; + + /* + * See if the join clause matches a foreign key. It should match a + * single relation on the other side, and the dimension should be on + * the PK side. + * + * XXX loosely inspired by get_foreign_key_join_selectivity() + */ + if (!starjoin_match_to_foreign_key(root, rel)) + return false; + + /* + * XXX Maybe some additional checks here ... + */ + + return true; +} + +/* + * starjoin_adjust_joins + * Adjust the jointree for starjoins, to simplify the join order search. + * + * The join search for starjoin queries is surprisingly expensive, because + * there are very few join order restrictions. Consider a starjoin query + * + * SELECT * FROM f + * JOIN d1 ON (f.id1 = d1.id) + * JOIN d2 ON (f.id2 = d2.id) + * ... + * JOIN d9 ON (f.id9 = d9.id) + * + * There are no clauses between the dimension tables (d#), which means those + * tables can be joined in almost arbitrary order. This means the standard + * join_order_search() would explore a N! possible join orders. It is not + * that bad in practice, because we split the problem by from_collapse_limit + * into a sequence of smaller problems, but even for the default limit of + * 8 relations it's quite expensive. This can be easily demonstrated by + * setting from_collapse_limit=1 for example starjoin queries. + * + * The idea here is to apply a much simpler join order search for this type + * of queries, without too much risk of picking a much worse plans. It is + * however a trade off between how expensive we allow this to be, and how + * good the decisions will be. This can help only starjoins with multiple + * dimension tables, and we don't want to harm planning of other queries, + * so the basic "query shape" detection needs to be very cheap. And then + * it needs to be cheaper than the regular join order search. + * + * If a perfect check is impossible or too expensive, it's better to end + * up with a cheap false negative (i.e. and not use the optimization), + * rather than risk regressions in other cases. + * + * The simplified join order search relies on the fact that if the joins + * to dimensions do not alter the cardinality of the join relation, then + * the relative order of those joins does not matter. All the possible + * orders are guaranteed to perform the same. So we can simply pick one + * of those orders, and "hardcode" it in the join tree later passed to the + * join_order_search(). + * + * The query may involve joins to additional (non-dimension) tables, and + * those may alter cardinality. Some joins may increase it, other joins + * may decrease it. In principle, it'd be best to first perform all the + * joins that reduce join size, then join all the dimensions, and finally + * perform joins that may increase the join size. But this is not done + * now, currently we simply apply all the dimensions at the end, hoping + * that the earlier joins did not inflate the join too much. + * + * The definition of a dimension is a bit vague. For our definition see + * the comment at starjoin_is_dimension(). + * + * The optimization works by manipulating the joinlist (originally built + * by deconstruct_jointree), which decomposed the original jointree into + * smaller "problems" depending on join type and join_collapse_limit. We + * inspect those smaller lists, and selectively split them into smaller + * problems to force a join order. This may effectively undo some of the + * merging done by deconstruct_jointree(), which tries to build problems + * with up to join_collapse_limit relations. + * + * For example, imagine a join problem with 8 rels - one fact table and + * then 7 dimensions, which we can represent a joinlist with 8 elements. + * + * (D7, D6, D5, D4, D3, D2, D1, F) + * + * Assuming all those joins meet the requirements (have a matching FK, + * don't affect the join cardinality, ...), then we can split this into + * + * (D7, (D6, (D5, (D4, (D3, (D2, (D1, F))))))) + * + * which is a nested joinlist, with only two elements on each level. That + * means there's no need for expensive join order search, there's only + * one way to join the relations (two, if we consider the relations may + * switch sides). + * + * The joinlist may already be nested, with multiple smaller subproblems. + * We look at each individual join problem independently, i.e. we don't + * try to merge problems to build join_collapse_limit problems again. + * This is partially to keep it cheap/simple, but also so not change + * behavior for cases when people use join_collapse_limit to force some + * particular join shape. + * + * XXX A possible improvement is to allow handling snowflake joins, i.e. + * recursive dimensions. That would require a somewhat more complicated + * processing, because a dimension would be allowed other rels, as long + * as those are dimensions too. And we'd need to be more careful about + * the order in which join them to the top of the join. + * + * XXX One possible risk is that moving the dimension joins at the very + * end may move that after joins that increase the cardinality. Which + * may cause a regression. Such joins however don't seem very common, at + * least in regular starjoin queries. So maybe we could simply check if + * there are any such joins and disable this optimization. Is there a + * cheap way to identify that a join increases cardinality? + * + * XXX Ideally, we'd perform the dimension joins at the place with the + * lowest cardinality. Imagine a joinlist + * + * (D1, D2, A, B, F) + * + * Where A increases join cardinality, while B does not (possibly even + * reduces it). Ideally, we'd do the join like this + * + * (A, (D2, (D1, (B, F)))) + * + * so D1/D2 get joined at the point of "lowest cardinality". We probably + * don't want to do all this cardinality estimation work here, it'd copy + * what we already do in the join_order_search(). Perhaps we could invent + * a "join item" representing a join to all those dimensions, and pass it + * to join_order_search()? And let it pick the right place for it? It'd + * always join them in the same order, it'd not reorder them. It would + * still do the regular cardinality estimations etc. It would be trivial + * to disable the optimization if needed - don't collapse the dimensions + * into the new type of join item. + */ +List * +starjoin_adjust_joins(PlannerInfo *root, List *joinlist) +{ + ListCell *lc; + List *newlist = NIL; + List *dimensions = NIL; + + /* + * Do nothing if starjoin optimization not enabled / not applicable. + * + * XXX It might seems we can skip this for lists with <= 2 items, but + * that does not work, the elements may be nested list, and we need to + * descend into those. So do what remove_useless_self_joins() does, and + * only bail out for the simplest single-relation case (i.e. no joins). + */ + if (!enable_starjoin_join_search || joinlist == NIL || + (list_length(joinlist) == 1 && !IsA(linitial(joinlist), List))) + return joinlist; + + /* + * Process the current join problem - split the elements into dimensions + * and non-dimensions. If there are dimensions, add them back at the end, + * as small single-rel joins. + * + * The list may contain various types of elements. It may contain a list, + * which means it's an independent join search problem and we need to + * process it recursively. Or it may be RangeTblRef, in which case we + * need to check if it's a dimension. Other types of elements are just + * added back to the list as-is. + * + * XXX I think we need to be careful to keep the order of the list (for + * the non-dimension entries). The join_search_one_level() relies on + * that when handling join order restrictions. + * + * XXX It might be better to only create a new list if needed, i.e. once + * we find the first dimension. So that non-starjoin queries don't pay + * for something they don't need. A mutable iterator might be a way, but + * I'm not sure how expensive this really is. + */ + foreach (lc, joinlist) + { + Node *item = (Node *) lfirst(lc); + + /* a separate join search problem, handle it recursively */ + if (IsA(item, List)) + { + newlist = lappend(newlist, + starjoin_adjust_joins(root, (List *) item)); + continue; + } + + /* + * Non-RangeTblRef elements can't be considered a dimension (only + * baserels can have FK, etc.), so just add those to the list. + */ + if (!IsA(item, RangeTblRef)) + { + newlist = lappend(newlist, item); + continue; + } + + /* + * An entry representing a baserel. If it's a dimension, save it in + * a separate list, and we'll add it at the "top" of the join at the + * end. Otherwise add it to the list just like other elements. + */ + if (starjoin_is_dimension(root, (RangeTblRef *) item)) + { + dimensions = lappend(dimensions, item); + continue; + } + + /* not a dimension, add it to the list directly */ + newlist = lappend(newlist, item); + } + + /* + * If we found some dimensions, add them to the join tree one by one. + * The exact order does not matter, so we add them in the order we + * found them in the original list. + * + * We need to add them by creating smaller 2-element lists, with the rest + * of the list being a sub-problem and then adding the dimension + * table. This is how we force the desired join order. + */ + foreach (lc, dimensions) + { + newlist = list_make2(newlist, lfirst(lc)); + } + + return newlist; +} diff --git a/src/backend/optimizer/plan/planmain.c b/src/backend/optimizer/plan/planmain.c index 5467e094ca7..c75a5203aae 100644 --- a/src/backend/optimizer/plan/planmain.c +++ b/src/backend/optimizer/plan/planmain.c @@ -282,6 +282,16 @@ query_planner(PlannerInfo *root, */ distribute_row_identity_vars(root); + /* + * Try to simplify the join search problem for starjoin-like joins, with + * joins over FK relationships. The dimensions can be joined in almost + * any order, so the join search can be close to factorial complexity. + * But there's not much difference between such join orders, so we just + * leave the dimensions at the end of each group (as determined by the + * join_collapse_limit earlier). + */ + joinlist = starjoin_adjust_joins(root, joinlist); + /* * Ready to do the primary planning. */ diff --git a/src/backend/utils/misc/guc_parameters.dat b/src/backend/utils/misc/guc_parameters.dat index 6bc6be13d2a..bde4378527e 100644 --- a/src/backend/utils/misc/guc_parameters.dat +++ b/src/backend/utils/misc/guc_parameters.dat @@ -203,6 +203,13 @@ boot_val => 'true', }, +{ name => 'enable_starjoin_join_search', type => 'bool', context => 'PGC_USERSET', group => 'QUERY_TUNING_METHOD', + short_desc => 'Enables simplified join order planning for starjoins.', + flags => 'GUC_EXPLAIN', + variable => 'enable_starjoin_join_search', + boot_val => 'false', +}, + { name => 'geqo', type => 'bool', context => 'PGC_USERSET', group => 'QUERY_TUNING_GEQO', short_desc => 'Enables genetic query optimization.', long_desc => 'This algorithm attempts to do planning without exhaustive searching.', diff --git a/src/include/optimizer/planmain.h b/src/include/optimizer/planmain.h index 9d3debcab28..fee6c695d03 100644 --- a/src/include/optimizer/planmain.h +++ b/src/include/optimizer/planmain.h @@ -21,6 +21,7 @@ #define DEFAULT_CURSOR_TUPLE_FRACTION 0.1 extern PGDLLIMPORT double cursor_tuple_fraction; extern PGDLLIMPORT bool enable_self_join_elimination; +extern PGDLLIMPORT bool enable_starjoin_join_search; /* query_planner callback to compute query_pathkeys */ typedef void (*query_pathkeys_callback) (PlannerInfo *root, void *extra); @@ -119,6 +120,7 @@ extern bool innerrel_is_unique_ext(PlannerInfo *root, Relids joinrelids, JoinType jointype, List *restrictlist, bool force_cache, List **extra_clauses); extern List *remove_useless_self_joins(PlannerInfo *root, List *joinlist); +extern List *starjoin_adjust_joins(PlannerInfo *root, List *joinlist); /* * prototypes for plan/setrefs.c
pgsql-hackers by date: