diff --git a/contrib/pg_plan_advice/README b/contrib/pg_plan_advice/README index 0b888fd82f2..441bd79da50 100644 --- a/contrib/pg_plan_advice/README +++ b/contrib/pg_plan_advice/README @@ -141,7 +141,7 @@ side and some kind of join between t2 and t3 on the inner side, but without saying how that join must be performed or anything about which relation should appear on which side of the join, or even whether this kind of join has sides. -Join Strategy Advice +Join Method Advice ==================== Tags such as NESTED_LOOP_PLAIN specify the method that should be used to @@ -151,10 +151,10 @@ side of a plain nested loop (one without materialization or memoization) and that it should also put a join between the relation whose identifier is "y" and the relation whose identifier is "z" on the inner side of a nested loop. Hence, for an N-table join problem, there will be N-1 pieces of join -strategy advice; no join strategy advice is required for the outermost +method advice; no join method advice is required for the outermost table in the join problem. -Considering that we have both join order advice and join strategy advice, +Considering that we have both join order advice and join method advice, it might seem natural to say that NESTED_LOOP_PLAIN(x) should be redefined to mean that x should appear by itself on one side or the other of a nested loop, rather than specifically on the inner side, but this definition appears @@ -166,13 +166,13 @@ join, the two sides are treated very differently and saying that a certain relation should be involved in one of those operations without saying which role it should take isn't saying much. -This choice of definition implies that join strategy advice also imposes some +This choice of definition implies that join method advice also imposes some join order constraints. For example, given a join between foo and bar, HASH_JOIN(bar) implies that foo is the driving table. Otherwise, it would be impossible to put bar beneath the inner side of a Hash Join. Note that, given this definition, it's reasonable to consider deleting the -join order advice but applying the join strategy advice. For example, +join order advice but applying the join method advice. For example, consider a star schema with tables fact, dim1, dim2, dim3, dim4, and dim5. The automatically generated advice might specify JOIN_ORDER(fact dim1 dim3 dim4 dim2 dim5) HASH_JOIN(dim2 dim4) NESTED_LOOP_PLAIN(dim1 dim3 dim5). @@ -218,6 +218,9 @@ in the query, specifying just PARTITIONWISE((t1 t2)) would have the same effect, since there would be no other rels to which t3 could be joined in a partitionwise fashion. +Note that, regardless of what advice is specified, no partitionwise joins +will be possible if enable_partitionwise_join = off. + Parallel Query (Gather, etc.) ============================= diff --git a/doc/src/sgml/pgplanadvice.sgml b/doc/src/sgml/pgplanadvice.sgml index 817bafa21c6..305f3890ed8 100644 --- a/doc/src/sgml/pgplanadvice.sgml +++ b/doc/src/sgml/pgplanadvice.sgml @@ -12,12 +12,12 @@ to be described, reproduced, and altered using a special-purpose "plan advice" mini-language. It is intended to allow stabilization of plan choices that the user believes to be good, as well as experimentation with plans that - the planner believes to be non-optimal. + the planner believes to be suboptimal. Note that, since the planner often makes good decisions, overriding its - judgement can easily backfire. For example, if the distribution of the + judgment can easily backfire. For example, if the distribution of the underlying data changes, the planner normally has the option to adjust the plan in an attempt to preserve good performance. If the plan advice prevents this, a very poor plan may be chosen. It is important to use plan advice @@ -29,9 +29,8 @@ Getting Started - In order to use this module, the pg_plan_advice module - must be loaded. You can do this on a system-wide basis by adding - pg_plan_advice to + This module must first be loaded by adding + pg_plan_advice either to and restarting the server, or by adding it to and starting a new session, @@ -43,7 +42,7 @@ Once the pg_plan_advice module is loaded, EXPLAIN will support a PLAN_ADVICE option. You can use this option to see - a plan advice string for the chosen plan. For example: + the plan advice string for the chosen plan. For example: @@ -133,15 +132,15 @@ EXPLAIN (COSTS OFF) Plan advice is written imperatively; that is, it specifies what should be done. However, at an implementation level, - pg_plan_advice works by telling the core planner what + pg_plan_advice works by telling the planner what should not be done. In other words, it operates by constraining the planner's choices, not by replacing it. Therefore, no matter what advice - you provide, you will only ever get a plan that the core planner would have + you provide, you will only ever get a plan that the planner would have considered for the query in question. If you attempt to force what you believe to be the correct plan by supplying an advice string, and the planner still fails to produce the desired plan, this means that either there is a bug in your advice string, or the plan in question was not - considered viable by the core planner. This commonly happens for one of two + considered viable by the planner. This commonly happens for one of two reasons. First, it might be that the planner believes that the plan you're trying to force would be semantically incorrect - that is, it would produce the wrong results - and for that reason it wasn't considered. Second, it @@ -200,7 +199,7 @@ EXPLAIN (COSTS OFF) An advice target uniquely identifies a particular - instance of a particular relation involved in a particular query. In simple + instance of a particular relation involved in a particular (sub)plan. In simple cases, such as the examples shown above, the advice target is simply the relation alias. However, a more complex syntax is required when subqueries are used, when tables are partitioned, or when the same relation alias is @@ -237,8 +236,8 @@ alias_name#occurrence_number/partition_schema.partition_name@plan_name and uses the parent query's subplan name (or no subplan name, if pulled up to the top level). Furthermore, the correct subquery name is sometimes not obvious. For example, when two queries are joined using an operation such as - UNION or INTERSECT, no name for the - subqueries is present in the SQL syntax; instead, a system-generated name is + UNION or INTERSECT, no names for the + subqueries are present in the SQL syntax; instead, a system-generated name is assigned to each branch. The easiest way to discover the proper advice targets is to use EXPLAIN (PLAN_ADVICE) and examine the generated advice. @@ -362,7 +361,7 @@ Join Parenthesized sublists can be arbitrarily nested, but sublists surrounded - by curly braces cannot themselves contain sublists. + by curly braces cannot contain sublists. @@ -371,7 +370,7 @@ Join are multiple join problems that are optimized separately by the planner. This can happen due to the presence of subqueries, or because there is a partitionwise join. In the latter case, each branch of the partitionwise - join can have its own join order, independent of every other branch. + join can have its own join order. @@ -405,10 +404,11 @@ join_method_name(join_method_item [ ... ]) - Note that join method advice implies a negative join order constraint. + Note that join method advice necessarily constrains the available join orders. Since the named relation or relations must be on the inner side of a join using the specified method, none of them can be the driving table for the - entire join problem. Moreover, no relation inside the set should be joined + entire join problem (i.e., JOIN_ORDER(a b) and HASH_JOIN(a) are conflicting). + Moreover, no relation inside the set should be joined to any relation outside the set until all relations within the set have been joined to each other. For example, if the advice specifies HASH_JOIN((a b)) and the system begins by joining either @@ -416,7 +416,7 @@ join_method_name(join_method_item [ ... ]) plan could never be compliant with the request to put exactly those two relations on the inner side of a hash join. When using both join order advice and join method advice for the same query, it is a good idea to make - sure that they do not mandate incompatible join orders. + sure that they do not mandate conflicting join orders. @@ -432,13 +432,16 @@ PARTITIONWISE(partitionwise_item [ ... ]) ( advice_target [ ... ] ) } - When applied to a single target, PARTITIONWISE - specifies that the specified table should not be part of any partitionwise - join. When applied to a list of targets, PARTITIONWISE - specifies that exactly that set of relations should be joined in - partitionwise fashion. Note that, regardless of what advice is specified, - no partitionwise joins will be possible if - enable_partitionwise_join = off. + PARTITIONWISE() advise can be used to specify both those partitionwise joins + which should be performed and those which should not be performed; the idea + is that each argument to PARTITIONWISE specifies a set of relations that + should be scanned partitionwise after being joined to each other and nothing + else. Hence, for example, PARTITIONWISE((t1 t2) t3) specifies that the + query should contain a partitionwise join between t1 and t2 and that t3 + should not be part of any partitionwise join. If there are no other rels + in the query, specifying just PARTITIONWISE((t1 t2)) would have the same + effect, since there would be no other rels to which t3 could be joined in + a partitionwise fashion. @@ -446,8 +449,8 @@ PARTITIONWISE(partitionwise_item [ ... ]) Semijoin Uniqueness Advice -SEMIJOIN_UNIQUE(sj_unique_item [ ... ]) SEMIJOIN_NON_UNIQUE(sj_unique_item [ ... ]) +SEMIJOIN_UNIQUE(sj_unique_item [ ... ]) where sj_unique_item is: @@ -457,9 +460,9 @@ SEMIJOIN_NON_UNIQUE(sj_unique_item [ ... ]) The planner sometimes has a choice between implementing a semijoin directly and implementing a semijoin by making the nullable side unique - and then performing an inner join. SEMIJOIN_UNIQUE - specifies the latter strategy, while SEMIJOIN_NON_UNIQUE - specifies the former strategy. In either case, the argument is the single + and then performing an inner join. SEMIJOIN_NON_UNIQUE + specifies the former strategy, while SEMIJOIN_UNIQUE + specifies the latter strategy. In either case, the argument is the single relation or list of relations that appear beneath the nullable side of the join. @@ -545,11 +548,11 @@ NO_GATHER(advice_target [ ... ]) SET pg_plan_advice.advice = 'hash_join(f g) join_order(f g) index_scan(f no_such_index)'; -SET -rhaas=# EXPLAIN (COSTS OFF) SELECT * FROM jo_fact f - LEFT JOIN jo_dim1 d1 ON f.dim1_id = d1.id - LEFT JOIN jo_dim2 d2 ON f.dim2_id = d2.id - WHERE val1 = 1 AND val2 = 1; +EXPLAIN (COSTS OFF) + SELECT * FROM jo_fact f + LEFT JOIN jo_dim1 d1 ON f.dim1_id = d1.id + LEFT JOIN jo_dim2 d2 ON f.dim2_id = d2.id + WHERE val1 = 1 AND val2 = 1; QUERY PLAN ------------------------------------------------------------------- Hash Join @@ -577,12 +580,16 @@ rhaas=# EXPLAIN (COSTS OFF) on the inner side of a hash join is listed as not matched. The JOIN_ORDER advice tag involves one valid target and one invalid target, and so is listed as - partially matched. Note that - HASH_JOIN(f g) is actually a request for two logically - separate behaviors, whereas JOIN_ORDER(f g) is a single - request. When providing advice feedback, EXPLAIN shows - each logical request separately, together with all the feedback applicable - to that request type. + partially matched. + + + + Feedback advice is normalized into logical directives. Note, in the example, + that the HASH_JOIN(f g) join method advice was decomposed + into its two logically separate behaviors, HASH_JOIN(f) + and HASH_JOIN(g), in the feedback. Since + JOIN_ORDER(f g) is a single logical request it appears + as-is in the feedback. @@ -594,17 +601,17 @@ rhaas=# EXPLAIN (COSTS OFF) matched means that all of the specified advice targets - were observed together during query planning, at a time at which the + were observed together during query planning, at times when the advice could be enforced. - partially matched means that some but not all of the - specified advice targets were observed during query planning, or all - of the advice targets were observed but not together. For example, this - may happen if all the targets of JOIN_ORDER advice + partially matched means either that only some of the + advice targets were observed during query planning, or all + of the advice targets were observed but not at the right times. For example, + this may happen if all the targets of JOIN_ORDER advice individually match the query, but the proposed join order is not legal. @@ -663,6 +670,15 @@ rhaas=# EXPLAIN (COSTS OFF) partially matched, or not matched. + + Thus, in the example above, INDEX_SCAN(f no_such_index) matched because the + only named target in the advice, f, is present in the query. The attempt to + locate no_such_index failed, marking the advice inapplicable. Since the plan + did not include an index scan, but matched the target, the additional failed + marking is applied. Both partially matched and not matched imply a failed + outcome so the additional marking is omitted. + + @@ -719,8 +735,8 @@ rhaas=# EXPLAIN (COSTS OFF) pg_plan_advice.always_store_advice_details allows EXPLAIN to show details related to plan advice even - when prepared queries are used. The default value is - false. When planning a prepared query, it is not + when prepared queries are used. The default value is + false. When planning a prepared query, it is not possible to know whether EXPLAIN will later be used, so by default, to reduce overhead, pg_plan_advice will not generate plan advice or feedback on supplied advice. This means @@ -747,6 +763,9 @@ rhaas=# EXPLAIN (COSTS OFF) emits a warning whenever supplied plan advice is not successfully enforced. The default value is false. + + Success means that all advice feedback is marked as matched, and not failed. +