Re: Changing SQL Inlining Behaviour (or...?) - Mailing list pgsql-hackers
From | Paul Ramsey |
---|---|
Subject | Re: Changing SQL Inlining Behaviour (or...?) |
Date | |
Msg-id | 52427AA5-7A1B-4F22-8E0E-CCBC3AABDE82@cleverelephant.ca Whole thread Raw |
In response to | Re: Changing SQL Inlining Behaviour (or...?) (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: Changing SQL Inlining Behaviour (or...?)
(Tom Lane <tgl@sss.pgh.pa.us>)
|
List | pgsql-hackers |
> On Jan 19, 2019, at 12:25 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Adam Brightwell <adam.brightwell@crunchydata.com> writes: >> I've been working with Paul to create and test a patch (attached) that >> addresses Solution #2. This patch essentially modifies the inlining >> logic to compare the cost of the function with the total cost of the >> parameters. The goal as stated above, is that for these kinds of >> functions, they would be assigned relatively high cost to trigger the >> inlining case. > > I played around with this a bit. I think we could make the rule simpler > and more useful by expressing it as "if the total incremental cost from > multiple evaluation of parameters exceeds the inlinable function's > nominal cost, then don't inline". The existing rule is "no individual > param expression can cost more than 10*cpu_operator_cost, if the > function uses it more than once --- never mind how many times exactly". > With the default cost (100) for a SQL language function, this'd become > "the total incremental cost can't be more than 100*cpu_operator_cost". > So that's more or less in the same ballpark, though it will make some > different decisions in cases near the boundary, and probably will choose > to inline in more cases than before. But it's simpler to compute and > more defensible as a criterion, IMO. Is there a rule of thumb we can use in costing our wrappers to ensure that they always inline? P > inlining-solution-3a.patch, attached, does it like that, keeping the > existing rule that we flat out won't inline if there are any subplans > in multiply-evaluated parameters. That seems kind of ugly, though, > so I also prepared inlining-solution-3b.patch which eliminates the > cruft around subplans in this logic by moving the cruft into > cost_qual_eval instead: it has to make an arbitrary assumption about > the cost of a SubLink. > > I'm not really sure which of these I like better. The arbitrary cost > assumption in cost_qual_eval is not great, and it'd be really bad if > anyone ever tried to use costs for pre-SS_process_sublinks expressions > for something more important than the case at hand. But on the other > hand it seems nice to get rid of the arbitrary inlining choice. > > Thoughts? > > The other thing we need to consider is whether we need any documentation > adjustments. I believe that right now, the rules for inlining SQL > functions are not documented anywhere but the code, and maybe it's not > this patch's job to change that. > > regards, tom lane > > diff --git a/src/backend/optimizer/util/clauses.c b/src/backend/optimizer/util/clauses.c > index f0ef102..c99304d 100644 > *** a/src/backend/optimizer/util/clauses.c > --- b/src/backend/optimizer/util/clauses.c > *************** evaluate_function(Oid funcid, Oid result > *** 4645,4651 **** > * unreasonably expensive to evaluate. The expensiveness check not only > * prevents us from doing multiple evaluations of an expensive parameter > * at runtime, but is a safety value to limit growth of an expression due > ! * to repeated inlining. > * > * We must also beware of changing the volatility or strictness status of > * functions by inlining them. > --- 4645,4653 ---- > * unreasonably expensive to evaluate. The expensiveness check not only > * prevents us from doing multiple evaluations of an expensive parameter > * at runtime, but is a safety value to limit growth of an expression due > ! * to repeated inlining. (Currently, the expensiveness check is expressed > ! * as a limit on the total additional cost of extra parameter evaluations, > ! * rather than of any one parameter.) > * > * We must also beware of changing the volatility or strictness status of > * functions by inlining them. > *************** inline_function(Oid funcid, Oid result_t > *** 4682,4687 **** > --- 4684,4690 ---- > Query *querytree; > Node *newexpr; > int *usecounts; > + Cost delta_param_cost; > ListCell *arg; > int i; > > *************** inline_function(Oid funcid, Oid result_t > *** 4876,4882 **** > newexpr = substitute_actual_parameters(newexpr, funcform->pronargs, > args, usecounts); > > ! /* Now check for parameter usage */ > i = 0; > foreach(arg, args) > { > --- 4879,4889 ---- > newexpr = substitute_actual_parameters(newexpr, funcform->pronargs, > args, usecounts); > > ! /* > ! * Check for parameter usage, and estimate the delta in evaluation cost > ! * resulting from parameters possibly being evaluated more than once. > ! */ > ! delta_param_cost = 0.0; > i = 0; > foreach(arg, args) > { > *************** inline_function(Oid funcid, Oid result_t > *** 4887,4922 **** > /* Param not used at all: uncool if func is strict */ > if (funcform->proisstrict) > goto fail; > } > else if (usecounts[i] != 1) > { > ! /* Param used multiple times: uncool if expensive or volatile */ > QualCost eval_cost; > > /* > ! * We define "expensive" as "contains any subplan or more than 10 > ! * operators". Note that the subplan search has to be done > ! * explicitly, since cost_qual_eval() will barf on unplanned > ! * subselects. > */ > if (contain_subplans(param)) > goto fail; > - cost_qual_eval(&eval_cost, list_make1(param), NULL); > - if (eval_cost.startup + eval_cost.per_tuple > > - 10 * cpu_operator_cost) > - goto fail; > > /* > ! * Check volatility last since this is more expensive than the > ! * above tests > */ > ! if (contain_volatile_functions(param)) > ! goto fail; > } > i++; > } > > /* > * Whew --- we can make the substitution. Copy the modified expression > * out of the temporary memory context, and clean up. > */ > --- 4894,4949 ---- > /* Param not used at all: uncool if func is strict */ > if (funcform->proisstrict) > goto fail; > + > + /* > + * In principle, we should credit the eval cost of the param expr > + * as a savings from inlining. However, it's hard to deal with > + * subplans properly, so we don't bother. This case is unlikely > + * anyway, so we just try to be semantically correct. > + */ > } > else if (usecounts[i] != 1) > { > ! /* Param used multiple times: uncool if volatile */ > QualCost eval_cost; > > + if (contain_volatile_functions(param)) > + goto fail; > + > /* > ! * If the param contains any SubLinks, cost_qual_eval() will barf > ! * on it. Arbitrarily assume that the true cost of a subplan is > ! * high enough that we should not inline. > */ > if (contain_subplans(param)) > goto fail; > > /* > ! * Otherwise, estimate the evaluation cost of the param, and > ! * charge for the N-1 extra evaluations. Note that we consider > ! * only per-tuple cost not startup cost. Ideally we'd account for > ! * that somehow, but it's hard since we have as yet no idea how > ! * many times the query will execute this expression. > */ > ! cost_qual_eval_node(&eval_cost, param, context->root); > ! delta_param_cost += eval_cost.per_tuple * (usecounts[i] - 1); > } > i++; > } > > /* > + * If the incremental cost of multiple parameter evaluation exceeds the > + * nominal cost of the function we're inlining, don't inline. With the > + * default cost for a SQL-language function, this rule causes us to not > + * inline if multiple evaluation would add more than 100 times > + * cpu_operator_cost. However, function authors can adjust the claimed > + * cost of inlinable functions to encourage or discourage inlining in the > + * presence of multiple parameter usage. > + */ > + if (delta_param_cost > funcform->procost * cpu_operator_cost) > + goto fail; > + > + /* > * Whew --- we can make the substitution. Copy the modified expression > * out of the temporary memory context, and clean up. > */ > diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c > index 99c5ad9..eb311fa 100644 > *** a/src/backend/optimizer/path/costsize.c > --- b/src/backend/optimizer/path/costsize.c > *************** cost_qual_eval_walker(Node *node, cost_q > *** 3929,3936 **** > } > else if (IsA(node, SubLink)) > { > ! /* This routine should not be applied to un-planned expressions */ > ! elog(ERROR, "cannot handle unplanned sub-select"); > } > else if (IsA(node, SubPlan)) > { > --- 3929,3943 ---- > } > else if (IsA(node, SubLink)) > { > ! /* > ! * In normal usage we won't see un-planned SubLinks here, because they > ! * will previously have been converted to SubPlans or InitPlans. > ! * However, we can reach here due to inline_function() wishing to > ! * estimate costs for function parameter expressions. Arbitrarily > ! * return a fairly large cost to discourage inlining when that would > ! * cause multiple eval of a subplan. > ! */ > ! context->total.per_tuple += 100000 * cpu_operator_cost; > } > else if (IsA(node, SubPlan)) > { > diff --git a/src/backend/optimizer/util/clauses.c b/src/backend/optimizer/util/clauses.c > index f0ef102..84fdcc6 100644 > *** a/src/backend/optimizer/util/clauses.c > --- b/src/backend/optimizer/util/clauses.c > *************** evaluate_function(Oid funcid, Oid result > *** 4645,4651 **** > * unreasonably expensive to evaluate. The expensiveness check not only > * prevents us from doing multiple evaluations of an expensive parameter > * at runtime, but is a safety value to limit growth of an expression due > ! * to repeated inlining. > * > * We must also beware of changing the volatility or strictness status of > * functions by inlining them. > --- 4645,4653 ---- > * unreasonably expensive to evaluate. The expensiveness check not only > * prevents us from doing multiple evaluations of an expensive parameter > * at runtime, but is a safety value to limit growth of an expression due > ! * to repeated inlining. (Currently, the expensiveness check is expressed > ! * as a limit on the total additional cost of extra parameter evaluations, > ! * rather than of any one parameter.) > * > * We must also beware of changing the volatility or strictness status of > * functions by inlining them. > *************** inline_function(Oid funcid, Oid result_t > *** 4682,4687 **** > --- 4684,4690 ---- > Query *querytree; > Node *newexpr; > int *usecounts; > + Cost delta_param_cost; > ListCell *arg; > int i; > > *************** inline_function(Oid funcid, Oid result_t > *** 4876,4882 **** > newexpr = substitute_actual_parameters(newexpr, funcform->pronargs, > args, usecounts); > > ! /* Now check for parameter usage */ > i = 0; > foreach(arg, args) > { > --- 4879,4889 ---- > newexpr = substitute_actual_parameters(newexpr, funcform->pronargs, > args, usecounts); > > ! /* > ! * Check for parameter usage, and estimate the delta in evaluation cost > ! * resulting from parameters possibly being evaluated more than once. > ! */ > ! delta_param_cost = 0.0; > i = 0; > foreach(arg, args) > { > *************** inline_function(Oid funcid, Oid result_t > *** 4888,4922 **** > if (funcform->proisstrict) > goto fail; > } > ! else if (usecounts[i] != 1) > { > ! /* Param used multiple times: uncool if expensive or volatile */ > ! QualCost eval_cost; > ! > ! /* > ! * We define "expensive" as "contains any subplan or more than 10 > ! * operators". Note that the subplan search has to be done > ! * explicitly, since cost_qual_eval() will barf on unplanned > ! * subselects. > ! */ > ! if (contain_subplans(param)) > ! goto fail; > ! cost_qual_eval(&eval_cost, list_make1(param), NULL); > ! if (eval_cost.startup + eval_cost.per_tuple > > ! 10 * cpu_operator_cost) > goto fail; > > /* > ! * Check volatility last since this is more expensive than the > ! * above tests > */ > ! if (contain_volatile_functions(param)) > ! goto fail; > } > i++; > } > > /* > * Whew --- we can make the substitution. Copy the modified expression > * out of the temporary memory context, and clean up. > */ > --- 4895,4939 ---- > if (funcform->proisstrict) > goto fail; > } > ! else if (usecounts[i] > 1) > { > ! /* Param used multiple times: uncool if volatile */ > ! if (contain_volatile_functions(param)) > goto fail; > + } > > + if (usecounts[i] != 1) > + { > /* > ! * Estimate the evaluation cost of the param, and charge for the > ! * N-1 extra evaluations --- or, if it's not used at all, credit > ! * the savings. Note that we consider only per-tuple cost, not > ! * startup cost. Ideally we'd account for that somehow, but it's > ! * hard since we have as yet no idea how many times the query will > ! * execute this expression. > */ > ! QualCost eval_cost; > ! > ! cost_qual_eval_node(&eval_cost, param, context->root); > ! delta_param_cost += eval_cost.per_tuple * (usecounts[i] - 1); > } > + > i++; > } > > /* > + * If the incremental cost of multiple parameter evaluation exceeds the > + * nominal cost of the function we're inlining, don't inline. With the > + * default cost for a SQL-language function, this rule causes us to not > + * inline if multiple evaluation would add more than 100 times > + * cpu_operator_cost. However, function authors can adjust the claimed > + * cost of inlinable functions to encourage or discourage inlining in the > + * presence of multiple parameter usage. > + */ > + if (delta_param_cost > funcform->procost * cpu_operator_cost) > + goto fail; > + > + /* > * Whew --- we can make the substitution. Copy the modified expression > * out of the temporary memory context, and clean up. > */
pgsql-hackers by date: