Thread: Changing SQL Inlining Behaviour (or...?)
As I promised at PgConf.eu, here's a follow-up email about PostGIS parallelism and function inlining (an odd combination, it is true).
So, context:
- We want PostGIS to parallelize more. In order to achieve that, we need to mark our functions with more realistic COSTs. Much, much higher COSTs.
- When we do that, we hit a different problem. Our most commonly used functions, ST_Intersects() and ST_DWithin(), are actually SQL wrapper functions around more complex combinations of index operators and exact computational geometry functions (see the sketch after this list).
- In the presence of high-cost parameters that are used multiple times in SQL functions, PostgreSQL will stop inlining those functions, in an attempt to save the costs of double-calculating the parameters.
- For us, that's the wrong choice, because we lose the index operators at the same time as we "save" the cost of double calculation.
- We need our wrapper functions inlined, even when they are carrying a high COST.
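For illustration, here is roughly the shape of such a wrapper (a simplified sketch of the PostGIS definition, not the verbatim source; it assumes the PostGIS geometry type and && bounding-box operator are installed):

    -- Sketch of a PostGIS-style SQL wrapper: a lossy but indexable
    -- bounding-box operator (&&) ANDed with the exact geometry test.
    -- Note that both parameters are used twice in the body, which is
    -- what trips the inlining rules once the functions carry high COSTs.
    CREATE OR REPLACE FUNCTION ST_Intersects(geom1 geometry, geom2 geometry)
        RETURNS boolean
        LANGUAGE sql IMMUTABLE
        AS 'SELECT $1 && $2 AND _ST_Intersects($1, $2)';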
At pgconf.eu, I canvassed this problem and some potential solutions:
* Solution #1 - Quick and dirty and visible: Add an 'INLINE' function decorator, which tells PostgreSQL to just ignore costs and inline the function regardless (a hypothetical sketch follows this list). Pros: it's not too hard to implement and I'm happy to contribute this. Cons: it adds very specific single-purpose syntax to CREATE FUNCTION.
* Solution #2 - Quick and dirty and invisible. Tom suggested a hack that achieves the aims of #1 but without adding syntax to CREATE FUNCTION: have the inlining logic look at the cost of the wrapper and the cost of the parameters, and if the cost of the wrapper "greatly exceeded" the cost of the parameters, then inline. So the PostGIS project would just set the cost of our wrappers very high, and we'd get the behaviour we want, while other users who want to use wrappers to force caching of calculations would have zero-COSTed wrapper functions. Pros: Solves the problem and is easy to implement; I'm happy to contribute. Cons: it's so clearly a hack involving hidden (from users) magic.
* Solution #3 - Correct and globally helpful. When first presented with this problem last year, both Andres and Tom said [1] "but the right fix is to avoid the double-calculation of identical entries in the target list" because then it would be safe to inline functions with duplicate expensive parameters. This would not only address the proximate PostGIS problem but make a whole class of queries faster. There was some discussion of this approach last week [2]. Pros: The right thing! Improves a whole pile of other performance cases. Cons: Hard! Only experienced PgSQL developers need apply.
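To make Solution #1 concrete, a hypothetical sketch of the syntax (INLINE is not existing CREATE FUNCTION syntax; this is only what the proposed decorator might look like):

    -- Hypothetical only: the INLINE decorator proposed in Solution #1
    -- would force inlining regardless of the cost of the parameters.
    CREATE OR REPLACE FUNCTION ST_Intersects(geom1 geometry, geom2 geometry)
        RETURNS boolean
        LANGUAGE sql IMMUTABLE INLINE COST 10000
        AS 'SELECT $1 && $2 AND _ST_Intersects($1, $2)';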
Naturally, I would love to see #3 implemented, but there's only so much experienced developer time to go around, and it's beyond my current skill set. I would like to be able to start to improve PostGIS parallelism with PgSQL 12, so in order to make that not impossible, I'd like to implement either #1 or #2 in case #3 doesn't happen for PgSQL 12.
So my question to hackers is: which is less bad, #1 or #2, to implement and submit to commitfest, in case #3 does not materialize in time for PgSQL 12?
Thanks!
Paul
On Fri, Nov 9, 2018 at 11:14 AM Paul Ramsey <pramsey@cleverelephant.ca> wrote:
> Neglected to include the footnotes...
>
> ...
>
> So my question to hackers is: which is less bad, #1 or #2, to implement and submit to commitfest, in case #3 does not materialize in time for PgSQL 12?
Absent any preferences, I would be inclined to go with #2, having a high personal tolerance for ugly hacks... :)
P
I wonder if there is a middle ground between #2 and #3. A proper mechanism for deduplicating entries might be hard, but at the inlining stage we already know they're going to get duplicated. Can we make a subplan/lateral join/whatever for the arguments and deduplicate just them, since we already know about that duplication from the structure?
Another thing I see here is PostGIS using indexes. Can we just always inline if we see an index-accelerated operator (or just an operator) at the top level AND inside the inlined function?
Darafei Praliaskouski
Support me: http://patreon.com/komzpa
All,

> So, context:
> ...
> * Solution #2 - Quick and dirty and invisible. Tom suggested a hack that achieves the aims of #1 but without adding syntax to CREATE FUNCTION: have the inlining logic look at the cost of the wrapper and the cost of the parameters, and if the cost of the wrapper "greatly exceeded" the cost of the parameters, then inline. So the PostGIS project would just set the cost of our wrappers very high, and we'd get the behaviour we want, while other users who want to use wrappers to force caching of calculations would have zero-COSTed wrapper functions. Pros: Solves the problem and is easy to implement; I'm happy to contribute. Cons: it's so clearly a hack involving hidden (from users) magic.
> ...
> So my question to hackers is: which is less bad, #1 or #2, to implement and submit to commitfest, in case #3 does not materialize in time for PgSQL 12?

I've been working with Paul to create and test a patch (attached) that addresses Solution #2. This patch essentially modifies the inlining logic to compare the cost of the function with the total cost of the parameters. The goal, as stated above, is that these kinds of functions would be assigned a relatively high cost to trigger the inlining case.

The modification that this patch makes is the following:

* Collect the cost for each parameter (no longer short-circuit when a single parameter is costly).
* Compare the total cost of all parameters to the individual cost of the function.
* If the function cost is greater than the parameter cost, then inline the function.
* If the function cost is less than the parameter cost, then perform the original parameter checks (short-circuiting as necessary).

I've included a constant called "INLINE_FUNC_COST_FACTOR" that is currently set to '1'. This is meant to allow for adjustments to what "greatly exceeded" means in the description above. Perhaps this isn't necessary, but I wanted to at least initially provide that flexibility in case it is.

I have also attached a simple test case, as originally provided by Paul, to demonstrate the functionality.

-Adam
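For concreteness, a minimal sketch of the kind of test this enables (hypothetical function names invented here; Adam's actual attachment is not reproduced):

    -- An expensive argument expression...
    CREATE FUNCTION expensive(float8) RETURNS float8
        LANGUAGE sql IMMUTABLE COST 10000
        AS 'SELECT $1 + 1';

    -- ...passed to a wrapper that uses its parameter twice. The old rule
    -- refuses to inline; with the patch, the wrapper's much higher COST
    -- outweighs the cost of re-evaluating the parameter, so it inlines.
    CREATE FUNCTION wrapper(float8) RETURNS boolean
        LANGUAGE sql IMMUTABLE COST 100000
        AS 'SELECT $1 > 0.0 AND $1 < 1000000.0';

    -- If inlining happened, EXPLAIN VERBOSE shows the expanded expression
    -- rather than a call to wrapper().
    EXPLAIN (VERBOSE, COSTS OFF)
    SELECT * FROM generate_series(1, 10) g(x) WHERE wrapper(expensive(x));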
On Mon, Dec 31, 2018 at 2:23 PM Adam Brightwell <adam.brightwell@crunchydata.com> wrote:
> * Solution #2 - Quick and dirty and invisible. Tom suggested a hack that achieves the aims of #1 but without adding syntax to CREATE FUNCTION: have the inlining logic look at the cost of the wrapper and the cost of the parameters, and if the cost of the wrapper "greatly exceeded" the cost of the parameters, then inline. So the PostGIS project would just set the cost of our wrappers very high, and we'd get the behaviour we want, while other users who want to use wrappers to force caching of calculations would have zero-COSTed wrapper functions. Pros: Solves the problem and is easy to implement; I'm happy to contribute. Cons: it's so clearly a hack involving hidden (from users) magic.
> ...
> So my question to hackers is: which is less bad, #1 or #2, to implement and submit to commitfest, in case #3 does not materialize in time for PgSQL 12?
> I've been working with Paul to create and test a patch (attached) that
> addresses Solution #2. This patch essentially modifies the inlining
> logic to compare the cost of the function with the total cost of the
> parameters. The goal, as stated above, is that these kinds of
> functions would be assigned a relatively high cost to trigger the
> inlining case.
I've tested this patch for both the internal types case and for the PostGIS functions case, and in both cases it provides a way to get around the undesirable inlining behaviour for our case without impacting people for whom the behaviour has been desirable in the past.
Is this already in the commitfest?
P.
>> > * Solution #2 - Quick and dirty and invisible. ...
>> > ...
>> > So my question to hackers is: which is less bad, #1 or #2, to implement and submit to commitfest, in case #3 does not materialize in time for PgSQL 12?
>>
>> I've been working with Paul to create and test a patch (attached) that
>> addresses Solution #2. This patch essentially modifies the inlining
>> logic to compare the cost of the function with the total cost of the
>> parameters. The goal, as stated above, is that these kinds of
>> functions would be assigned a relatively high cost to trigger the
>> inlining case.
>
> I've tested this patch for both the internal types case and for the PostGIS functions case, and in both cases it provides a way to get around the undesirable inlining behaviour for our case without impacting people for whom the behaviour has been desirable in the past.
>
> Is this already in the commitfest?

Yes.

https://commitfest.postgresql.org/21/1965/

-Adam
Adam Brightwell <adam.brightwell@crunchydata.com> writes:
> I've been working with Paul to create and test a patch (attached) that
> addresses Solution #2. This patch essentially modifies the inlining
> logic to compare the cost of the function with the total cost of the
> parameters. The goal, as stated above, is that these kinds of
> functions would be assigned a relatively high cost to trigger the
> inlining case.

I played around with this a bit. I think we could make the rule simpler and more useful by expressing it as "if the total incremental cost from multiple evaluation of parameters exceeds the inlinable function's nominal cost, then don't inline". The existing rule is "no individual param expression can cost more than 10*cpu_operator_cost, if the function uses it more than once --- never mind how many times exactly". With the default cost (100) for a SQL language function, this'd become "the total incremental cost can't be more than 100*cpu_operator_cost". So that's more or less in the same ballpark, though it will make some different decisions in cases near the boundary, and probably will choose to inline in more cases than before. But it's simpler to compute and more defensible as a criterion, IMO.

inlining-solution-3a.patch, attached, does it like that, keeping the existing rule that we flat out won't inline if there are any subplans in multiply-evaluated parameters. That seems kind of ugly, though, so I also prepared inlining-solution-3b.patch, which eliminates the cruft around subplans in this logic by moving the cruft into cost_qual_eval instead: it has to make an arbitrary assumption about the cost of a SubLink.

I'm not really sure which of these I like better. The arbitrary cost assumption in cost_qual_eval is not great, and it'd be really bad if anyone ever tried to use costs for pre-SS_process_sublinks expressions for something more important than the case at hand. But on the other hand it seems nice to get rid of the arbitrary inlining choice.

Thoughts?

The other thing we need to consider is whether we need any documentation adjustments. I believe that right now, the rules for inlining SQL functions are not documented anywhere but the code, and maybe it's not this patch's job to change that.

			regards, tom lane

[inlining-solution-3a.patch]

diff --git a/src/backend/optimizer/util/clauses.c b/src/backend/optimizer/util/clauses.c
index f0ef102..c99304d 100644
*** a/src/backend/optimizer/util/clauses.c
--- b/src/backend/optimizer/util/clauses.c
*************** evaluate_function(Oid funcid, Oid result
*** 4645,4651 ****
   * unreasonably expensive to evaluate.  The expensiveness check not only
   * prevents us from doing multiple evaluations of an expensive parameter
   * at runtime, but is a safety value to limit growth of an expression due
! * to repeated inlining.
   *
   * We must also beware of changing the volatility or strictness status of
   * functions by inlining them.
--- 4645,4653 ----
   * unreasonably expensive to evaluate.  The expensiveness check not only
   * prevents us from doing multiple evaluations of an expensive parameter
   * at runtime, but is a safety value to limit growth of an expression due
! * to repeated inlining.  (Currently, the expensiveness check is expressed
! * as a limit on the total additional cost of extra parameter evaluations,
! * rather than of any one parameter.)
   *
   * We must also beware of changing the volatility or strictness status of
   * functions by inlining them.
*************** inline_function(Oid funcid, Oid result_t
*** 4682,4687 ****
--- 4684,4690 ----
      Query      *querytree;
      Node       *newexpr;
      int        *usecounts;
+     Cost        delta_param_cost;
      ListCell   *arg;
      int         i;
*************** inline_function(Oid funcid, Oid result_t
*** 4876,4882 ****
      newexpr = substitute_actual_parameters(newexpr, funcform->pronargs,
                                             args, usecounts);

!     /* Now check for parameter usage */
      i = 0;
      foreach(arg, args)
      {
--- 4879,4889 ----
      newexpr = substitute_actual_parameters(newexpr, funcform->pronargs,
                                             args, usecounts);

!     /*
!      * Check for parameter usage, and estimate the delta in evaluation cost
!      * resulting from parameters possibly being evaluated more than once.
!      */
!     delta_param_cost = 0.0;
      i = 0;
      foreach(arg, args)
      {
*************** inline_function(Oid funcid, Oid result_t
*** 4887,4922 ****
              /* Param not used at all: uncool if func is strict */
              if (funcform->proisstrict)
                  goto fail;
          }
          else if (usecounts[i] != 1)
          {
!             /* Param used multiple times: uncool if expensive or volatile */
              QualCost    eval_cost;

              /*
!              * We define "expensive" as "contains any subplan or more than 10
!              * operators".  Note that the subplan search has to be done
!              * explicitly, since cost_qual_eval() will barf on unplanned
!              * subselects.
              */
              if (contain_subplans(param))
                  goto fail;
-             cost_qual_eval(&eval_cost, list_make1(param), NULL);
-             if (eval_cost.startup + eval_cost.per_tuple >
-                 10 * cpu_operator_cost)
-                 goto fail;

              /*
!              * Check volatility last since this is more expensive than the
!              * above tests
              */
!             if (contain_volatile_functions(param))
!                 goto fail;
          }
          i++;
      }

      /*
       * Whew --- we can make the substitution.  Copy the modified expression
       * out of the temporary memory context, and clean up.
       */
--- 4894,4949 ----
              /* Param not used at all: uncool if func is strict */
              if (funcform->proisstrict)
                  goto fail;
+
+             /*
+              * In principle, we should credit the eval cost of the param expr
+              * as a savings from inlining.  However, it's hard to deal with
+              * subplans properly, so we don't bother.  This case is unlikely
+              * anyway, so we just try to be semantically correct.
+              */
          }
          else if (usecounts[i] != 1)
          {
!             /* Param used multiple times: uncool if volatile */
              QualCost    eval_cost;

+             if (contain_volatile_functions(param))
+                 goto fail;
+
              /*
!              * If the param contains any SubLinks, cost_qual_eval() will barf
!              * on it.  Arbitrarily assume that the true cost of a subplan is
!              * high enough that we should not inline.
              */
              if (contain_subplans(param))
                  goto fail;

              /*
!              * Otherwise, estimate the evaluation cost of the param, and
!              * charge for the N-1 extra evaluations.  Note that we consider
!              * only per-tuple cost not startup cost.  Ideally we'd account for
!              * that somehow, but it's hard since we have as yet no idea how
!              * many times the query will execute this expression.
              */
!             cost_qual_eval_node(&eval_cost, param, context->root);
!             delta_param_cost += eval_cost.per_tuple * (usecounts[i] - 1);
          }
          i++;
      }

      /*
+      * If the incremental cost of multiple parameter evaluation exceeds the
+      * nominal cost of the function we're inlining, don't inline.  With the
+      * default cost for a SQL-language function, this rule causes us to not
+      * inline if multiple evaluation would add more than 100 times
+      * cpu_operator_cost.  However, function authors can adjust the claimed
+      * cost of inlinable functions to encourage or discourage inlining in the
+      * presence of multiple parameter usage.
+      */
+     if (delta_param_cost > funcform->procost * cpu_operator_cost)
+         goto fail;
+
+     /*
       * Whew --- we can make the substitution.  Copy the modified expression
       * out of the temporary memory context, and clean up.
       */

[inlining-solution-3b.patch]

diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index 99c5ad9..eb311fa 100644
*** a/src/backend/optimizer/path/costsize.c
--- b/src/backend/optimizer/path/costsize.c
*************** cost_qual_eval_walker(Node *node, cost_q
*** 3929,3936 ****
      }
      else if (IsA(node, SubLink))
      {
!         /* This routine should not be applied to un-planned expressions */
!         elog(ERROR, "cannot handle unplanned sub-select");
      }
      else if (IsA(node, SubPlan))
      {
--- 3929,3943 ----
      }
      else if (IsA(node, SubLink))
      {
!         /*
!          * In normal usage we won't see un-planned SubLinks here, because they
!          * will previously have been converted to SubPlans or InitPlans.
!          * However, we can reach here due to inline_function() wishing to
!          * estimate costs for function parameter expressions.  Arbitrarily
!          * return a fairly large cost to discourage inlining when that would
!          * cause multiple eval of a subplan.
!          */
!         context->total.per_tuple += 100000 * cpu_operator_cost;
      }
      else if (IsA(node, SubPlan))
      {
diff --git a/src/backend/optimizer/util/clauses.c b/src/backend/optimizer/util/clauses.c
index f0ef102..84fdcc6 100644
*** a/src/backend/optimizer/util/clauses.c
--- b/src/backend/optimizer/util/clauses.c
*************** evaluate_function(Oid funcid, Oid result
*** 4645,4651 ****
   * unreasonably expensive to evaluate.  The expensiveness check not only
   * prevents us from doing multiple evaluations of an expensive parameter
   * at runtime, but is a safety value to limit growth of an expression due
! * to repeated inlining.
   *
   * We must also beware of changing the volatility or strictness status of
   * functions by inlining them.
--- 4645,4653 ----
   * unreasonably expensive to evaluate.  The expensiveness check not only
   * prevents us from doing multiple evaluations of an expensive parameter
   * at runtime, but is a safety value to limit growth of an expression due
! * to repeated inlining.  (Currently, the expensiveness check is expressed
! * as a limit on the total additional cost of extra parameter evaluations,
! * rather than of any one parameter.)
   *
   * We must also beware of changing the volatility or strictness status of
   * functions by inlining them.
*************** inline_function(Oid funcid, Oid result_t
*** 4682,4687 ****
--- 4684,4690 ----
      Query      *querytree;
      Node       *newexpr;
      int        *usecounts;
+     Cost        delta_param_cost;
      ListCell   *arg;
      int         i;
*************** inline_function(Oid funcid, Oid result_t
*** 4876,4882 ****
      newexpr = substitute_actual_parameters(newexpr, funcform->pronargs,
                                             args, usecounts);

!     /* Now check for parameter usage */
      i = 0;
      foreach(arg, args)
      {
--- 4879,4889 ----
      newexpr = substitute_actual_parameters(newexpr, funcform->pronargs,
                                             args, usecounts);

!     /*
!      * Check for parameter usage, and estimate the delta in evaluation cost
!      * resulting from parameters possibly being evaluated more than once.
!      */
!     delta_param_cost = 0.0;
      i = 0;
      foreach(arg, args)
      {
*************** inline_function(Oid funcid, Oid result_t
*** 4888,4922 ****
              if (funcform->proisstrict)
                  goto fail;
          }
!         else if (usecounts[i] != 1)
          {
!             /* Param used multiple times: uncool if expensive or volatile */
!             QualCost    eval_cost;
!
!             /*
!              * We define "expensive" as "contains any subplan or more than 10
!              * operators".  Note that the subplan search has to be done
!              * explicitly, since cost_qual_eval() will barf on unplanned
!              * subselects.
!              */
!             if (contain_subplans(param))
!                 goto fail;
!             cost_qual_eval(&eval_cost, list_make1(param), NULL);
!             if (eval_cost.startup + eval_cost.per_tuple >
!                 10 * cpu_operator_cost)
                  goto fail;

              /*
!              * Check volatility last since this is more expensive than the
!              * above tests
              */
!             if (contain_volatile_functions(param))
!                 goto fail;
          }
          i++;
      }

      /*
       * Whew --- we can make the substitution.  Copy the modified expression
       * out of the temporary memory context, and clean up.
       */
--- 4895,4939 ----
              if (funcform->proisstrict)
                  goto fail;
          }
!         else if (usecounts[i] > 1)
          {
!             /* Param used multiple times: uncool if volatile */
!             if (contain_volatile_functions(param))
                  goto fail;
+         }
+
+         if (usecounts[i] != 1)
+         {
              /*
!              * Estimate the evaluation cost of the param, and charge for the
!              * N-1 extra evaluations --- or, if it's not used at all, credit
!              * the savings.  Note that we consider only per-tuple cost, not
!              * startup cost.  Ideally we'd account for that somehow, but it's
!              * hard since we have as yet no idea how many times the query will
!              * execute this expression.
              */
!             QualCost    eval_cost;
!
!             cost_qual_eval_node(&eval_cost, param, context->root);
!             delta_param_cost += eval_cost.per_tuple * (usecounts[i] - 1);
          }
+
          i++;
      }

      /*
+      * If the incremental cost of multiple parameter evaluation exceeds the
+      * nominal cost of the function we're inlining, don't inline.  With the
+      * default cost for a SQL-language function, this rule causes us to not
+      * inline if multiple evaluation would add more than 100 times
+      * cpu_operator_cost.  However, function authors can adjust the claimed
+      * cost of inlinable functions to encourage or discourage inlining in the
+      * presence of multiple parameter usage.
+      */
+     if (delta_param_cost > funcform->procost * cpu_operator_cost)
+         goto fail;
+
+     /*
       * Whew --- we can make the substitution.  Copy the modified expression
       * out of the temporary memory context, and clean up.
       */
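In SQL terms, the proposed rule works out roughly as follows (an illustrative sketch with invented function names, not taken from the patch's tests):

    -- f() uses its parameter twice, so inlining f(expr) evaluates expr one
    -- extra time. With the default COST of 100 for a SQL function, the
    -- patched rule inlines unless that extra evaluation is estimated to
    -- cost more than 100 * cpu_operator_cost.
    CREATE FUNCTION f(float8) RETURNS boolean
        LANGUAGE sql IMMUTABLE
        AS 'SELECT $1 > 0.0 AND $1 < 1.0';

    -- A function author can claim a higher cost to make the planner more
    -- willing to inline even when the argument expression is expensive:
    ALTER FUNCTION f(float8) COST 10000;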
> On Jan 19, 2019, at 12:25 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Adam Brightwell <adam.brightwell@crunchydata.com> writes:
>> I've been working with Paul to create and test a patch (attached) that
>> addresses Solution #2. ...
>
> I played around with this a bit. I think we could make the rule simpler
> and more useful by expressing it as "if the total incremental cost from
> multiple evaluation of parameters exceeds the inlinable function's
> nominal cost, then don't inline". The existing rule is "no individual
> param expression can cost more than 10*cpu_operator_cost, if the
> function uses it more than once --- never mind how many times exactly".
> With the default cost (100) for a SQL language function, this'd become
> "the total incremental cost can't be more than 100*cpu_operator_cost".
> So that's more or less in the same ballpark, though it will make some
> different decisions in cases near the boundary, and probably will choose
> to inline in more cases than before. But it's simpler to compute and
> more defensible as a criterion, IMO.

Is there a rule of thumb we can use in costing our wrappers to ensure that they always inline?

P

> ...
Paul Ramsey <pramsey@cleverelephant.ca> writes:
> Is there a rule of thumb we can use in costing our wrappers to ensure that they always inline?

Not really, which is a weak spot of this whole approach. In particular, if you just make procost really big, you risk screwing things up rather badly in cases where inlining fails for whatever reason. Also I'm slightly concerned that we'd end up with an "arms race" where different functions are all trying to out-cost each other.

I don't know whether you've actually tried to use a fix along this line? It might be good to experiment with whether you can actually improve matters for PostGIS before we commit to supporting this hack.

			regards, tom lane
>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes: Tom> The other thing we need to consider is whether we need any Tom> documentation adjustments. I believe that right now, the rules for Tom> inlining SQL functions are not documented anywhere but the code, That is correct, though we got so tired of explaining it on IRC that there is a wiki page (though it hasn't been updated since 9.5, but I'm pretty sure there aren't any significant changes needed): https://wiki.postgresql.org/wiki/Inlining_of_SQL_functions -- Andrew (irc:RhodiumToad)
I wrote:
> Paul Ramsey <pramsey@cleverelephant.ca> writes:
>> Is there a rule of thumb we can use in costing our wrappers to ensure that they always inline?

> Not really, which is a weak spot of this whole approach.

BTW, it suddenly strikes me that at least for the specific cases you've shown in this thread, worrying about forcing inlining despite multiple parameter occurrences is solving the wrong problem. As I understand it, what you're doing is basically expanding some_operation(x,y) into

    x indexable_operator y AND exact_condition(x,y)

where exact_condition(x,y) is semantically identical to some_operation(x,y), and the point of the maneuver is to inject an indexable operator that implements some lossy form of the exact_condition. But this is not an ideal solution: aside from the possible cost of evaluating x and y twice, adding the indexable_operator is just a dead loss if you don't end up with an indexscan on x.

So ISTM what we *really* want is that the wrapper function is just an alias for exact_condition() --- which eliminates the multiple-evaluation hazards and hence the risk of not inlining --- and then teach the planner how to extract the indexable_operator when, and only when, it's actually useful for an indexscan.

The machinery for that sort of thing already exists; it's what indxpath.c calls a "special index operator". But right now it's only applied to some built-in operators such as LIKE. I've had a personal TODO item for a very long time to make that ability accessible to extensions, but it never rose to the top of the queue. Maybe we should be trying to make that happen, instead of messing around with dubious changes to the inlining rules.

Does this line of thought seem like it would fix your problem, or are there places where the inlining rules are causing you issues but the use-case doesn't look like "add a lossy indexable operator"?

			regards, tom lane
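The LIKE behaviour Tom refers to can be observed directly; a sketch (table and index are invented here, the exact operators shown depend on the operator class and collation, and the plan assumes the index is chosen):

    CREATE TABLE t (name text);
    CREATE INDEX ON t (name text_pattern_ops);

    -- The planner derives lossy-but-indexable range clauses from the
    -- prefix pattern, while keeping the exact LIKE test as a filter:
    EXPLAIN SELECT * FROM t WHERE name LIKE 'abc%';
    --   Index Cond: ((name ~>=~ 'abc'::text) AND (name ~<~ 'abd'::text))
    --   Filter: (name ~~ 'abc%'::text)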
> On Jan 19, 2019, at 5:53 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> ...
>
> Does this line of thought seem like it would fix your problem,
> or are there places where the inlining rules are causing you
> issues but the use-case doesn't look like "add a lossy indexable
> operator"?

While I'm not 100% sure what this new thing would "look like", there are at least a few quirky variations on the pattern you've correctly identified. So, ST_Intersects(a, b) and ST_Contains(a, b) both match the pattern.

One that is similar but not quite matching is ST_DWithin(a, b, radius), which expands to

    a && expand(b, radius) AND b && expand(a, radius) AND _ST_DWithin(a, b, radius)

so the operator call is doubled up, and includes a function call on one operand rather than a bare operand.

There may be a few other non-standard pattern inlines lying around, but that's the most commonly used one.

P
Paul Ramsey <pramsey@cleverelephant.ca> writes:
> On Jan 19, 2019, at 5:53 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> [ maybe what we need is special index operators for extensions ]

> While I'm not 100% sure what this new thing would "look like", there are at least a few quirky variations on the pattern you've correctly identified. So, ST_Intersects(a, b) and ST_Contains(a, b) both match the pattern.

> One that is similar but not quite matching is ST_DWithin(a, b, radius), which expands to
>
>     a && expand(b, radius) AND b && expand(a, radius) AND _ST_DWithin(a, b, radius)
>
> so the operator call is doubled up, and includes a function call on one operand rather than a bare operand.

Hm. That seems like it could be made to fit the pattern. The basic charter of the transform function I have in mind would be "given this function call, produce an indexable clause to apply to this index". Your example suggests that we might sometimes want to produce more than one indexable clause, which is an interesting point but not hard to accommodate --- just have the function return a list rather than necessarily a single clause. The business with expand() seems like something the transform function could easily deal with.

I'll try to sketch up a more concrete plan soon.

			regards, tom lane
I wrote:
> I'll try to sketch up a more concrete plan soon.

I've posted some preliminary design ideas at

https://www.postgresql.org/message-id/15193.1548028093@sss.pgh.pa.us
and
https://www.postgresql.org/message-id/15289.1548028233@sss.pgh.pa.us

While there's a nontrivial amount of work needed to make that happen, I think it's doable, and it would lead to a significantly better solution than proceeding along the inlining path could do. My current feeling, therefore, is that we should reject this patch (or at least stick it in the deep freeze) and go work on that plan.

			regards, tom lane
Hi,

On 2019-01-19 20:53:55 -0500, Tom Lane wrote:
> BTW, it suddenly strikes me that at least for the specific cases you've
> shown in this thread, worrying about forcing inlining despite multiple
> parameter occurrences is solving the wrong problem. As I understand it,
> what you're doing is basically expanding some_operation(x,y) into
>
>     x indexable_operator y AND exact_condition(x,y)
>
> ...
>
> Does this line of thought seem like it would fix your problem,
> or are there places where the inlining rules are causing you
> issues but the use-case doesn't look like "add a lossy indexable
> operator"?

Paul, is what Tom calls indexable_operator here just useful for indexability, or *also* useful to reduce the frequency of invoking the exact_condition(), which presumably will frequently be much more expensive to evaluate? With the current scheme, as long as it works, the indexable fastpath filters out rows quickly for both sequential and index scans, but I'm not sure that'd work in the transformation scheme Tom proposes. Sure, the exact_condition could always do the cheaper pre-check, but that'd be wasted effort in the index scan case, where that'd presumably already be ruled out.

Actually an issue, or not?

Greetings,

Andres Freund
> On Jan 21, 2019, at 3:17 PM, Andres Freund <andres@anarazel.de> wrote:
>
> Hi,
>
> On 2019-01-19 20:53:55 -0500, Tom Lane wrote:
>> I wrote:
>>> Paul Ramsey <pramsey@cleverelephant.ca> writes:
>>>> Is there a rule of thumb we can use in costing our wrappers to
>>>> ensure that they always inline?
>>
>>> Not really, which is a weak spot of this whole approach.
>>
>> BTW, it suddenly strikes me that at least for the specific cases you've
>> shown in this thread, worrying about forcing inlining despite multiple
>> parameter occurrences is solving the wrong problem.  As I understand it,
>> what you're doing is basically expanding some_operation(x,y) into
>>
>> x indexable_operator y AND exact_condition(x,y)
>>
>> where exact_condition(x,y) is semantically identical to
>> some_operation(x,y), and the point of the maneuver is to inject
>> an indexable operator that implements some lossy form of the
>> exact_condition.  But this is not an ideal solution: aside
>> from the possible cost of evaluating x and y twice, adding
>> the indexable_operator is just a dead loss if you don't end up
>> with an indexscan on x.
>>
>> So ISTM what we *really* want is that the wrapper function is
>> just an alias for exact_condition() --- which eliminates the
>> multiple-evaluation hazards and hence the risk of not inlining ---
>> and then teach the planner how to extract the indexable_operator
>> when, and only when, it's actually useful for an indexscan.
>>
>> The machinery for that sort of thing already exists; it's what
>> indxpath.c calls a "special index operator".  But right now it's
>> only applied to some built-in operators such as LIKE.  I've had
>> a personal TODO item for a very long time to make that ability
>> accessible to extensions, but it never rose to the top of the
>> queue.  Maybe we should be trying to make that happen, instead
>> of messing around with dubious changes to the inlining rules.
>>
>> Does this line of thought seem like it would fix your problem,
>> or are there places where the inlining rules are causing you
>> issues but the use-case doesn't look like "add a lossy indexable
>> operator"?
>
> Paul, is what Tom calls indexable_operator here just useful for
> indexability, or *also* useful to reduce the frequency of invoking the
> exact_condition(), which presumably will frequently be much more
> expensive to evaluate?  With the current scheme, as long as it works,
> the indexable fastpath filters out rows quickly for both sequential and
> index scans, but I'm not sure that'd work in the transformation scheme
> Tom proposes.  Sure, the exact_condition could always do the cheaper
> pre-check, but that'd be wasted effort in the index scan case, where
> that'd presumably already be ruled out.
>
> Actually an issue, or not?
As a practical matter, most of the exact-test functions have a preamble
that checks the bbox, so in the seqscan case having the operator along
for the ride isn’t any advantage.  In any event, if we do have exact
tests w/o a lossy preamble, we could add that for v12, as this
renovation won’t be a small one if we go this direction.

My only concern, as a plebian, is that what we have, hacky or otherwise,
works, and the quick’n’dirty solution also works, and the new hotness
seems quite complex, relatively speaking.  I guess the new hotness also
gets us the ability to do selectivity at a function level, which is
something we are also interested in for the future, so there’s that,
but … I’d just be happy to solve the hacky inline problem :)
P
> Greetings,
>
> Andres Freund
Hi,

On 2019-01-21 15:21:29 -0800, Paul Ramsey wrote:
> As a practical matter, most of the exact-test functions have a
> preamble that checks the bbox, so in the seqscan case having the
> operator along for the ride isn’t any advantage. In any event, if we
> do have exact tests w/o a lossy preamble, we could add that for v12,
> as this renovation won’t be a small one if we go this direction.

How expensive are the bbox checks in comparison to the exact tests?
IOW, how much of a problem is it to potentially do a bbox check twice?

Greetings,

Andres Freund
> On Jan 21, 2019, at 3:27 PM, Andres Freund <andres@anarazel.de> wrote:
>
> Hi,
>
> On 2019-01-21 15:21:29 -0800, Paul Ramsey wrote:
>> As a practical matter, most of the exact-test functions have a
>> preamble that checks the bbox, so in the seqscan case having the
>> operator along for the ride isn’t any advantage. In any event, if we
>> do have exact tests w/o a lossy preamble, we could add that for v12,
>> as this renovation won’t be a small one if we go this direction.
>
> How expensive are the bbox checks in comparison to the exact tests?
> IOW, how much of a problem is it to potentially do a bbox check twice?

Very very cheap.  The geometry object usually has a bbox already
instantiated and stored along with the actual coordinates.  The
exceptions are objects (points, two-vertex lines) that are basically
their own boxes anyways.

P

> Greetings,
>
> Andres Freund
Further question about this ...  I'm thinking about exactly when
indxpath.c should interrogate extensions' planner support functions.
Since it'll cost some cycles to look up and call such functions, we
don't want to do it if there's little or no chance of getting an index
match.

Currently, what match_clause_to_indexcol() does first for an operator
clause is to see if match_index_to_operand() succeeds for either
operator input; if so it investigates more closely, if not it ignores
the clause.  I'm thinking of maintaining that rule for operators and
adding the same rule for function clauses appearing in WHERE.
Therefore, an extension would only have the opportunity to add a lossy
index clause when some argument of its function or operator is the same
Var or expression that is indexed.

Is that going to be sufficient, or are there weird cases where an index
condition could be built but none of the top-level function or operator
inputs exactly match the index column?

The cases you've shown us, where you transform a function with argument
x into an indexable operator with the same argument x as the
potentially-indexed side, don't look like they'd have any trouble with
that restriction.  But I'm worried that maybe I'm missing something.

			regards, tom lane
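PS: in sketch form, the gating rule I'm describing amounts to something
like this (illustration only, against a v12-era tree, not the actual
patch code):

/*
 * Sketch of the proposed gating rule: only consult a function's planner
 * support function for an index condition if one of the function's
 * top-level arguments is the indexed column or expression itself.
 */
#include "postgres.h"
#include "nodes/pathnodes.h"
#include "optimizer/paths.h"

static bool
func_clause_could_match_index(FuncExpr *fexpr, int indexcol,
							  IndexOptInfo *index)
{
	ListCell   *lc;

	foreach(lc, fexpr->args)
	{
		if (match_index_to_operand((Node *) lfirst(lc), indexcol, index))
			return true;		/* worth calling the support function */
	}
	return false;
}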
I wrote:
> I've posted some preliminary design ideas at
> https://www.postgresql.org/message-id/15193.1548028093@sss.pgh.pa.us
> and
> https://www.postgresql.org/message-id/15289.1548028233@sss.pgh.pa.us

> While there's a nontrivial amount of work needed to make that happen,
> I think it's doable, and it would lead to a significantly better
> solution than proceeding along the inlining path could do.  My
> current feeling, therefore, is that we should reject this patch
> (or at least stick it in the deep freeze) and go work on that plan.

Now that the first of those threads has reached a feature-complete
state, I feel fairly comfortable in saying that we should drop the
idea of messing with the inlining heuristics (at least for the
particular end goal stated in this thread).  So I'm going to go
close this CF entry as returned-with-feedback.

			regards, tom lane
> On Feb 3, 2019, at 3:47 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> I wrote:
>> I've posted some preliminary design ideas at
>> https://www.postgresql.org/message-id/15193.1548028093@sss.pgh.pa.us
>> and
>> https://www.postgresql.org/message-id/15289.1548028233@sss.pgh.pa.us

>> While there's a nontrivial amount of work needed to make that happen,
>> I think it's doable, and it would lead to a significantly better
>> solution than proceeding along the inlining path could do.  My
>> current feeling, therefore, is that we should reject this patch
>> (or at least stick it in the deep freeze) and go work on that plan.
>
> Now that the first of those threads has reached a feature-complete
> state, I feel fairly comfortable in saying that we should drop the
> idea of messing with the inlining heuristics (at least for the
> particular end goal stated in this thread).  So I'm going to go
> close this CF entry as returned-with-feedback.
>
> 			regards, tom lane

Hokay… I’ve read through the patch set, applied it and built it, all
works.  Am starting to try a test implementation in PostGIS land.
Robert’s comment about “PostgreSQL magic” is ringing through my head
... Nodes and Ops and Exprs, oh my!  Whatever doesn’t kill me only
makes me stronger, right? :)

P.
Paul Ramsey <pramsey@cleverelephant.ca> writes:
> Hokay… I’ve read through the patch set, applied it and built it, all
> works. Am starting to try a test implementation in PostGIS land.
> Robert’s comment about “PostgreSQL magic” is ringing through my head
> ... Nodes and Ops and Exprs, oh my! Whatever doesn’t kill me only
> makes me stronger, right? :)

Thanks for looking at it!  I think you are going to have some issues
with identifying the operators to use in PostGIS, since unlike the
core code you can't assume fixed OIDs for anything.  The other thread
I'd started has some ideas about fixing that, and I hope to get
something into v12 for it, but it's not done yet.

For testing purposes it might be enough to look up your index opfamilies
by name, though I'd not really recommend that as a production answer.

You're also kind of jumping the gun on the documentation ;-).  I need
to write a lot more in supportnodes.h about how to use the IndexCondition
callback, but the basic idea is to generate a list of OpExprs (at least
that'd be the normal case) that represent a directly-indexable condition.
It's entirely up to you to ensure that they *are* indexable conditions,
ie with an operator that belongs to the index's opfamily, index key on
the left, pseudoconstant value on the right.

The sample code that's there now for LIKE/regex kind of punts on the
last point: since it can only do anything useful with a simple Const
pattern, it doesn't have to think hard about what's acceptable.  If you
want to allow non-simple-Const RHS then you need to reject volatile
functions and Vars of the indexed relation.  I was thinking of exposing
a function specifically to make that test, rather than requiring
extensions to copy the logic, but I didn't do it yet.

Anyway, happy to help if you have questions, just ask.

			regards, tom lane
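PS: to make the shape of that concrete, here is an untested sketch of
what a PostGIS support function might look like against the patch as
posted.  The function name, the assumption that && is registered under
RTOverlapStrategyNumber in your opclass, and the omission of the
pseudoconstant-RHS check are all illustrative, not committed API:

/*
 * Minimal sketch only: given ST_Intersects(a, b) with one argument
 * matching the index, return the lossy bbox clause
 *     indexed_arg && other_arg
 * for the planner to use as an index condition.
 */
#include "postgres.h"
#include "fmgr.h"
#include "access/stratnum.h"
#include "catalog/pg_type.h"
#include "nodes/makefuncs.h"
#include "nodes/nodeFuncs.h"
#include "nodes/pg_list.h"
#include "nodes/supportnodes.h"
#include "utils/lsyscache.h"

PG_FUNCTION_INFO_V1(st_intersects_supportfn);

Datum
st_intersects_supportfn(PG_FUNCTION_ARGS)
{
	Node	   *rawreq = (Node *) PG_GETARG_POINTER(0);
	Node	   *ret = NULL;

	if (IsA(rawreq, SupportRequestIndexCondition))
	{
		SupportRequestIndexCondition *req =
			(SupportRequestIndexCondition *) rawreq;
		FuncExpr   *fexpr = (FuncExpr *) req->node;

		/* req->indexarg says which of the two args matched the index */
		Node	   *indexed = (Node *) list_nth(fexpr->args, req->indexarg);
		Node	   *other = (Node *) list_nth(fexpr->args,
											  req->indexarg == 0 ? 1 : 0);
		Oid			geomtype = exprType(indexed);

		/* look up && in the index's opfamily (strategy assumed) */
		Oid			oproid = get_opfamily_member(req->opfamily,
												 geomtype, geomtype,
												 RTOverlapStrategyNumber);

		/*
		 * A real implementation must also verify that "other" is
		 * pseudoconstant (no volatiles, no Vars of the indexed rel);
		 * that check is omitted here for brevity.
		 */
		if (OidIsValid(oproid))
		{
			Expr	   *clause = make_opclause(oproid, BOOLOID, false,
											   (Expr *) indexed,
											   (Expr *) other,
											   InvalidOid, InvalidOid);

			req->lossy = true;	/* && is only a bbox pre-filter */
			ret = (Node *) list_make1(clause);
		}
	}
	PG_RETURN_POINTER(ret);
}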
> On Feb 5, 2019, at 11:07 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Thanks for looking at it!  I think you are going to have some issues
> with identifying the operators to use in PostGIS, since unlike the
> core code you can't assume fixed OIDs for anything.  The other thread
> I'd started has some ideas about fixing that, and I hope to get
> something into v12 for it, but it's not done yet.
>
> For testing purposes it might be enough to look up your index opfamilies
> by name, though I'd not really recommend that as a production answer.
>
> You're also kind of jumping the gun on the documentation ;-).  I need
> to write a lot more in supportnodes.h about how to use the IndexCondition
> callback, but the basic idea is to generate a list of OpExprs (at least
> that'd be the normal case) that represent a directly-indexable condition.
> It's entirely up to you to ensure that they *are* indexable conditions,
> ie with an operator that belongs to the index's opfamily, index key on
> the left, pseudoconstant value on the right.  The sample code that's
> there now for LIKE/regex kind of punts on the last point: since it
> can only do anything useful with a simple Const pattern, it doesn't
> have to think hard about what's acceptable.  If you want to allow
> non-simple-Const RHS then you need to reject volatile functions and
> Vars of the indexed relation.  I was thinking of exposing a function
> specifically to make that test, rather than requiring extensions to
> copy the logic, but I didn't do it yet.

<fear>

So just a meta-comment: this is all very cool and I can see how it will
help out things like selectivity estimation and tuple-return estimation
for unnest() and generate_series(), and even how I could eventually do
some dynamic costing for some functions, but it’s getting very deep and
far from our proximate problem, which was just that we were having
trouble increasing the costs on our functions so we could get more
frequent parallelization.

We’re going to have a net complexity increase in our code base, because
we’ll be maintaining this new approach alongside the old one for about
5 years, until v12 becomes our oldest supported PostgreSQL.

And this paragraph above (excuse my ignorance) worries me that I might
have a bunch of use patterns I now have to explicitly predict will
happen and support, that I might have gotten “for free” with the old
dumbass inlining approach.

The only thing that will get more efficient for us, I think, will be
ST_DWithin, which can pick the correct index op to use instead of
supplying both of them.  (Maybe?  I think I’ll end up writing a bunch
of logic which I previously got “for free” to (a) find which indexes
are available, (b) if I have an index on *both* columns, check the
selectivity of 'expand(b) && a' vs 'b && expand(a)', and (c) build up
an appropriate new structure that incorporates the index *and* the
expand() function call wrapper on the appropriate side.)

</fear>

Anyways, once I have done an implementation it will probably appear
less daunting.

P

> Anyway, happy to help if you have questions, just ask.
>
> 			regards, tom lane
Paul Ramsey <pramsey@cleverelephant.ca> writes:
> So just a meta-comment, this is all very cool and I can see how it will
> help out things like selectivity estimation and tuple return estimation
> for unnest() and generate_series() and even how I could eventually do
> some dynamic costing for some functions, but it’s getting very deep and
> far from our proximate problem which was just we were having trouble
> increasing the costs on our functions so we could get more frequent
> parallelization.

Sure, but you need to think bigger about what problem you're solving ;-).

There was no chance that we were going to back-patch that inlining rule
change hack, so none of this would provide you an opportunity to do
better in existing releases anyway.  With a bit more work --- and,
honestly, I think it should not end up being more than ~100 lines of
C code for you, else I've done something wrong --- we can move the
goalposts quite a long way in v12.

> The only thing that will get more efficient for us, I think, will be
> ST_DWithin, which can pick the correct index op to use instead of
> supplying both of them.  (Maybe? I think I’ll end up writing a bunch
> of logic which I previously got “for free” to (a) find which indexes
> are available? (b) if I have an index on *both* columns, check the
> selectivity of 'expand(b) && a' vs 'b && expand(a)' and (c) build up
> an appropriate new structure that incorporates the index *and* the
> expand() function call wrapper on the appropriate side.

No, I don't think so.  The only aspect of that that the support
function is responsible for is verifying that the currently-looked-at
index is applicable, which boils down to checking that the index
opfamily is what you're expecting.  As I mentioned, the core code is
just doing that by testing for compiled-in OIDs, but extensions are
going to have to work a little harder because their opfamilies haven't
got fixed OIDs.

Roughly speaking, what you've got to do is

* verify the index's opfamily OID (OK, some uncertainty remains about
the best way to do that)

* find out the operator OID for && --- if nothing else,
get_opfamily_member() will serve for that, once you've got the
opfamily OID

* build the index operator expression with make_opclause().

Figuring out the rest is the domain of the core code.  If there are
decisions to be made, it'd boil down to whether you've attached good
selectivity estimators to the index operators ... but if you don't
have that then I doubt you were getting good results before.

Anyway, as I said, I'm plenty happy to help.  I'd like to see how this
code turns out in PostGIS; if it's really all that complicated then
I need to spend some effort on adding infrastructure to make it
simpler.

			regards, tom lane
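PS: for the verify-the-opfamily step, a test-grade sketch of the
lookup-by-name approach (the opfamily name "gist_geometry_ops_2d" is
an assumption for illustration, and as said above this is fine for
testing but not a production answer):

/*
 * Test-grade sketch only: look up the extension's opfamily by name,
 * since an extension has no fixed OIDs.  A production version would
 * schema-qualify the name (or cache the OID) rather than relying on
 * an unqualified lookup.
 */
#include "postgres.h"
#include "catalog/pg_am_d.h"
#include "commands/defrem.h"
#include "nodes/pg_list.h"
#include "nodes/value.h"

static bool
opfamily_is_our_gist_2d(Oid opfamily)
{
	List	   *opfam_name = list_make1(makeString("gist_geometry_ops_2d"));
	Oid			expected = get_opfamily_oid(GIST_AM_OID, opfam_name, true);

	return OidIsValid(expected) && opfamily == expected;
}

The result of that check is what gates the rest of the steps above
(get_opfamily_member() for &&, then make_opclause()), as in the earlier
sketch.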