Thread: execExprInterp() questions / How to improve scalar array op expr eval?
Short version: In what I'm currently working on I had a few questions about arrays and the execExpr/execExprInterp framework that didn't seem obviously answered in the code or README. - Does the execExpr/execExprInterp framework allow a scalar array op to get an already expanded array (unless I'm missing something we can't easily lookup a given index in a flattened array)? - If not, is there a way in that framework to know if the array expr has stayed the same through multiple evaluations of the expression tree (i.e., so you could expand and sort it just once)? Long version/background: I've been looking at how a query like: select * from t where i in (<long array>); executes as a seq scan the execution time increases linearly with respect to the length of the array. That's because in execExprInterp.c's ExecEvalScalarArrayOp() we do a linear search through the array. In contrast, when setting up a btree scan with a similar saop, we first sort the array, remove duplicates, and remove nulls. Of course with btree scans this has other values, like allowing us to return results in array order. I've been considering approaches to improve the seq scan case. We might: - At plan time rewrite as a hash join to a deduped array values relation (gross, but included here since that's an approach we can take rewriting the SQL itself as a proof of concept). - At execution time build a hash and lookup. - At execution time sort the array and binary search through it. I've been working other last approach to see what results I might get (it seemed like the easiest to hack together). Putting that together left me with the questions mentioned above in the "short version". Obviously if anyone has thoughts on the above approaches I'd be interested in that too. Side question: when we do: arr = DatumGetArrayTypeP(*op->resvalue); in ExecEvalScalarArrayOp() is that going to be expensive each time through a seq scan? Or is it (likely) going to resolve to an already in-memory array and effectively be the cost of retrieving that pointer? James
Hi, Tom, CCing you because of expanded datum question at bottom. On 2020-04-11 08:58:46 -0400, James Coleman wrote: > - Does the execExpr/execExprInterp framework allow a scalar array op > to get an already expanded array (unless I'm missing something we > can't easily lookup a given index in a flattened array)? Well, I'm not quite sure what you're thinking of. If the input is constant, then expression initialization can do just about everything it wants. Including preprocessing the array into whatever form it wants. But there's no logic for doing preprocessing whenever values change. > - If not, is there a way in that framework to know if the array expr > has stayed the same through multiple evaluations of the expression > tree (i.e., so you could expand and sort it just once)? No. > Long version/background: > > I've been looking at how a query like: > select * from t where i in (<long array>); > executes as a seq scan the execution time increases linearly with > respect to the length of the array. That's because in > execExprInterp.c's ExecEvalScalarArrayOp() we do a linear search > through the array. Is "<long array>" constant in the cases you're interested in? Because it's a heck of a lot easier to add an optimization for that case, than adding runtime tracking the array values by comparing the whole array for equality with the last - the comparisons of the whole array could easily end up adding more cost than what's being saved. > I've been considering approaches to improve the seq scan case. We might: > - At plan time rewrite as a hash join to a deduped array values > relation (gross, but included here since that's an approach we can > take rewriting the SQL itself as a proof of concept). > - At execution time build a hash and lookup. > - At execution time sort the array and binary search through it. > > I've been working other last approach to see what results I might get > (it seemed like the easiest to hack together). Putting that together > left me with the questions mentioned above in the "short version". > > Obviously if anyone has thoughts on the above approaches I'd be > interested in that too. If you're content with optimizing constant arrays, I'd go for detecting that case in the T_ScalarArrayOpExpr case in ExecInitExprRec(), and preprocessing the array into an optimized form. Probably with a separate opcode for execution. > Side question: when we do: > arr = DatumGetArrayTypeP(*op->resvalue); > in ExecEvalScalarArrayOp() is that going to be expensive each time > through a seq scan? Or is it (likely) going to resolve to an already > in-memory array and effectively be the cost of retrieving that > pointer? It Depends TM. For the constant case it's likely going to be cheap-ish, because it'll not be toasted. For the case where it's the return value from a subquery or something, you cannot assume it won't change between calls. I think the worst case here is something like a nestloop, where the inner side does foo IN (outer.column). If I recall the code correctly, we'll currently end up detoasting the array value every single iteration. I wonder if it would be a good idea to change ExecEvalParamExec and ExecEvalParamExtern to detoast the to-be-returned value. If we change the value that's stored in econtext->ecxt_param_exec_vals / econtext->ecxt_param_list_info, we'd avoid repeated detaosting. It'd be nice if we somehow could make the expanded datum machinery work here. I'm not quite seeing how though? Crazy idea: I have a patch to make execExprInterp largely use NullableDatum. Tom and I had theorized a while ago about adding additional fields in the padding that currently exists in it. I wonder if we could utilize a bit in there to allow to expand in-place? Greetings, Andres Freund
Andres Freund <andres@anarazel.de> writes: > On 2020-04-11 08:58:46 -0400, James Coleman wrote: >> - Does the execExpr/execExprInterp framework allow a scalar array op >> to get an already expanded array (unless I'm missing something we >> can't easily lookup a given index in a flattened array)? > Well, I'm not quite sure what you're thinking of. If the input is > constant, then expression initialization can do just about everything it > wants. Including preprocessing the array into whatever form it wants. > But there's no logic for doing preprocessing whenever values change. For the most part it seems like this is asking the question at the wrong level. It's not execExpr's job to determine the form of either values coming in from "outside" (Vars from table rows, or Params from elsewhere) or the results of intermediate functions/operators. In the specific case of an array-valued (or record-valued) Const node, you could imagine having a provision to convert the array/record to an expanded datum at execution start, or maybe better on first use. I'm not sure how to tell whether that's actually a win though. It could easily be a net loss if the immediate consumer of the value wants a flat datum. It seems like this might be somewhat related to the currently-moribund patch to allow caching of the values of stable subexpressions from one execution to the next. If we had that infrastructure you could imagine extending it to allow caching the expanded not flat form of some datums. Again I'm not entirely sure what would drive the choice. > I wonder if it would be a good idea to change ExecEvalParamExec and > ExecEvalParamExtern to detoast the to-be-returned value. If we change > the value that's stored in econtext->ecxt_param_exec_vals / > econtext->ecxt_param_list_info, we'd avoid repeated detaosting. I'd think about attaching that to the nestloop param mechanism not ExecEvalParam in general. regards, tom lane
On Sat, Apr 11, 2020 at 2:01 PM Andres Freund <andres@anarazel.de> wrote: > > Hi, > > > Tom, CCing you because of expanded datum question at bottom. > > > On 2020-04-11 08:58:46 -0400, James Coleman wrote: > > - Does the execExpr/execExprInterp framework allow a scalar array op > > to get an already expanded array (unless I'm missing something we > > can't easily lookup a given index in a flattened array)? > > Well, I'm not quite sure what you're thinking of. If the input is > constant, then expression initialization can do just about everything it > wants. Including preprocessing the array into whatever form it wants. > But there's no logic for doing preprocessing whenever values change. > > > > - If not, is there a way in that framework to know if the array expr > > has stayed the same through multiple evaluations of the expression > > tree (i.e., so you could expand and sort it just once)? > > No. Ok. Seems like it'd be likely to be interesting (maybe in other places too?) to know if: - Something is actually a param that can change, and, - When (perhaps by some kind of flag or counter) it has changed. > > Long version/background: > > > > I've been looking at how a query like: > > select * from t where i in (<long array>); > > executes as a seq scan the execution time increases linearly with > > respect to the length of the array. That's because in > > execExprInterp.c's ExecEvalScalarArrayOp() we do a linear search > > through the array. > > Is "<long array>" constant in the cases you're interested in? Because > it's a heck of a lot easier to add an optimization for that case, than > adding runtime tracking the array values by comparing the whole array > for equality with the last - the comparisons of the whole array could > easily end up adding more cost than what's being saved. In the simplest case, yes, it's a constant, though it'd be obviously better if it weren't limited to that. There are many cases where a long array can come from a subplan but we can easily tell by looking at the SQL that it will only ever execute once. An unimaginative case is something like: select * from t where a in (select i from generate_series(0,10000) n(i)); > > I've been considering approaches to improve the seq scan case. We might: > > - At plan time rewrite as a hash join to a deduped array values > > relation (gross, but included here since that's an approach we can > > take rewriting the SQL itself as a proof of concept). > > - At execution time build a hash and lookup. > > - At execution time sort the array and binary search through it. > > > > I've been working other last approach to see what results I might get > > (it seemed like the easiest to hack together). Putting that together > > left me with the questions mentioned above in the "short version". > > > > Obviously if anyone has thoughts on the above approaches I'd be > > interested in that too. > > If you're content with optimizing constant arrays, I'd go for detecting > that case in the T_ScalarArrayOpExpr case in ExecInitExprRec(), and > preprocessing the array into an optimized form. Probably with a separate > opcode for execution. At minimum constants are a good first place to try it out. Thanks for the pointers. > > Side question: when we do: > > arr = DatumGetArrayTypeP(*op->resvalue); > > in ExecEvalScalarArrayOp() is that going to be expensive each time > > through a seq scan? Or is it (likely) going to resolve to an already > > in-memory array and effectively be the cost of retrieving that > > pointer? > > It Depends TM. For the constant case it's likely going to be cheap-ish, > because it'll not be toasted. For the case where it's the return value > from a subquery or something, you cannot assume it won't change between > calls. Back to what I was saying earlier. Perhaps some kind of mechanism so we can know that is a better place to start. Perhaps something from the patch Tom referenced would help kickstart that. I'll take a look. > I think the worst case here is something like a nestloop, where the > inner side does foo IN (outer.column). If I recall the code correctly, > we'll currently end up detoasting the array value every single > iteration. Ouch. Seems like that could be a significant cost in some queries? > I wonder if it would be a good idea to change ExecEvalParamExec and > ExecEvalParamExtern to detoast the to-be-returned value. If we change > the value that's stored in econtext->ecxt_param_exec_vals / > econtext->ecxt_param_list_info, we'd avoid repeated detaosting. > > It'd be nice if we somehow could make the expanded datum machinery work > here. I'm not quite seeing how though? I'm not yet familiar enough with it to comment. > Crazy idea: I have a patch to make execExprInterp largely use > NullableDatum. Tom and I had theorized a while ago about adding > additional fields in the padding that currently exists in it. I wonder > if we could utilize a bit in there to allow to expand in-place? Effectively to store the pointers to, for example, the expanded array? James
Hi, On 2020-04-11 15:33:09 -0400, Tom Lane wrote: > For the most part it seems like this is asking the question at the wrong > level. It's not execExpr's job to determine the form of either values > coming in from "outside" (Vars from table rows, or Params from elsewhere) > or the results of intermediate functions/operators. We don't really have good place to do optimizations to transform an expression to be better for an "upper" expression node. > In the specific case of an array-valued (or record-valued) Const node, > you could imagine having a provision to convert the array/record to an > expanded datum at execution start, or maybe better on first use. I'm > not sure how to tell whether that's actually a win though. It could > easily be a net loss if the immediate consumer of the value wants a > flat datum. With execution start you do mean ExecInitExpr()? Or later? I see little reason not to do such an optimization during expression initialization. It's not going to add much if any overhead compared to all the rest of the costs to change a Const array into a different form. > It seems like this might be somewhat related to the currently-moribund > patch to allow caching of the values of stable subexpressions from > one execution to the next. If we had that infrastructure you could > imagine extending it to allow caching the expanded not flat form of > some datums. Again I'm not entirely sure what would drive the choice. > > I wonder if it would be a good idea to change ExecEvalParamExec and > > ExecEvalParamExtern to detoast the to-be-returned value. If we change > > the value that's stored in econtext->ecxt_param_exec_vals / > > econtext->ecxt_param_list_info, we'd avoid repeated detaosting. > > I'd think about attaching that to the nestloop param mechanism not > ExecEvalParam in general. That was my first though too - but especially if there's multiple variables detoasting all of them might be a waste if the params are not dereferenced. So doing it the first time a parameter is used seemed like it'd be much more likely to be beneficial. But even there it could be a waste, because e.g. a length comparison alone is enough to determine inequality. Which is why I was wondering about somehow being able to detoast the parameter "in place" in the params arrays. Greetings, Andres Freund
On Sat, Apr 11, 2020 at 3:33 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Andres Freund <andres@anarazel.de> writes: > > On 2020-04-11 08:58:46 -0400, James Coleman wrote: > >> - Does the execExpr/execExprInterp framework allow a scalar array op > >> to get an already expanded array (unless I'm missing something we > >> can't easily lookup a given index in a flattened array)? > > > Well, I'm not quite sure what you're thinking of. If the input is > > constant, then expression initialization can do just about everything it > > wants. Including preprocessing the array into whatever form it wants. > > But there's no logic for doing preprocessing whenever values change. > > For the most part it seems like this is asking the question at the wrong > level. It's not execExpr's job to determine the form of either values > coming in from "outside" (Vars from table rows, or Params from elsewhere) > or the results of intermediate functions/operators. Right, though I didn't know if the expr interpretation by any chance had expanded arrays already available in some cases that we could take advantage of. A bit of a shot in the dark as I try to grok how this all fits together. > In the specific case of an array-valued (or record-valued) Const node, > you could imagine having a provision to convert the array/record to an > expanded datum at execution start, or maybe better on first use. I'm > not sure how to tell whether that's actually a win though. It could > easily be a net loss if the immediate consumer of the value wants a > flat datum. > > It seems like this might be somewhat related to the currently-moribund > patch to allow caching of the values of stable subexpressions from > one execution to the next. If we had that infrastructure you could > imagine extending it to allow caching the expanded not flat form of > some datums. Again I'm not entirely sure what would drive the choice. I'll have to look into that patch to see if it sparks any ideas (or if it's worth working on for its own merits). > > I wonder if it would be a good idea to change ExecEvalParamExec and > > ExecEvalParamExtern to detoast the to-be-returned value. If we change > > the value that's stored in econtext->ecxt_param_exec_vals / > > econtext->ecxt_param_list_info, we'd avoid repeated detaosting. > > I'd think about attaching that to the nestloop param mechanism not > ExecEvalParam in general. Revealing my ignorance here, but is nestloop the only case where we have params like that? James
On Sat, Apr 11, 2020 at 3:57 PM James Coleman <jtc331@gmail.com> wrote: > .. > > It seems like this might be somewhat related to the currently-moribund > > patch to allow caching of the values of stable subexpressions from > > one execution to the next. If we had that infrastructure you could > > imagine extending it to allow caching the expanded not flat form of > > some datums. Again I'm not entirely sure what would drive the choice. > > I'll have to look into that patch to see if it sparks any ideas (or if > it's worth working on for its own merits). Is this the patch [1] you're thinking of? James [1]: https://www.postgresql.org/message-id/flat/da87bb6a014e029176a04f6e50033cfb%40postgrespro.ru
James Coleman <jtc331@gmail.com> writes: >>> It seems like this might be somewhat related to the currently-moribund >>> patch to allow caching of the values of stable subexpressions from >>> one execution to the next. > Is this the patch [1] you're thinking of? > [1]: https://www.postgresql.org/message-id/flat/da87bb6a014e029176a04f6e50033cfb%40postgrespro.ru Yeah. I was just digging for that in the archives, and also came across this older patch: https://www.postgresql.org/message-id/CABRT9RBdRFS8sQNsJHxZOhC0tJe1x2jnomiz%3DFOhFkS07yRwQA%40mail.gmail.com which doesn't seem to have gone anywhere but might still contain useful ideas. regards, tom lane
Hi, On 2020-04-11 15:53:11 -0400, James Coleman wrote: > On Sat, Apr 11, 2020 at 2:01 PM Andres Freund <andres@anarazel.de> wrote: > > > - If not, is there a way in that framework to know if the array expr > > > has stayed the same through multiple evaluations of the expression > > > tree (i.e., so you could expand and sort it just once)? > > > > No. > > Ok. Seems like it'd be likely to be interesting (maybe in other places > too?) to know if: > - Something is actually a param that can change, and, > - When (perhaps by some kind of flag or counter) it has changed. We do have the param logic inside the executor, which does signal which params have changed. It's just independent of expression evaluation. I'm not convinced (or well, even doubtful) this is something we want to have at the expression evaluation level. Greetings, Andres Freund
On Sat, Apr 11, 2020 at 5:32 PM Andres Freund <andres@anarazel.de> wrote: > > Hi, > > On 2020-04-11 15:53:11 -0400, James Coleman wrote: > > On Sat, Apr 11, 2020 at 2:01 PM Andres Freund <andres@anarazel.de> wrote: > > > > - If not, is there a way in that framework to know if the array expr > > > > has stayed the same through multiple evaluations of the expression > > > > tree (i.e., so you could expand and sort it just once)? > > > > > > No. > > > > Ok. Seems like it'd be likely to be interesting (maybe in other places > > too?) to know if: > > - Something is actually a param that can change, and, > > - When (perhaps by some kind of flag or counter) it has changed. > > We do have the param logic inside the executor, which does signal which > params have changed. It's just independent of expression evaluation. > > I'm not convinced (or well, even doubtful) this is something we want to > have at the expression evaluation level. Perhaps I'll discover the reason as I internalize the code, but could you expound a bit? Is that because you believe there's a better way to optimize subexpressions that don't change? Or that it's likely to add a lot of cost to non-optimized cases? James
Hi, On 2020-04-12 08:55:44 -0400, James Coleman wrote: > On Sat, Apr 11, 2020 at 5:32 PM Andres Freund <andres@anarazel.de> wrote: > > On 2020-04-11 15:53:11 -0400, James Coleman wrote: > > > On Sat, Apr 11, 2020 at 2:01 PM Andres Freund <andres@anarazel.de> wrote: > > > > > - If not, is there a way in that framework to know if the array expr > > > > > has stayed the same through multiple evaluations of the expression > > > > > tree (i.e., so you could expand and sort it just once)? > > > Ok. Seems like it'd be likely to be interesting (maybe in other places > > > too?) to know if: > > > - Something is actually a param that can change, and, > > > - When (perhaps by some kind of flag or counter) it has changed. > > > > We do have the param logic inside the executor, which does signal which > > params have changed. It's just independent of expression evaluation. > > > > I'm not convinced (or well, even doubtful) this is something we want to > > have at the expression evaluation level. > > Perhaps I'll discover the reason as I internalize the code, but could > you expound a bit? Is that because you believe there's a better way to > optimize subexpressions that don't change? Or that it's likely to add > a lot of cost to non-optimized cases? I think, if you're putting it into expression evaluation itself, the likelihood of causing slowdowns outside of the cases you're trying to optimize is much higher than likely the gain. Checks whether variables haven't changed aren't free. So, while I think it makes sense to optimize a constant array for a SAO inside expression initialization (possibly with a different opcode), I don't think runtime checking logic to see whether the array is still the same in ExecEvalScalarArrayOp() or any related place is likely to be a good idea. If - I am not really convinced it's worth it - we really want to optimize SAO arrays that can change at runtime, I suspect it'd be better if we extended the param mechanism so there's a 'postprocessing' step for params that changed. Which then would have to apply the expression sub-tree that applies to the Param (i.e. ScalarArrayOpExpr->args) into some place that is accessible (or, even better, is directly accessed) by ExecEvalScalarArrayOp(). I think you'll also have to be careful about whether using binary search against the array will always come out top. I'd expect it to be worse for the pretty common case of below 8 elements in the IN() or such. Greetings, Andres Freund
I've read through all of the previous discussions related to stable subexpression caching, and I'm planning to send a summary email with all of those links in one place. But I also happened to stumble upon mention in the TODO of some email discussion way back in 2007 where Tom suggested [1] we should really try planning scalar array ops (particularly those with large IN lists) as `IN (VALUES ...)`. That actually would solve the specific case I'd had this problem with (seq scan on a large constant array IN expression). Ideally any query with forms like: select * from t where a in (1, 2,...) select * from t where a in ((select i from x)) would always be isomorphic in planning. But thinking about this overnight and scanning through things quickly this morning, I have a feeling that'd be 1.) a pretty significant undertaking, and 2.) likely to explode the number of plans considered. Also I don't know if there's a good place to slot that into planning. Do either of you happen to have any pointers into places that do similar kinds of rewrites I could look at? And in those cases do we normally always rewrite or do we consider both styles independently? I suppose _only_ handling the case where a `IN (VALUES ...)` replaces a seq scan with a scalar array op might be somewhat easier...but feels like it leaves a lot of holes. I'm still at the point where I'm trying to determine if any of the above (subexpression caching, saop optimization only on constants, re-planning as `IN (VALUES ...)`) is something reasonable enough relative to the amount of effort to be worth working on. James [1]: https://www.postgresql.org/message-id/19001.1178823208%40sss.pgh.pa.us
On Mon, Apr 13, 2020 at 10:40 AM James Coleman <jtc331@gmail.com> wrote: > > I've read through all of the previous discussions related to stable > subexpression caching, and I'm planning to send a summary email with > all of those links in one place. > > But I also happened to stumble upon mention in the TODO of some email > discussion way back in 2007 where Tom suggested [1] we should really > try planning scalar array ops (particularly those with large IN lists) > as `IN (VALUES ...)`. > > That actually would solve the specific case I'd had this problem with > (seq scan on a large constant array IN expression). Ideally any query > with forms like: > select * from t where a in (1, 2,...) > select * from t where a in ((select i from x)) > would always be isomorphic in planning. But thinking about this > overnight and scanning through things quickly this morning, I have a > feeling that'd be 1.) a pretty significant undertaking, and 2.) likely > to explode the number of plans considered. > > Also I don't know if there's a good place to slot that into planning. > Do either of you happen to have any pointers into places that do > similar kinds of rewrites I could look at? And in those cases do we > normally always rewrite or do we consider both styles independently? > ... > [1]: https://www.postgresql.org/message-id/19001.1178823208%40sss.pgh.pa.us I've kept reading the code and thinking this over some more. and it seems like implementing this would require one of: 1. Tracking on a path whether or not it handles a given IN qual, generating paths with and without that qual, tracking the cheapest for each of those, and then running the join search with and without the VALUES RTEs in play...in short that seems...not really feasible. 2. Teaching set_plain_rel_pathlist to run with and without the IN quals, and run a join search for just that base rel and the VALUES RTE, and determine which is lower cost. This seems like it'd be easier to get working, would have the limitation of not fully considering all JOIN combinations including VALUES RTEs, and would still build more than twice as many base rel paths as we do now. I suppose it could predicate all of this on some kind of heuristic about saop cost (e.g., size of the array). But the purist in me finds this attractive -- it means people shouldn't have to know to try both IN (<list>) and IN (VALUES ...) -- it seems like it'd be a large effort with relatively little gain. My conclusion currently is that I believe we'd get 90% of the speed improvement (at least) merely by optimizing scalar array ops, and, possibly (to expand beyond constants) work on the broader effort of caching stable subexpressions with preprocessing support. And I have to assume the IN (<list>) case is a far more commonly used SQL clause anyway. I do wonder though if we could automatically convert some IN clauses with subqueries into saops (parameterized, or even better, in some cases, quasi-consts lazily evaluated)... James