Thread: Optimze usage of immutable functions as relation
Hi hackers, There is a strange behavior of the query planner in some cases if stable/immutable was used a relation. In some cases, it affects costs of operations and leads to a bad plan of the execution. Oleg Bartunov noticed such behavior in queries with a to_tsvector as a relation: =# explain select '|'||subject||'|', ts_rank_cd(body_tsvector,q) from messages, to_tsquery('tuple&header&overhead') q where body_tsvector @@ q; QUERY PLAN ------------------------------------------------------------------------------------------ Nested Loop (cost=383.37..58547.70 rows=4937 width=36) -> Function Scan on to_tsquery q (cost=0.25..0.26 rows=1 width=32) -> Bitmap Heap Scan on messages (cost=383.12..58461.04 rows=4937 width=275) Recheck Cond: (body_tsvector @@ q.q) -> Bitmap Index Scan on message_body_idx (cost=0.00..381.89 rows=4937 width=0) Index Cond: (body_tsvector @@ q.q) (6 rows) =# explain select '|'||subject||'|', ts_rank_cd(body_tsvector,q) from messages, to_tsquery('tuple&header&overhead') q where body_tsvector @@ q limit 10; QUERY PLAN -------------------------------------------------------------------------------- Limit (cost=0.25..425.62 rows=10 width=36) -> Nested Loop (cost=0.25..210005.80 rows=4937 width=36) Join Filter: (messages.body_tsvector @@ q.q) -> Function Scan on to_tsquery q (cost=0.25..0.26 rows=1 width=32) -> Seq Scan on messages (cost=0.00..197625.45 rows=987445 width=275) The idea of the fix for this situation is to check is a result of the function constant or not during the planning of the query. Attached patch does this by processing Var entries at planner stage and replace them with constant value if it is possible. Plans after applying a patch (SeqScan query for comparison): =# explain select '|'||subject||'|', ts_rank_cd(body_tsvector,q) from messages, to_tsquery('tuple&header&overhead') q where body_tsvector @@ q limit 10; QUERY PLAN ---------------------------------------------------------------------------------------------- Limit (cost=224.66..268.11 rows=3 width=36) -> Nested Loop (cost=224.66..268.11 rows=3 width=36) -> Function Scan on to_tsquery q (cost=0.25..0.26 rows=1 width=0) -> Bitmap Heap Scan on messages (cost=224.41..267.04 rows=3 width=275) Recheck Cond: (body_tsvector @@ to_tsquery('tuple&header&overhead'::text)) -> Bitmap Index Scan on message_body_idx (cost=0.00..224.41 rows=3 width=0) Index Cond: (body_tsvector @@ to_tsquery('tuple&header&overhead'::text)) (7 rows) =# set enable_bitmapscan=off; SET =# explain select '|'||subject||'|', ts_rank_cd(body_tsvector,q) from messages, to_tsquery('tuple&header&overhead') q where body_tsvector @@ q limit 10; QUERY PLAN ------------------------------------------------------------------------------------------ Limit (cost=1000.25..296754.14 rows=3 width=36) -> Gather (cost=1000.25..296754.14 rows=3 width=36) Workers Planned: 2 -> Nested Loop (cost=0.25..295753.32 rows=1 width=36) -> Parallel Seq Scan on messages (cost=0.00..295752.80 rows=1 width=275) Filter: (body_tsvector @@ to_tsquery('tuple&header&overhead'::text)) -> Function Scan on to_tsquery q (cost=0.25..0.26 rows=1 width=0) (7 rows) -- Aleksandr Parfenov Postgres Professional: http://www.postgrespro.com Russian Postgres Company
Attachment
>>>>> "Aleksandr" == Aleksandr Parfenov <a.parfenov@postgrespro.ru> writes: Aleksandr> The idea of the fix for this situation is to check is a Aleksandr> result of the function constant or not during the planning Aleksandr> of the query. Attached patch does this by processing Var Aleksandr> entries at planner stage and replace them with constant Aleksandr> value if it is possible. Plans after applying a patch Aleksandr> (SeqScan query for comparison): From an implementation point of view your patch is obviously broken in many ways (starting with not checking varattno anywhere, and not actually checking anywhere if the expression is volatile). But more importantly the plans that you showed seem _worse_ in that you've created extra calls to to_tsquery even though the query has been written to try and avoid that. I think you need to take a step back and explain more precisely what you think you're trying to do and what the benefit (and cost) is. Specific coding style points (not exhaustive): Aleksandr> pointedNode->functions->length == 1 list_length() Aleksandr> pointedNode->functions->head->data.ptr_value linitial() or linitial_node() Aleksandr> if (result->type == T_FuncExpr) IsA() (what if the result is the inline expansion of a volatile function?) Aleksandr> pfree(result); result is a whole tree of nodes, pfree is pointless here (don't bother trying to fix the above points in this patch, that won't address the fundamental flaws) -- Andrew (irc:RhodiumToad)
Hello Andrew, Thank you for the review of the patch. On Fri, 04 May 2018 08:37:31 +0100 Andrew Gierth <andrew@tao11.riddles.org.uk> wrote: > From an implementation point of view your patch is obviously broken in > many ways (starting with not checking varattno anywhere, and not > actually checking anywhere if the expression is volatile). The actual checking if the expression volatile or not is done inside evaluate_function(). This is called through few more function in eval_const_experssion(). If it's volatile, the eval_const_expression() will return FuncExpr node, Const otherwise. It also checks are arguments immutable or not. I agree about varattno, it should be checked. Even in case of SRF not replaced, it is better to be sure that Var points to first (and the only) attribute. > But more importantly the plans that you showed seem _worse_ in that > you've created extra calls to to_tsquery even though the query has > been written to try and avoid that. > > I think you need to take a step back and explain more precisely what > you think you're trying to do and what the benefit (and cost) is. Sure, the first version is some kind of prototype and attempt to improve execution of the certain type of queries. The best solution I see is to use the result of the function where it is required and remove the nested loop in case of immutable functions. In this case, the planner can choose a better plan if function result is used for tuples selecting or as a join filter. But it will increase planning time due to the execution of immutable functions. It looks like I did something wrong with plans in the example, because attached patch replaces function-relation usage with result value, not function call. But nested loop is still in the plan. I'm open to any suggestions/notices/critics about the idea and approach. -- Aleksandr Parfenov Postgres Professional: http://www.postgrespro.com Russian Postgres Company
>>>>> "Aleksandr" == Aleksandr Parfenov <a.parfenov@postgrespro.ru> writes: >> From an implementation point of view your patch is obviously broken >> in many ways (starting with not checking varattno anywhere, and not >> actually checking anywhere if the expression is volatile). Aleksandr> The actual checking if the expression volatile or not is Aleksandr> done inside evaluate_function(). This is called through few Aleksandr> more function in eval_const_experssion(). If it's volatile, Aleksandr> the eval_const_expression() will return FuncExpr node, Const Aleksandr> otherwise. It also checks are arguments immutable or not. You're missing a ton of other possible cases, of which by far the most notable is function inlining: eval_const_expressions will inline even a volatile function and return a new expression tree (which could be almost anything depending on the function body). Aleksandr> I agree about varattno, it should be checked. Even in case Aleksandr> of SRF not replaced, it is better to be sure that Var points Aleksandr> to first (and the only) attribute. It's not a matter of "better", but of basic correctness. Functions can return composite values with columns. -- Andrew (irc:RhodiumToad)
Hello, I reworked a patch to make more stable in different cases. I decided to use simplify_function instead of eval_const_expression to prevent inlining of the function. The only possible outputs of the simplify_function are Const node and NULL. Also, I block pre-evaluation of functions with types other than TYPTYPE_BASE, cause there is no special logic for compound (and others) values yet. There is still a problem with memory leak in case of simplified arguments. The only way I see is a creation of temporary memory context, but it cost some performance. Maybe we can store simplified arguments in the pointed function itself for later use. But eval_const_expression and friends doesn't change the content of the nodes inside the tree, it generates new nodes and returns it as a result. The last point to mention is a fixed plan for the query in the initial letter of the thread. As I mentioned before, new versions of the patch replace var not with a function call, but with a function execution result. After the patch, the following plan is used instead of Nested Loop with Sequence Scan: explain select '|'||subject||'|', ts_rank_cd(body_tsvector,q) from messages, to_tsquery('english', 'tuple&header&overhead')q where body_tsvector @@ q limit 10; QUERY PLAN ---------------------------------------------------------------------------------------------------- Limit (cost=224.16..266.11 rows=3 width=36) -> Nested Loop (cost=224.16..266.11 rows=3 width=36) -> Function Scan on q (cost=0.00..0.01 rows=1 width=0) -> Bitmap Heap Scan on messages (cost=224.16..266.04 rows=3 width=275) Recheck Cond: (body_tsvector @@ '''tupl'' & ''header'' & ''overhead'''::tsquery) -> Bitmap Index Scan on message_body_idx (cost=0.00..224.16 rows=3 width=0) Index Cond: (body_tsvector @@ '''tupl'' & ''header'' & ''overhead'''::tsquery) -- Aleksandr Parfenov Postgres Professional: http://www.postgrespro.com Russian Postgres Company
Attachment
On 16/05/18 13:47, Aleksandr Parfenov wrote: > Hello, > > I reworked a patch to make more stable in different cases. I decided to > use simplify_function instead of eval_const_expression to prevent > inlining of the function. The only possible outputs of the > simplify_function are Const node and NULL. Hmm. That's still a bit inefficient, we'll try to simplify the function on every reference to it. We already simplify functions in function RTEs, but we currently do it after preprocessing all other expressions in the query. See subquery_planner(), around comment "/* Also need to preprocess expressions within RTEs */". If we moved that so that we simplify expressions in the range table first, then in the code that you're adding to eval_const_expression_mutator(), you could just check if the function expression is already a Const. But stepping back a bit, it's a bit weird that we're handling this differently from VALUES and other subqueries. The planner knows how to do this trick for simple subqueries: postgres=# explain select * from tenk1, (select abs(100)) as a (a) where unique1 < a; QUERY PLAN ----------------------------------------------------------- Seq Scan on tenk1 (cost=0.00..483.00 rows=100 width=248) Filter: (unique1 < 100) (2 rows) Note that it not only evaluated the function into a constant, but also got rid of the join. For a function RTE, however, it can't do that: postgres=# explain select * from tenk1, abs(100) as a (a) where unique1 < a; QUERY PLAN ------------------------------------------------------------------- Nested Loop (cost=0.00..583.01 rows=3333 width=248) Join Filter: (tenk1.unique1 < a.a) -> Function Scan on a (cost=0.00..0.01 rows=1 width=4) -> Seq Scan on tenk1 (cost=0.00..458.00 rows=10000 width=244) (4 rows) Could we handle this in pull_up_subqueries(), similar to the RTE_SUBQUERY and RTE_VALUES cases? - Heikki
Heikki Linnakangas <hlinnaka@iki.fi> writes: > But stepping back a bit, it's a bit weird that we're handling this > differently from VALUES and other subqueries. The planner knows how to > do this trick for simple subqueries: > postgres=# explain select * from tenk1, (select abs(100)) as a (a) where > unique1 < a; > QUERY PLAN > ----------------------------------------------------------- > Seq Scan on tenk1 (cost=0.00..483.00 rows=100 width=248) > Filter: (unique1 < 100) > (2 rows) > Note that it not only evaluated the function into a constant, but also > got rid of the join. For a function RTE, however, it can't do that: > postgres=# explain select * from tenk1, abs(100) as a (a) where unique1 < a; > QUERY PLAN > ------------------------------------------------------------------- > Nested Loop (cost=0.00..583.01 rows=3333 width=248) > Join Filter: (tenk1.unique1 < a.a) > -> Function Scan on a (cost=0.00..0.01 rows=1 width=4) > -> Seq Scan on tenk1 (cost=0.00..458.00 rows=10000 width=244) > (4 rows) > Could we handle this in pull_up_subqueries(), similar to the > RTE_SUBQUERY and RTE_VALUES cases? Perhaps. You could only do it for non-set-returning functions, which isn't the typical use of function RTEs, which is probably why we've not thought hard about it before. I'm not sure what would need to happen for lateral functions. Also to be considered, if it's not foldable to a constant, is whether we're risking evaluating it more times than before. regards, tom lane
On Tue, 10 Jul 2018 17:34:20 -0400 Tom Lane <tgl@sss.pgh.pa.us> wrote: >Heikki Linnakangas <hlinnaka@iki.fi> writes: >> But stepping back a bit, it's a bit weird that we're handling this >> differently from VALUES and other subqueries. The planner knows how >> to do this trick for simple subqueries: > >> postgres=# explain select * from tenk1, (select abs(100)) as a (a) >> where unique1 < a; >> QUERY PLAN >> ----------------------------------------------------------- >> Seq Scan on tenk1 (cost=0.00..483.00 rows=100 width=248) >> Filter: (unique1 < 100) >> (2 rows) > >> Note that it not only evaluated the function into a constant, but >> also got rid of the join. For a function RTE, however, it can't do >> that: > >> postgres=# explain select * from tenk1, abs(100) as a (a) where >> unique1 < a; QUERY PLAN >> ------------------------------------------------------------------- >> Nested Loop (cost=0.00..583.01 rows=3333 width=248) >> Join Filter: (tenk1.unique1 < a.a) >> -> Function Scan on a (cost=0.00..0.01 rows=1 width=4) >> -> Seq Scan on tenk1 (cost=0.00..458.00 rows=10000 width=244) >> (4 rows) > >> Could we handle this in pull_up_subqueries(), similar to the >> RTE_SUBQUERY and RTE_VALUES cases? > >Perhaps. You could only do it for non-set-returning functions, which >isn't the typical use of function RTEs, which is probably why we've not >thought hard about it before. I'm not sure what would need to happen >for lateral functions. Also to be considered, if it's not foldable to >a constant, is whether we're risking evaluating it more times than >before. > > regards, tom lane I reworked the patch and implemented processing of FuncScan in pull_up_subqueries() in a way similar to VALUES processing. In order to prevent folding of non-foldable functions it checks provolatile of the function and are arguments const or not and return type to prevent folding of SRF. -- Aleksandr Parfenov Postgres Professional: http://www.postgrespro.com Russian Postgres Company
Attachment
The following review has been posted through the commitfest application: make installcheck-world: tested, failed Implements feature: not tested Spec compliant: not tested Documentation: not tested Hello, I was trying to review your patch, but I couldn't install it: prepjointree.c: In function ‘pull_up_simple_function’: prepjointree.c:1793:41: error: ‘functions’ undeclared (first use in this function); did you mean ‘PGFunction’? Assert(!contain_vars_of_level((Node *) functions, 0)); Was it a typo? The new status of this patch is: Waiting on Author
Hi,
Thank you for the review.
I fixed a typo and some comments. Please find new version attached.
---
Best regards,
Parfenov Aleksandr
Thank you for the review.
I fixed a typo and some comments. Please find new version attached.
---
Best regards,
Parfenov Aleksandr
On Fri, 19 Oct 2018 at 16:40, Anthony Bykov <a.bykov@postgrespro.ru> wrote:
The following review has been posted through the commitfest application:
make installcheck-world: tested, failed
Implements feature: not tested
Spec compliant: not tested
Documentation: not tested
Hello,
I was trying to review your patch, but I couldn't install it:
prepjointree.c: In function ‘pull_up_simple_function’:
prepjointree.c:1793:41: error: ‘functions’ undeclared (first use in this function); did you mean ‘PGFunction’?
Assert(!contain_vars_of_level((Node *) functions, 0));
Was it a typo?
The new status of this patch is: Waiting on Author
Attachment
Hello. # It might be better that you provided self-contained test case. As the discussion between Heikki and Tom just upthread, it would be doable but that usage isn't typical. The query would be normally written as followings and they are transformed as desired. select '|'||subject||'|', ts_rank_cd(body_tsvector,q) from messages where body_tsvector @@ to_tsquery('tuple&header&overhead'); or (this doesn't look normal, thought..) select '|'||subject||'|', ts_rank_cd(body_tsvector,q) from messages, (select to_tsquery('tuple&header&overhead') as q) q where body_tsvector @@ q; This means that the wanted pull up logic is almost already there. You should look on how it is handled. At Sat, 20 Oct 2018 01:31:04 +0700, Aleksandr Parfenov <asp437@gmail.com> wrote in <CACdpekK1oDy7-_HnXOaREa_8HM2r-fsp8iv5e6p8aWOdGdK8Mg@mail.gmail.com> > Hi, > > Thank you for the review. > I fixed a typo and some comments. Please find new version attached. I had the following error with the following query. =# explain select * from pg_stat_get_activity(NULL) a join log(100000.0) p on a.pid = p.p; ERROR: no relation entry for relid 2 As Andrew pointed, you are missing many things to do. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote: > I had the following error with the following query. > > =# explain select * from pg_stat_get_activity(NULL) a join log(100000.0) p on a.pid = p.p; > ERROR: no relation entry for relid 2 > I think that the problem is that RTE_VALUES is wrapped in a subquery by parser while RTE_FUNCTION is not. Thus some additional processing is done by pull_up_simple_subquery() for VALUES. What seems to make difference here is the call of flatten_join_alias_vars() on the query targetlist, although pull_up_simple_subquery() does it for other reasons. Actually the comment about flatten_join_alias_vars() in pull_up_simple_subquery() makes me think if it's o.k. that pull_up_simple_function() sets rvcontext.need_phvs=false regardless strictness of the function: I think PlaceHolderVar might be needed if the function is not strict. (In contrast, pull_up_simple_values() does not have to care because pull_up_simple_subquery() handles such cases when it's processing the owning subquery). -- Antonin Houska Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26, A-2700 Wiener Neustadt Web: https://www.cybertec-postgresql.com
Aleksandr Parfenov <asp437@gmail.com> wrote: > I fixed a typo and some comments. Please find new version attached. I've checked this verstion too. * is_simple_stable_function() instead of fetching HeapTuple from the syscache manually, you might want to consider using functions from lsyscache.c (get_func_rettype, get_func_retset, etc.), or adding a function that returns (subset of) the fields you need in a single call. * pull_up_simple_function(): As you assume that ret->functions is a single-item list Assert(list_length(rte->functions) == 1); the following iteration is not necessary: foreach(lc, functions_list) Also, there seems to be a lot of copy & paste from pull_up_simple_values(), so some refactoring would make sense. -- Antonin Houska Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26, A-2700 Wiener Neustadt Web: https://www.cybertec-postgresql.com
Aleksandr Parfenov <asp437@gmail.com> writes: > [ funcscan_plan_optimizer_v4.patch ] A small note here --- this would be noticeably cleaner if removal of the no-longer-needed function RTE were done using the dummy-relation infrastructure I proposed in https://commitfest.postgresql.org/20/1827/ Maybe we should get that done first and then come back to this. I'm a bit suspicious of the premise of this though, because it's going to result in a single call of a function being converted into possibly many calls, if the RTE is referenced in a lot of places. Even if the function is immutable so that those calls get evaluated at plan time not run time, it's still N calls not one, which'd be bad if the function is expensive. See e.g. https://www.postgresql.org/message-id/flat/CAOYf6edujEOGB-9fGG_V9PGQ5O9yoyWmTnE9RyBYTGw%2BDq3SpA%40mail.gmail.com for an example where we're specifically telling people that they can put a function into FROM to avoid multiple evaluation. This patch breaks that guarantee. A possible fix for this is to do eval_const_expressions() on function RTE expressions at this stage (and then not need to do it later), and then pull up only when we find that the RTE expression has been reduced to a single Const. I'm not sure whether invoking eval_const_expressions() so early would create any issues. But if it can be made to work, this would eliminate quite a lot of the ad-hoc conditions that you have in the patch now. regards, tom lane
On 2018-11-08 15:08:03 +0100, Antonin Houska wrote: > Aleksandr Parfenov <asp437@gmail.com> wrote: > > > I fixed a typo and some comments. Please find new version attached. > > I've checked this verstion too. > > * is_simple_stable_function() > > instead of fetching HeapTuple from the syscache manually, you might want to > consider using functions from lsyscache.c (get_func_rettype, get_func_retset, > etc.), or adding a function that returns (subset of) the fields you need in a > single call. > > * pull_up_simple_function(): > > As you assume that ret->functions is a single-item list > > Assert(list_length(rte->functions) == 1); > > the following iteration is not necessary: > > foreach(lc, functions_list) > > Also, there seems to be a lot of copy & paste from pull_up_simple_values(), so > some refactoring would make sense. Given this I think the appropriate state of the CF entry would have been waiting-for-author, not needs review. Or alternatively returned-with-feedback or rejected. I'm a bit confused as to why the patch was moved to the next CF twice? - Andres
Andres Freund <andres@anarazel.de> writes: > Given this I think the appropriate state of the CF entry would have been > waiting-for-author, not needs review. Or alternatively > returned-with-feedback or rejected. I'm a bit confused as to why the > patch was moved to the next CF twice? We have this review from Antonin, and mine in https://www.postgresql.org/message-id/2906.1542395026%40sss.pgh.pa.us and there's the cfbot report that the patch doesn't even apply anymore. The dummy-relation stuff I referred to has now been merged, so there's really no good reason not to revise the patch along that line. I'm going to go set this CF entry to waiting-for-author, but unless a rewritten patch appears soon, I think we should close it returned-with-feedback. I think the idea is potentially good, but as I said in my review, I don't like this implementation; it has the potential to be a net loss in some cases. regards, tom lane
On 2/18/19 03:20, Tom Lane wrote: > The dummy-relation stuff I referred to has now been merged, so there's > really no good reason not to revise the patch along that line. I'll try to post the revised implementation soon. -- Alexander Kuzmenkov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
On 2/28/19 4:27 PM, Alexander Kuzmenkov wrote: > On 2/18/19 03:20, Tom Lane wrote: >> The dummy-relation stuff I referred to has now been merged, so there's >> really no good reason not to revise the patch along that line. > > > I'll try to post the revised implementation soon. I'll close this on March 8th if there is no new patch. Regards, -- -David david@pgmasters.net
On 3/5/19 20:22, David Steele wrote: > > I'll close this on March 8th if there is no new patch. This is taking longer than expected. I'll move it to the next commitfest and continue working on the patch. -- Alexander Kuzmenkov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
On 11/16/18 22:03, Tom Lane wrote: > A possible fix for this is to do eval_const_expressions() on > function RTE expressions at this stage (and then not need to > do it later), and then pull up only when we find that the > RTE expression has been reduced to a single Const. Attached is a patch that does this, and transforms RTE_FUCTION that was reduced to a single Const into an RTE_RESULT. Not sure it does everything correctly, but some basic cases work. In particular, I don't understand whether it needs any handling of "append relations". -- Alexander Kuzmenkov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Attachment
On Thu, Mar 21, 2019 at 5:58 AM Alexander Kuzmenkov <a.kuzmenkov@postgrespro.ru> wrote: > On 11/16/18 22:03, Tom Lane wrote: > > A possible fix for this is to do eval_const_expressions() on > > function RTE expressions at this stage (and then not need to > > do it later), and then pull up only when we find that the > > RTE expression has been reduced to a single Const. > > > Attached is a patch that does this, and transforms RTE_FUCTION that was > reduced to a single Const into an RTE_RESULT. Hi Alexander, The July Commitfest is here. Could we please have a rebase of this patch? Thanks, -- Thomas Munro https://enterprisedb.com
08.07.2019 4:18, Thomas Munro: > The July Commitfest is here. Could we please have a rebase of this patch? Updated patch is in attachments. I've only resolved one small cosmetic merge conflict. Later this week I'm going to send a more thoughtful review. -- Anastasia Lubennikova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Attachment
23.07.2019 14:36, Anastasia Lubennikova : > 08.07.2019 4:18, Thomas Munro: >> The July Commitfest is here. Could we please have a rebase of this >> patch? > Updated patch is in attachments. > I've only resolved one small cosmetic merge conflict. > > Later this week I'm going to send a more thoughtful review. > Well, I looked through the patch and didn't find any issues, so I'll mark this ready for committer. Last implementation differs from the original one and is based on suggestion from this message: https://www.postgresql.org/message-id/2906.1542395026%40sss.pgh.pa.us It does eval_const_expressions() earlier and pulls up only Const functions. I also added a few more tests and comments. New version is in attachments. -- Anastasia Lubennikova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Attachment
Anastasia Lubennikova <a.lubennikova@postgrespro.ru> writes: > New version is in attachments. I took a quick look at this and I have a couple of gripes --- * The naming and documentation of transform_const_function_to_result seem pretty off-point to me. ISTM the real goal of that function is to pull up constant values from RTE_FUNCTION RTEs. Replacing the RTE with an RTE_RESULT is just a side-effect that's needed so that we don't generate a useless FunctionScan plan node. I'd probably call it pull_up_constant_function or something like that. * It would be useful for the commentary to point out that in principle we could pull up any immutable (or, probably, even just stable) expression; but we don't, (a) for fear of multiple evaluations of the result costing us more than we can save, and (b) because a primary goal is to let the constant participate in further const-folding, and of course that won't happen for a non-Const. * The test cases are really pretty bogus, because int4(1) or int4(0) are not function invocations at all. The parser thinks those are no-op type casts, and so what comes out of parse analysis is already just a plain Const. Thus, not one of these test cases is actually verifying that const-folding of an immutable function has happened before we try to pull up. While you could dodge the problem today by, say, writing int8(0) which isn't a no-op cast, I'd recommend staying away from typename() notation altogether. There's too much baggage there and too little certainty that you wrote a function call not something else. The existing test cases you changed, with int4(sin(1)) and so on, are better examples of something that has to actively be folded to a constant. regards, tom lane
I wrote: > * It would be useful for the commentary to point out that in principle we > could pull up any immutable (or, probably, even just stable) expression; > but we don't, (a) for fear of multiple evaluations of the result costing > us more than we can save, and (b) because a primary goal is to let the > constant participate in further const-folding, and of course that won't > happen for a non-Const. BTW, so far as I can see, none of the test cases demonstrate that such further const-folding can happen. Since this is all pretty processing- order-sensitive, I think an explicit test that that's possible would be a good idea. regards, tom lane
26.07.2019 21:26, Tom Lane wrote: > I took a quick look at this and I have a couple of gripes --- > > * The naming and documentation of transform_const_function_to_result > seem pretty off-point to me. ISTM the real goal of that function is to > pull up constant values from RTE_FUNCTION RTEs. Replacing the RTE > with an RTE_RESULT is just a side-effect that's needed so that we > don't generate a useless FunctionScan plan node. I'd probably call > it pull_up_constant_function or something like that. > > * It would be useful for the commentary to point out that in principle we > could pull up any immutable (or, probably, even just stable) expression; > but we don't, (a) for fear of multiple evaluations of the result costing > us more than we can save, and (b) because a primary goal is to let the > constant participate in further const-folding, and of course that won't > happen for a non-Const. Fixed > * The test cases are really pretty bogus, because int4(1) or int4(0) are > not function invocations at all. The parser thinks those are no-op type > casts, and so what comes out of parse analysis is already just a plain > Const. Thus, not one of these test cases is actually verifying that > const-folding of an immutable function has happened before we try to > pull up. While you could dodge the problem today by, say, writing > int8(0) which isn't a no-op cast, I'd recommend staying away from > typename() notation altogether. There's too much baggage there and too > little certainty that you wrote a function call not something else. > The existing test cases you changed, with int4(sin(1)) and so on, > are better examples of something that has to actively be folded to > a constant. Thank you for pointing out on specific of int4() function, I updated tests to use dummy plpgsql function. I'm not sure if tests of various join types are redundant but I left them. As far as I understand, the principal motivation of this patch was to optimize function scan joins that occur in FTS queries. For example: select * from test_tsquery, to_tsquery('english', 'new') q where txtsample @@ q; So I also added another test to tsearch.sql to illustrate difference between optimized and not optimized plans. -- Anastasia Lubennikova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Attachment
Anastasia Lubennikova <a.lubennikova@postgrespro.ru> writes: > Thank you for pointing out on specific of int4() function, > I updated tests to use dummy plpgsql function. > I'm not sure if tests of various join types are redundant but I left them. > As far as I understand, the principal motivation of this patch was to > optimize > function scan joins that occur in FTS queries. For example: > select * from test_tsquery, to_tsquery('english', 'new') q where > txtsample @@ q; > So I also added another test to tsearch.sql to illustrate difference > between optimized and not optimized plans. Fair enough. I've pushed this after a good deal of further hackery on the code. Notably * I had no faith that we still guaranteed to perform eval_const_expressions on every function-RTE expression. Even if that were true today, it'd be awfully easy to break in future, if it only happened somewhere down in the recursion in pull_up_subqueries_recurse. So I moved that responsibility to an earlier spot that clearly traverses all function RTEs. A happy side benefit is that inline_set_returning_function gets simpler because it can assume that that already happened. * It looked to me like the pullup_constant_function code wasn't covering all the places where it might need to replace Vars referencing the function RTE. I refactored things to avoid having a bunch of code duplication while ensuring it hits everyplace that pull_up_simple_subquery does. regards, tom lane
Hi, This commit is breaking some Postgis tests with custom types. Here is a minimal repro (Postgis not required) ``` -- test custom types create type t_custom_type AS ( valid bool, reason varchar, location varchar ); create or replace function f_immutable_custom_type(i integer) returns t_custom_type as $$ declare oCustom t_custom_type; begin select into oCustom true as valid, 'OK' as reason, NULL as location; return oCustom; end; $$ language plpgsql immutable; select valid, reason, location from f_immutable_custom_type(3); drop function f_immutable_custom_type; drop type t_custom_type; ``` Expected (PG12): ``` valid | reason | location -------+--------+---------- t | OK | (1 row) ``` Instead with master/HEAD (eb57bd9c1d) we are getting: ``` ERROR: could not find attribute 2 in subquery targetlist ``` Reverting 8613eda50488c27d848f8e8caa493c9d8e1b5271 fixes it, but I haven't looked into the reason behind the bug. Regards -- Raúl Marín Rodríguez carto.com
rmrodriguez@carto.com writes: > This commit is breaking some Postgis tests with custom types. Hm, yeah, the code fails to consider the possibility that the function returns a composite type. For the moment I'm just going to make it punt if the function result class isn't TYPEFUNC_SCALAR. In principle, if we have a composite result, we could disassemble it into per-column constant values, but I'm not sure it'd be worth the work. regards, tom lane