Thread: FIX : teach expression walker about RestrictInfo
Hi there, the attached trivial patch adds handling of RestrictInfo nodes into expression_tree_walker(). This is needed for example when calling pull_varnos or (or other functions using the expression walker) in clausesel.c, for example. An example of a query causing errors with pull_varnos is select * from t where (a >= 10 and a <= 20) or (b >= 15 and b <= 20); which gets translated into an expression tree like this: BoolExpr [OR_EXPR] BoolExpr [AND_EXPR] RestrictInfo OpExpr [Var >= Const] RestrictInfo OpExpr [Var <= Const] BoolExpr [AND_EXPR] RestrictInfo OpExpr [Var >= Const] RestrictInfo OpExpr [Var <= Const] and the nested RestrictInfo nodes make the walker fail because of unrecognized node. It's possible that expression walker is not supposed to know about RestrictInfo, but I don't really see why would that be the case. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes: > the attached trivial patch adds handling of RestrictInfo nodes into > expression_tree_walker(). RestrictInfo is not a general expression node and support for it has been deliberately omitted from expression_tree_walker(). So I think what you are proposing is a bad idea and probably a band-aid for some other bad idea. > This is needed for example when calling > pull_varnos or (or other functions using the expression walker) in > clausesel.c, for example. An example of a query causing errors with > pull_varnos is > select * from t where (a >= 10 and a <= 20) or (b >= 15 and b <= 20); Really? regression=# create table t (a int, b int); CREATE TABLE regression=# select * from t where (a >= 10 and a <= 20) or (b >= 15 and b <= 20);a | b ---+--- (0 rows) regards, tom lane
Hi, On 04/28/15 21:50, Tom Lane wrote: > Tomas Vondra <tomas.vondra@2ndquadrant.com> writes: >> the attached trivial patch adds handling of RestrictInfo nodes into >> expression_tree_walker(). > > RestrictInfo is not a general expression node and support for it has > been deliberately omitted from expression_tree_walker(). So I think > what you are proposing is a bad idea and probably a band-aid for some > other bad idea. > >> This is needed for example when calling >> pull_varnos or (or other functions using the expression walker) in >> clausesel.c, for example. An example of a query causing errors with >> pull_varnos is > >> select * from t where (a >= 10 and a <= 20) or (b >= 15 and b <= 20); > > Really? > > regression=# create table t (a int, b int); > CREATE TABLE > regression=# select * from t where (a >= 10 and a <= 20) or (b >= 15 and b <= 20); > a | b > ---+--- > (0 rows) That's not what I said, though. I said that calling pull_varnos() causes the issue - apparently that does not happen in master, but I ran into that when hacking on a patch. For example try adding this Relids tmp = pull_varnos(clause); elog(WARNING, "count = %d", bms_num_members(tmp)); into the or_clause branch in clause_selectivity(), and then running the query will give you this: db=# select * from t where (a >= 10 and a <= 20) or (b >= 15); ERROR: unrecognized node type: 524 But as I said - maybe calls to pull_varnos are not supposed to happen in this part of the code, for some reason, and it really is a bug in my patch. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes: > On 04/28/15 21:50, Tom Lane wrote: >> RestrictInfo is not a general expression node and support for it has >> been deliberately omitted from expression_tree_walker(). So I think >> what you are proposing is a bad idea and probably a band-aid for some >> other bad idea. > That's not what I said, though. I said that calling pull_varnos() causes > the issue - apparently that does not happen in master, but I ran into > that when hacking on a patch. pull_varnos is not, and should not be, applied to a RestrictInfo; for one thing, it'd be redundant with work that was already done when creating the RestrictInfo (cf make_restrictinfo_internal). You've not shown the context of your problem very fully, but in general most code in the planner should be well aware of whether it is working with a RestrictInfo (or list of same) or a bare expression. I don't believe that it's a very good idea to obscure that distinction. regards, tom lane
Hi, On 04/29/15 05:55, Tom Lane wrote: > Tomas Vondra <tomas.vondra@2ndquadrant.com> writes: >> On 04/28/15 21:50, Tom Lane wrote: >>> RestrictInfo is not a general expression node and support for it has >>> been deliberately omitted from expression_tree_walker(). So I think >>> what you are proposing is a bad idea and probably a band-aid for some >>> other bad idea. > >> That's not what I said, though. I said that calling pull_varnos() causes >> the issue - apparently that does not happen in master, but I ran into >> that when hacking on a patch. > > pull_varnos is not, and should not be, applied to a RestrictInfo; for one > thing, it'd be redundant with work that was already done when creating the > RestrictInfo (cf make_restrictinfo_internal). You've not shown the > context of your problem very fully, but in general most code in the planner > should be well aware of whether it is working with a RestrictInfo (or list > of same) or a bare expression. I don't believe that it's a very good idea > to obscure that distinction. OK, let me explain the context a bit more. When working on the multivariate statistics patch, I need to choose which stats to use for estimating the clauses. I do that in clauselist_selectivity(), although there might be other places where to do that. Selecting the stats that best match the clauses is based on how well they cover the vars in the clauses (and some other criteria, that is mostly irrelevant here). So at some point I do call functions like pull_varnos() and pull_varattnos() to get this information. I do handle RestrictInfo nodes explicitly - for those nodes I can do get the info from the node itself. But this does not work for the condition I posted, because that contains nested RestrictInfo nodes. So I do call pull_varnos() on a BoolExpr node, but in reality the node tree representing the clauses WHERE (a >= 10 AND a <= 20) OR (b >= 20 AND b <= 40) looks like this: BoolExpr [OR_EXPR] BoolExpr [AND_EXPR] RestrictInfo OpExpr [Var >= Const] RestrictInfo OpExpr [Var <= Const] BoolExpr [AND_EXPR] RestrictInfo OpExpr[Var >= Const] RestrictInfo OpExpr [Var <= Const] so the pull_varnos() fails because it runs into RestrictInfo node while walking the tree. So I see three possibilities: 1) pull_varnos/pull_varattnos (and such functions, using the expression walker internally) are not supposed to be calledunless you are sure there are no RestrictInfo nodes in the tree, but this seems really awkward 2) expression walker is supposed to know about RestrictInfo nodes (as I did in my patch), but you say this is not the case 3) pull_varnos_walker / pull_varattnos_walker are supposed to handle RestrictInfo nodes explicitly (either by using thecached information or by using RestrictInfo->clause in the next step) regards -- Tomas Vondra http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes: > On 04/29/15 05:55, Tom Lane wrote: >> pull_varnos is not, and should not be, applied to a RestrictInfo; for one >> thing, it'd be redundant with work that was already done when creating the >> RestrictInfo (cf make_restrictinfo_internal). You've not shown the >> context of your problem very fully, but in general most code in the planner >> should be well aware of whether it is working with a RestrictInfo (or list >> of same) or a bare expression. I don't believe that it's a very good idea >> to obscure that distinction. > OK, let me explain the context a bit more. When working on the > multivariate statistics patch, I need to choose which stats to use for > estimating the clauses. I do that in clauselist_selectivity(), although > there might be other places where to do that. Hm, well, clauselist_selectivity and clause_selectivity already contain extensive special-casing for RestrictInfos, so I'm not sure that I see why you're having difficulty working within that structure for this change. But there are basic reasons why expression_tree_walker should not try to deal with RestrictInfos; the most obvious one being that it's not clear whether it should descend into both the basic and OR-clause subtrees. Also, if a node has expression_tree_walker support then it should logically have expression_tree_mutator support as well, but that's got multiple issues. RestrictInfos are not supposed to be copied once created; and the mutator couldn't detect whether their derived fields are still valid. regards, tom lane
On 04/29/15 18:26, Tom Lane wrote: > Tomas Vondra <tomas.vondra@2ndquadrant.com> writes: ... >> OK, let me explain the context a bit more. When working on the >> multivariate statistics patch, I need to choose which stats to use for >> estimating the clauses. I do that in clauselist_selectivity(), although >> there might be other places where to do that. > > Hm, well, clauselist_selectivity and clause_selectivity already contain > extensive special-casing for RestrictInfos, so I'm not sure that I see why > you're having difficulty working within that structure for this change. So let's I'm in the clause_selectivity(), and have AND or OR clause to estimate. I need to get varnos/varattnos for the clause(s). What should I do? I can't call pull_varnos() or pull_varattnos() because there might be nested RestrictInfos, as demonstrated by the query I posted. > But there are basic reasons why expression_tree_walker should not try > to deal with RestrictInfos; the most obvious one being that it's not > clear whether it should descend into both the basic and OR-clause > subtrees. Also, if a node has expression_tree_walker support then it > should logically have expression_tree_mutator support as well, but > that's got multiple issues. RestrictInfos are not supposed to be > copied once created; and the mutator couldn't detect whether their > derived fields are still valid. OK, I do understand that. So what about pull_varnos_walker and pull_varattnos_walker - what about teaching them about RestrictInfos? -- Tomas Vondra http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 04/29/15 18:33, Tomas Vondra wrote: > > OK, I do understand that. So what about pull_varnos_walker and > pull_varattnos_walker - what about teaching them about RestrictInfos? Attached is a patch fixing the issue by handling RestrictInfo in pull_varnos_walker and pull_varattnos_walker. -- Tomas Vondra http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
On Wed, Apr 29, 2015 at 12:33 PM, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote: > On 04/29/15 18:26, Tom Lane wrote: >> Tomas Vondra <tomas.vondra@2ndquadrant.com> writes: >>> OK, let me explain the context a bit more. When working on the >>> multivariate statistics patch, I need to choose which stats to use for >>> estimating the clauses. I do that in clauselist_selectivity(), although >>> there might be other places where to do that. >> >> Hm, well, clauselist_selectivity and clause_selectivity already contain >> extensive special-casing for RestrictInfos, so I'm not sure that I see why >> you're having difficulty working within that structure for this change. > > So let's I'm in the clause_selectivity(), and have AND or OR clause to > estimate. I need to get varnos/varattnos for the clause(s). What should I > do? I can't call pull_varnos() or pull_varattnos() because there might be > nested RestrictInfos, as demonstrated by the query I posted. > >> But there are basic reasons why expression_tree_walker should not try >> to deal with RestrictInfos; the most obvious one being that it's not >> clear whether it should descend into both the basic and OR-clause >> subtrees. Also, if a node has expression_tree_walker support then it >> should logically have expression_tree_mutator support as well, but >> that's got multiple issues. RestrictInfos are not supposed to be >> copied once created; and the mutator couldn't detect whether their >> derived fields are still valid. > > OK, I do understand that. So what about pull_varnos_walker and > pull_varattnos_walker - what about teaching them about RestrictInfos? Tom: This patch has become part 1 of many under the "multivariate statistics vNNN" family of threads, but I guess it would be helpful if you could opine on the reasonable-ness of this approach. I don't want to commit anything over your objections, but this kind of tailed off without a conclusion. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > On Wed, Apr 29, 2015 at 12:33 PM, Tomas Vondra > <tomas.vondra@2ndquadrant.com> wrote: >> On 04/29/15 18:26, Tom Lane wrote: >>> But there are basic reasons why expression_tree_walker should not try >>> to deal with RestrictInfos; the most obvious one being that it's not >>> clear whether it should descend into both the basic and OR-clause >>> subtrees. Also, if a node has expression_tree_walker support then it >>> should logically have expression_tree_mutator support as well, but >>> that's got multiple issues. RestrictInfos are not supposed to be >>> copied once created; and the mutator couldn't detect whether their >>> derived fields are still valid. >> OK, I do understand that. So what about pull_varnos_walker and >> pull_varattnos_walker - what about teaching them about RestrictInfos? > This patch has become part 1 of many under the "multivariate > statistics vNNN" family of threads, but I guess it would be helpful if > you could opine on the reasonable-ness of this approach. I don't want > to commit anything over your objections, but this kind of tailed off > without a conclusion. I'm up to my ears in psql at the moment, but hope to get to the multivariate stats patch later, maybe next week. In the meantime, I remain of the opinion that RestrictInfo is not an expression node and wanting expression_tree_walker to handle it is wrong. I'm suspicious about pull_varnos; it might be all right given that we can assume the same Vars are in both subtrees, but I still don't really see why that one function has suddenly grown this need when others have not. regards, tom lane
On Fri, Mar 18, 2016 at 3:29 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> On Wed, Apr 29, 2015 at 12:33 PM, Tomas Vondra >> <tomas.vondra@2ndquadrant.com> wrote: >>> On 04/29/15 18:26, Tom Lane wrote: >>>> But there are basic reasons why expression_tree_walker should not try >>>> to deal with RestrictInfos; the most obvious one being that it's not >>>> clear whether it should descend into both the basic and OR-clause >>>> subtrees. Also, if a node has expression_tree_walker support then it >>>> should logically have expression_tree_mutator support as well, but >>>> that's got multiple issues. RestrictInfos are not supposed to be >>>> copied once created; and the mutator couldn't detect whether their >>>> derived fields are still valid. > >>> OK, I do understand that. So what about pull_varnos_walker and >>> pull_varattnos_walker - what about teaching them about RestrictInfos? > >> This patch has become part 1 of many under the "multivariate >> statistics vNNN" family of threads, but I guess it would be helpful if >> you could opine on the reasonable-ness of this approach. I don't want >> to commit anything over your objections, but this kind of tailed off >> without a conclusion. > > I'm up to my ears in psql at the moment, but hope to get to the > multivariate stats patch later, maybe next week. Oh, cool. > In the meantime, > I remain of the opinion that RestrictInfo is not an expression node > and wanting expression_tree_walker to handle it is wrong. I'm > suspicious about pull_varnos; it might be all right given that we > can assume the same Vars are in both subtrees, but I still don't > really see why that one function has suddenly grown this need when > others have not. I haven't studied the patch series in enough detail to have an educated opinion on that. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 03/18/2016 08:53 PM, Robert Haas wrote: > On Fri, Mar 18, 2016 at 3:29 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Robert Haas <robertmhaas@gmail.com> writes: >>> On Wed, Apr 29, 2015 at 12:33 PM, Tomas Vondra >>> <tomas.vondra@2ndquadrant.com> wrote: >>>> On 04/29/15 18:26, Tom Lane wrote: >>>>> But there are basic reasons why expression_tree_walker should not try >>>>> to deal with RestrictInfos; the most obvious one being that it's not >>>>> clear whether it should descend into both the basic and OR-clause >>>>> subtrees. Also, if a node has expression_tree_walker support then it >>>>> should logically have expression_tree_mutator support as well, but >>>>> that's got multiple issues. RestrictInfos are not supposed to be >>>>> copied once created; and the mutator couldn't detect whether their >>>>> derived fields are still valid. >> >>>> OK, I do understand that. So what about pull_varnos_walker and >>>> pull_varattnos_walker - what about teaching them about RestrictInfos? >> >>> This patch has become part 1 of many under the "multivariate >>> statistics vNNN" family of threads, but I guess it would be helpful if >>> you could opine on the reasonable-ness of this approach. I don't want >>> to commit anything over your objections, but this kind of tailed off >>> without a conclusion. >> >> I'm up to my ears in psql at the moment, but hope to get to the >> multivariate stats patch later, maybe next week. > > Oh, cool. > >> In the meantime, >> I remain of the opinion that RestrictInfo is not an expression node >> and wanting expression_tree_walker to handle it is wrong. I'm >> suspicious about pull_varnos; it might be all right given that we >> can assume the same Vars are in both subtrees, but I still don't >> really see why that one function has suddenly grown this need when >> others have not. > > I haven't studied the patch series in enough detail to have an > educated opinion on that. FWIW I'm not convinced this part of the patch (matching the clauses to the available stats, including the pull_varnos changes) is perfectly correct either, and it's quite possible it will need reworking. But it needs a pair of fresh eyes, I guess. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services