Thread: FIX : teach expression walker about RestrictInfo

FIX : teach expression walker about RestrictInfo

From
Tomas Vondra
Date:
Hi there,

the attached trivial patch adds handling of RestrictInfo nodes into
expression_tree_walker(). This is needed for example when calling
pull_varnos or (or other functions using the expression walker) in
clausesel.c, for example. An example of a query causing errors with
pull_varnos is

select * from t where (a >= 10 and a <= 20) or (b >= 15 and b <= 20);

which gets translated into an expression tree like this:

     BoolExpr [OR_EXPR]
         BoolExpr [AND_EXPR]
             RestrictInfo
                 OpExpr [Var >= Const]
             RestrictInfo
                 OpExpr [Var <= Const]
         BoolExpr [AND_EXPR]
             RestrictInfo
                 OpExpr [Var >= Const]
             RestrictInfo
                 OpExpr [Var <= Const]

and the nested RestrictInfo nodes make the walker fail because of
unrecognized node.

It's possible that expression walker is not supposed to know about
RestrictInfo, but I don't really see why would that be the case.

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

Re: FIX : teach expression walker about RestrictInfo

From
Tom Lane
Date:
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
> the attached trivial patch adds handling of RestrictInfo nodes into 
> expression_tree_walker().

RestrictInfo is not a general expression node and support for it has
been deliberately omitted from expression_tree_walker().  So I think
what you are proposing is a bad idea and probably a band-aid for some
other bad idea.

> This is needed for example when calling 
> pull_varnos or (or other functions using the expression walker) in 
> clausesel.c, for example. An example of a query causing errors with 
> pull_varnos is

> select * from t where (a >= 10 and a <= 20) or (b >= 15 and b <= 20);

Really?

regression=# create table t (a int, b int);
CREATE TABLE
regression=# select * from t where (a >= 10 and a <= 20) or (b >= 15 and b <= 20);a | b 
---+---
(0 rows)
        regards, tom lane



Re: FIX : teach expression walker about RestrictInfo

From
Tomas Vondra
Date:
Hi,

On 04/28/15 21:50, Tom Lane wrote:
> Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
>> the attached trivial patch adds handling of RestrictInfo nodes into
>> expression_tree_walker().
>
> RestrictInfo is not a general expression node and support for it has
> been deliberately omitted from expression_tree_walker().  So I think
> what you are proposing is a bad idea and probably a band-aid for some
> other bad idea.
>
>> This is needed for example when calling
>> pull_varnos or (or other functions using the expression walker) in
>> clausesel.c, for example. An example of a query causing errors with
>> pull_varnos is
>
>> select * from t where (a >= 10 and a <= 20) or (b >= 15 and b <= 20);
>
> Really?
>
> regression=# create table t (a int, b int);
> CREATE TABLE
> regression=# select * from t where (a >= 10 and a <= 20) or (b >= 15 and b <= 20);
>   a | b
> ---+---
> (0 rows)

That's not what I said, though. I said that calling pull_varnos() causes 
the issue - apparently that does not happen in master, but I ran into 
that when hacking on a patch.

For example try adding this
    Relids tmp = pull_varnos(clause);    elog(WARNING, "count = %d", bms_num_members(tmp));

into the or_clause branch in clause_selectivity(), and then running the 
query will give you this:

db=# select * from t where (a >= 10 and a <= 20) or (b >= 15);
ERROR:  unrecognized node type: 524

But as I said - maybe calls to pull_varnos are not supposed to happen in 
this part of the code, for some reason, and it really is a bug in my patch.

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: FIX : teach expression walker about RestrictInfo

From
Tom Lane
Date:
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
> On 04/28/15 21:50, Tom Lane wrote:
>> RestrictInfo is not a general expression node and support for it has
>> been deliberately omitted from expression_tree_walker().  So I think
>> what you are proposing is a bad idea and probably a band-aid for some
>> other bad idea.

> That's not what I said, though. I said that calling pull_varnos() causes 
> the issue - apparently that does not happen in master, but I ran into 
> that when hacking on a patch.

pull_varnos is not, and should not be, applied to a RestrictInfo; for one
thing, it'd be redundant with work that was already done when creating the
RestrictInfo (cf make_restrictinfo_internal).  You've not shown the
context of your problem very fully, but in general most code in the planner
should be well aware of whether it is working with a RestrictInfo (or list
of same) or a bare expression.  I don't believe that it's a very good idea
to obscure that distinction.
        regards, tom lane



Re: FIX : teach expression walker about RestrictInfo

From
Tomas Vondra
Date:
Hi,

On 04/29/15 05:55, Tom Lane wrote:
> Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
>> On 04/28/15 21:50, Tom Lane wrote:
>>> RestrictInfo is not a general expression node and support for it has
>>> been deliberately omitted from expression_tree_walker().  So I think
>>> what you are proposing is a bad idea and probably a band-aid for some
>>> other bad idea.
>
>> That's not what I said, though. I said that calling pull_varnos() causes
>> the issue - apparently that does not happen in master, but I ran into
>> that when hacking on a patch.
>
> pull_varnos is not, and should not be, applied to a RestrictInfo; for one
> thing, it'd be redundant with work that was already done when creating the
> RestrictInfo (cf make_restrictinfo_internal).  You've not shown the
> context of your problem very fully, but in general most code in the planner
> should be well aware of whether it is working with a RestrictInfo (or list
> of same) or a bare expression.  I don't believe that it's a very good idea
> to obscure that distinction.

OK, let me explain the context a bit more. When working on the 
multivariate statistics patch, I need to choose which stats to use for 
estimating the clauses. I do that in clauselist_selectivity(), although 
there might be other places where to do that.

Selecting the stats that best match the clauses is based on how well 
they cover the vars in the clauses (and some other criteria, that is 
mostly irrelevant here). So at some point I do call functions like 
pull_varnos() and pull_varattnos() to get this information.

I do handle RestrictInfo nodes explicitly - for those nodes I can do get 
the info from the node itself. But this does not work for the condition 
I posted, because that contains nested RestrictInfo nodes. So I do call 
pull_varnos() on a BoolExpr node, but in reality the node tree 
representing the clauses
    WHERE (a >= 10 AND a <= 20) OR (b >= 20 AND b <= 40)

looks like this:
    BoolExpr [OR_EXPR]        BoolExpr [AND_EXPR]            RestrictInfo                OpExpr [Var >= Const]
 RestrictInfo                OpExpr [Var <= Const]        BoolExpr [AND_EXPR]            RestrictInfo
OpExpr[Var >= Const]            RestrictInfo                OpExpr [Var <= Const]
 

so the pull_varnos() fails because it runs into RestrictInfo node while 
walking the tree.

So I see three possibilities:

1) pull_varnos/pull_varattnos (and such functions, using the   expression walker internally) are not supposed to be
calledunless   you are sure there are no RestrictInfo nodes in the tree, but this   seems really awkward
 

2) expression walker is supposed to know about RestrictInfo nodes (as I   did in my patch), but you say this is not the
case

3) pull_varnos_walker / pull_varattnos_walker are supposed to handle   RestrictInfo nodes explicitly (either by using
thecached information   or by using RestrictInfo->clause in the next step)
 


regards

--
Tomas Vondra                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: FIX : teach expression walker about RestrictInfo

From
Tom Lane
Date:
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
> On 04/29/15 05:55, Tom Lane wrote:
>> pull_varnos is not, and should not be, applied to a RestrictInfo; for one
>> thing, it'd be redundant with work that was already done when creating the
>> RestrictInfo (cf make_restrictinfo_internal).  You've not shown the
>> context of your problem very fully, but in general most code in the planner
>> should be well aware of whether it is working with a RestrictInfo (or list
>> of same) or a bare expression.  I don't believe that it's a very good idea
>> to obscure that distinction.

> OK, let me explain the context a bit more. When working on the 
> multivariate statistics patch, I need to choose which stats to use for 
> estimating the clauses. I do that in clauselist_selectivity(), although 
> there might be other places where to do that.

Hm, well, clauselist_selectivity and clause_selectivity already contain
extensive special-casing for RestrictInfos, so I'm not sure that I see why
you're having difficulty working within that structure for this change.

But there are basic reasons why expression_tree_walker should not try to
deal with RestrictInfos; the most obvious one being that it's not clear
whether it should descend into both the basic and OR-clause subtrees.
Also, if a node has expression_tree_walker support then it should
logically have expression_tree_mutator support as well, but that's
got multiple issues.  RestrictInfos are not supposed to be copied
once created; and the mutator couldn't detect whether their derived
fields are still valid.
        regards, tom lane



Re: FIX : teach expression walker about RestrictInfo

From
Tomas Vondra
Date:
On 04/29/15 18:26, Tom Lane wrote:
> Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
...
>> OK, let me explain the context a bit more. When working on the
>> multivariate statistics patch, I need to choose which stats to use for
>> estimating the clauses. I do that in clauselist_selectivity(), although
>> there might be other places where to do that.
>
> Hm, well, clauselist_selectivity and clause_selectivity already contain
> extensive special-casing for RestrictInfos, so I'm not sure that I see why
> you're having difficulty working within that structure for this change.

So let's I'm in the clause_selectivity(), and have AND or OR clause to 
estimate. I need to get varnos/varattnos for the clause(s). What should 
I do? I can't call pull_varnos() or pull_varattnos() because there might 
be nested RestrictInfos, as demonstrated by the query I posted.

> But there are basic reasons why expression_tree_walker should not try
> to deal with RestrictInfos; the most obvious one being that it's not
> clear whether it should descend into both the basic and OR-clause
> subtrees. Also, if a node has expression_tree_walker support then it
> should logically have expression_tree_mutator support as well, but
> that's got multiple issues. RestrictInfos are not supposed to be
> copied once created; and the mutator couldn't detect whether their
> derived fields are still valid.

OK, I do understand that. So what about pull_varnos_walker and 
pull_varattnos_walker - what about teaching them about RestrictInfos?



--
Tomas Vondra                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: FIX : teach expression walker about RestrictInfo

From
Tomas Vondra
Date:

On 04/29/15 18:33, Tomas Vondra wrote:
>
> OK, I do understand that. So what about pull_varnos_walker and
> pull_varattnos_walker - what about teaching them about RestrictInfos?

Attached is a patch fixing the issue by handling RestrictInfo in
pull_varnos_walker and pull_varattnos_walker.

--
Tomas Vondra                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

Re: FIX : teach expression walker about RestrictInfo

From
Robert Haas
Date:
On Wed, Apr 29, 2015 at 12:33 PM, Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
> On 04/29/15 18:26, Tom Lane wrote:
>> Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
>>> OK, let me explain the context a bit more. When working on the
>>> multivariate statistics patch, I need to choose which stats to use for
>>> estimating the clauses. I do that in clauselist_selectivity(), although
>>> there might be other places where to do that.
>>
>> Hm, well, clauselist_selectivity and clause_selectivity already contain
>> extensive special-casing for RestrictInfos, so I'm not sure that I see why
>> you're having difficulty working within that structure for this change.
>
> So let's I'm in the clause_selectivity(), and have AND or OR clause to
> estimate. I need to get varnos/varattnos for the clause(s). What should I
> do? I can't call pull_varnos() or pull_varattnos() because there might be
> nested RestrictInfos, as demonstrated by the query I posted.
>
>> But there are basic reasons why expression_tree_walker should not try
>> to deal with RestrictInfos; the most obvious one being that it's not
>> clear whether it should descend into both the basic and OR-clause
>> subtrees. Also, if a node has expression_tree_walker support then it
>> should logically have expression_tree_mutator support as well, but
>> that's got multiple issues. RestrictInfos are not supposed to be
>> copied once created; and the mutator couldn't detect whether their
>> derived fields are still valid.
>
> OK, I do understand that. So what about pull_varnos_walker and
> pull_varattnos_walker - what about teaching them about RestrictInfos?

Tom:

This patch has become part 1 of many under the "multivariate
statistics vNNN" family of threads, but I guess it would be helpful if
you could opine on the reasonable-ness of this approach.  I don't want
to commit anything over your objections, but this kind of tailed off
without a conclusion.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: FIX : teach expression walker about RestrictInfo

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> On Wed, Apr 29, 2015 at 12:33 PM, Tomas Vondra
> <tomas.vondra@2ndquadrant.com> wrote:
>> On 04/29/15 18:26, Tom Lane wrote:
>>> But there are basic reasons why expression_tree_walker should not try
>>> to deal with RestrictInfos; the most obvious one being that it's not
>>> clear whether it should descend into both the basic and OR-clause
>>> subtrees. Also, if a node has expression_tree_walker support then it
>>> should logically have expression_tree_mutator support as well, but
>>> that's got multiple issues. RestrictInfos are not supposed to be
>>> copied once created; and the mutator couldn't detect whether their
>>> derived fields are still valid.

>> OK, I do understand that. So what about pull_varnos_walker and
>> pull_varattnos_walker - what about teaching them about RestrictInfos?

> This patch has become part 1 of many under the "multivariate
> statistics vNNN" family of threads, but I guess it would be helpful if
> you could opine on the reasonable-ness of this approach.  I don't want
> to commit anything over your objections, but this kind of tailed off
> without a conclusion.

I'm up to my ears in psql at the moment, but hope to get to the
multivariate stats patch later, maybe next week.  In the meantime,
I remain of the opinion that RestrictInfo is not an expression node
and wanting expression_tree_walker to handle it is wrong.  I'm
suspicious about pull_varnos; it might be all right given that we
can assume the same Vars are in both subtrees, but I still don't
really see why that one function has suddenly grown this need when
others have not.
        regards, tom lane



Re: FIX : teach expression walker about RestrictInfo

From
Robert Haas
Date:
On Fri, Mar 18, 2016 at 3:29 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Wed, Apr 29, 2015 at 12:33 PM, Tomas Vondra
>> <tomas.vondra@2ndquadrant.com> wrote:
>>> On 04/29/15 18:26, Tom Lane wrote:
>>>> But there are basic reasons why expression_tree_walker should not try
>>>> to deal with RestrictInfos; the most obvious one being that it's not
>>>> clear whether it should descend into both the basic and OR-clause
>>>> subtrees. Also, if a node has expression_tree_walker support then it
>>>> should logically have expression_tree_mutator support as well, but
>>>> that's got multiple issues. RestrictInfos are not supposed to be
>>>> copied once created; and the mutator couldn't detect whether their
>>>> derived fields are still valid.
>
>>> OK, I do understand that. So what about pull_varnos_walker and
>>> pull_varattnos_walker - what about teaching them about RestrictInfos?
>
>> This patch has become part 1 of many under the "multivariate
>> statistics vNNN" family of threads, but I guess it would be helpful if
>> you could opine on the reasonable-ness of this approach.  I don't want
>> to commit anything over your objections, but this kind of tailed off
>> without a conclusion.
>
> I'm up to my ears in psql at the moment, but hope to get to the
> multivariate stats patch later, maybe next week.

Oh, cool.

> In the meantime,
> I remain of the opinion that RestrictInfo is not an expression node
> and wanting expression_tree_walker to handle it is wrong.  I'm
> suspicious about pull_varnos; it might be all right given that we
> can assume the same Vars are in both subtrees, but I still don't
> really see why that one function has suddenly grown this need when
> others have not.

I haven't studied the patch series in enough detail to have an
educated opinion on that.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: FIX : teach expression walker about RestrictInfo

From
Tomas Vondra
Date:
On 03/18/2016 08:53 PM, Robert Haas wrote:
> On Fri, Mar 18, 2016 at 3:29 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Robert Haas <robertmhaas@gmail.com> writes:
>>> On Wed, Apr 29, 2015 at 12:33 PM, Tomas Vondra
>>> <tomas.vondra@2ndquadrant.com> wrote:
>>>> On 04/29/15 18:26, Tom Lane wrote:
>>>>> But there are basic reasons why expression_tree_walker should not try
>>>>> to deal with RestrictInfos; the most obvious one being that it's not
>>>>> clear whether it should descend into both the basic and OR-clause
>>>>> subtrees. Also, if a node has expression_tree_walker support then it
>>>>> should logically have expression_tree_mutator support as well, but
>>>>> that's got multiple issues. RestrictInfos are not supposed to be
>>>>> copied once created; and the mutator couldn't detect whether their
>>>>> derived fields are still valid.
>>
>>>> OK, I do understand that. So what about pull_varnos_walker and
>>>> pull_varattnos_walker - what about teaching them about RestrictInfos?
>>
>>> This patch has become part 1 of many under the "multivariate
>>> statistics vNNN" family of threads, but I guess it would be helpful if
>>> you could opine on the reasonable-ness of this approach.  I don't want
>>> to commit anything over your objections, but this kind of tailed off
>>> without a conclusion.
>>
>> I'm up to my ears in psql at the moment, but hope to get to the
>> multivariate stats patch later, maybe next week.
>
> Oh, cool.
>
>> In the meantime,
>> I remain of the opinion that RestrictInfo is not an expression node
>> and wanting expression_tree_walker to handle it is wrong.  I'm
>> suspicious about pull_varnos; it might be all right given that we
>> can assume the same Vars are in both subtrees, but I still don't
>> really see why that one function has suddenly grown this need when
>> others have not.
>
> I haven't studied the patch series in enough detail to have an
> educated opinion on that.

FWIW I'm not convinced this part of the patch (matching the clauses to 
the available stats, including the pull_varnos changes) is perfectly 
correct either, and it's quite possible it will need reworking. But it 
needs a pair of fresh eyes, I guess.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services