Some problems regarding the self-join elimination code - Mailing list pgsql-hackers

From Richard Guo
Subject Some problems regarding the self-join elimination code
Date
Msg-id CAMbWs49PE3CvnV8vrQ0Dr=HqgZZmX0tdNbzVNJxqc8yg-8kDQQ@mail.gmail.com
Whole thread Raw
Responses Re: Some problems regarding the self-join elimination code
List pgsql-hackers
While working on another patch, I looked at ChangeVarNodes() and found
some issues introduced by the self-join elimination commit.  I think
it'd better to fix or improve them.

* ChangeVarNodes_walker includes special handling for RestrictInfo
nodes.  And it adjusts RestrictInfo.orclause only if the rt_index of
the relation to be removed is contained in RestrictInfo.clause_relids.

I don't think this is correct.  Even if the to-be-removed relation is
not present in clause_relids, it can still be found somewhere within
the orclause.  As an example, consider:

create table t (a int primary key, b int, c int);

explain (costs off)
select * from t t1
   inner join t t2 on t1.a = t2.a
    left join t t3 on t1.c = 1 and (t2.b = t3.b or t2.b = 1);
server closed the connection unexpectedly

This query triggers an Assert failure in match_orclause_to_indexcol,
and the root cause is that the removed relation is still present
in the outer_relids of the OR-clauses.

To fix, I think we may need to run ChangeVarNodes_walker on the
orclause in all cases.


* After the adjustment, the num_base_rels is recalculated as:

+      rinfo->num_base_rels = bms_num_members(rinfo->clause_relids);

I don't think this is correct either.  num_base_rels is the number of
base rels in clause_relids and should not count outer-join relids.
And we have no guarantee that clause_relids does not include any
outer-join relids.

To fix, I think we should exclude all outer-join relids.  Perhaps we
can leverage root->outer_join_rels to achieve this.


*  I'm not sure why the self-join elimination commit removed all the
comments in ChangeVarNodes(Extended).  I think these comments are very
helpful for understanding the code.

-       /*
-        * Must be prepared to start with a Query or a bare expression tree; if
-        * it's a Query, go straight to query_tree_walker to make sure that
-        * sublevels_up doesn't get incremented prematurely.
-        */
        if (node && IsA(node, Query))
        {
                Query      *qry = (Query *) node;

-               /*
-                * If we are starting at a Query, and sublevels_up is
zero, then we
-                * must also fix rangetable indexes in the Query
itself --- namely
-                * resultRelation, mergeTargetRelation, exclRelIndex
and rowMarks
-                * entries.  sublevels_up cannot be zero when recursing into a
-                * subquery, so there's no need to have the same logic inside
-                * ChangeVarNodes_walker.
-                */
                if (sublevels_up == 0)
                {
                        ListCell   *l;
@@ -701,7 +772,6 @@ ChangeVarNodes(Node *node, int rt_index, int
new_index, int sublevels_up)
                        if (qry->mergeTargetRelation == rt_index)
                                qry->mergeTargetRelation = new_index;

-                       /* this is unlikely to ever be used, but ... */


* Speaking of comments, I’ve noticed that some changes are lacking
comments.  For example, there are almost no comments regarding the
special handling of RestrictInfo nodes in ChangeVarNodes_walker,
except for an explanation about adding the NullTest.

BTW, there's a typo in that explanation: qual() should be equal().

This is not a thorough review of the code; I just randomly looked at
ChangeVarNodes().  It's possible that there are other issues that
haven't been discovered, but I think we should at least fix these.
I'll provide a patch for the fixes later.  Apologies for being so
nitpicky.

Thanks
Richard



pgsql-hackers by date:

Previous
From: Fujii Masao
Date:
Subject: Re: Reducing the log spam
Next
From: "Zhijie Hou (Fujitsu)"
Date:
Subject: RE: Fix slot synchronization with two_phase decoding enabled