Pulling up direct-correlated ANY_SUBLINK - Mailing list pgsql-hackers

From Richard Guo
Subject Pulling up direct-correlated ANY_SUBLINK
Date
Msg-id CAN_9JTx7N+CxEQLnu_uHxx+EscSgxLLuNgaZT6Sjvdpt7toy3w@mail.gmail.com
Whole thread Raw
Responses Re: Pulling up direct-correlated ANY_SUBLINK
Re: Pulling up direct-correlated ANY_SUBLINK
List pgsql-hackers
Hi,

Currently we do not try to pull up sub-select of type ANY_SUBLINK if it
refers to any Vars of the parent query, as indicated in the code snippet
below:

JoinExpr *
convert_ANY_sublink_to_join(PlannerInfo *root, SubLink *sublink,
                            Relids available_rels)
{
    ...

    if (contain_vars_of_level((Node *) subselect, 1))
        return NULL;


Why do we have this check?

Can we try to pull up direct-correlated ANY SubLink with the help of
LATERAL? That is, do the pull up in the same way as uncorrelated ANY
SubLink, by adding the SubLink's subselect to the query's rangetable,
but explicitly set LATERAL for its RangeTblEntry, like:

--- a/src/backend/optimizer/plan/subselect.c
+++ b/src/backend/optimizer/plan/subselect.c
@@ -1226,13 +1226,6 @@ convert_ANY_sublink_to_join(PlannerInfo *root, SubLink *sublink,
        Assert(sublink->subLinkType == ANY_SUBLINK);

        /*
-        * The sub-select must not refer to any Vars of the parent query. (Vars of
-        * higher levels should be okay, though.)
-        */
-       if (contain_vars_of_level((Node *) subselect, 1))
-               return NULL;
-
-       /*
         * The test expression must contain some Vars of the parent query, else
         * it's not gonna be a join.  (Note that it won't have Vars referring to
         * the subquery, rather Params.)
@@ -1267,7 +1260,7 @@ convert_ANY_sublink_to_join(PlannerInfo *root, SubLink *sublink,
        rte = addRangeTableEntryForSubquery(pstate,
                                            subselect,
                                            makeAlias("ANY_subquery", NIL),
-                                           false,
+                                          contain_vars_of_level((Node *) subselect, 1), /* lateral */
                                            false);
        parse->rtable = lappend(parse->rtable, rte);
        rtindex = list_length(parse->rtable);


By this way, we can convert the query:

select * from a where a.i = ANY(select i from b where a.j > b.j);

To:

select * from a SEMI JOIN lateral(select * from b where a.j > b.j) sub on a.i = sub.i;


Does this make sense?

Thanks
Richard

pgsql-hackers by date:

Previous
From: "Tsunakawa, Takayuki"
Date:
Subject: RE: [bug fix] Produce a crash dump before main() on Windows
Next
From: Alexander Korotkov
Date:
Subject: Re: WIP: BRIN multi-range indexes