Re: Pulling up direct-correlated ANY_SUBLINK - Mailing list pgsql-hackers

From Andy Fan
Subject Re: Pulling up direct-correlated ANY_SUBLINK
Date
Msg-id CAKU4AWpi9oztiomUQt4JCxXEr6EaQ2thY-7JYDm6c9he0A7oCA@mail.gmail.com
Whole thread Raw
In response to Re: Pulling up direct-correlated ANY_SUBLINK  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Hi All:

On Tue, Sep 10, 2019 at 9:49 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Richard Guo <riguo@pivotal.io> writes:
> Currently we do not try to pull up sub-select of type ANY_SUBLINK if it
> refers to any Vars of the parent query, as indicated in the code snippet
> below:
>     if (contain_vars_of_level((Node *) subselect, 1))
>         return NULL;
> Why do we have this check?

Because the result would not be a join between two independent tables.

I think this situation is caused by we pull-up the ANY-sublink with 2 
steps, the first step is to pull up the sublink as a subquery,  and the
next step is to pull up the subquery if it is allowed.  The benefits of
this method are obvious,  pulling up the subquery has more requirements,
even if we can just finish the first step, we still get huge benefits.  
However the bad stuff happens if varlevelsup = 1 involves,  step 1 fails! 

The solution here is to use the lateral join to overcome the two
independent tables, the issue of this solution includes:

1.  LATERAL pretty much constrains things to use a nestloop like below, 
but this reason is questioned since if we can pull-up the subquery,  if so the
constraint gone. [1]
2.  It has something with unique-ify the inner path. [2] , but Richard thought 
it should be fixed but without an agreement for all people [3].
3.  Richard [4] found it would fail to get a plan for some query. (the error is
below per my testing)

> ERROR:  failed to build any 3-way joins

So back to the root cause of this issue,  IIUC,  if varlevelsup = 1 involves,
can we just bypass the 2-steps method,  just as what we do for EXISTS
sublinks?  If so, we just need to convert the ANY-SUBLINK to EXIST-SUBLINK
under the case. 

The attached is the one commit which includes the 2 methods discussed
here, controlled by different GUC separately, for easy testing.  Per my test, 
Query 2 choosed the Unique Join with the IN-to-EXISTS method, but not
with the Lateral method, and query 3 raises error with the lateral method,
but not with the IN-to-EXISTS method. 




> Can we try to pull up direct-correlated ANY SubLink with the help of
> LATERAL?

Perhaps.  But what's the argument that you'd end up with a better
plan?  LATERAL pretty much constrains things to use a nestloop,
so I'm not sure there's anything fundamentally different.

                        regards, tom lane


--
Best Regards
Andy Fan
Attachment

pgsql-hackers by date:

Previous
From: Thomas Munro
Date:
Subject: Re: 16: Collation versioning and dependency helpers
Next
From: Andy Fan
Date:
Subject: Re: How to started with Contributions