Thread: Unnecessary lateral dependencies implied by PHVs

Unnecessary lateral dependencies implied by PHVs

From
Richard Guo
Date:
Hi hackers,

As we know when we pull up a simple subquery, if the subquery is within
the nullable side of an outer join, lateral references to non-nullable
items may have to be turned into PlaceHolderVars. I happened to wonder
what should we do about the PHVs if the outer join is reduced to inner
join afterwards. Should we unwrap the related PHVs? I'm asking because
PHVs may imply lateral dependencies which may make us have to use
nestloop join. As an example, consider

explain (costs off)
select * from a left join lateral (select a.i as ai, b.i as bi from b) ss on true where ss.bi = ss.ai;
        QUERY PLAN
---------------------------
 Nested Loop
   ->  Seq Scan on a
   ->  Seq Scan on b
         Filter: (i = a.i)
(4 rows)

Although the JOIN_LEFT has been reduced to JOIN_INNER, the lateral
reference implied by the PHV makes us have no choice but the nestloop
with parameterized inner path. Considering there is no index on b, this
plan is very inefficient.

Is there anything we can do to improve this situation?

Thanks
Richard

Re: Unnecessary lateral dependencies implied by PHVs

From
Richard Guo
Date:

On Mon, Oct 10, 2022 at 10:35 AM Richard Guo <guofenglinux@gmail.com> wrote:
As we know when we pull up a simple subquery, if the subquery is within
the nullable side of an outer join, lateral references to non-nullable
items may have to be turned into PlaceHolderVars. I happened to wonder
what should we do about the PHVs if the outer join is reduced to inner
join afterwards. Should we unwrap the related PHVs? I'm asking because
PHVs may imply lateral dependencies which may make us have to use
nestloop join. 
 
At first I considered about unwrapping the related PHVs after we've
successfully reduced outer joins to inner joins. But that requires a lot
of coding which seems not worth the trouble.

I think maybe the problem here is about the order we pull up subqueries
and we reduce outer joins. But simply flipping the order for them two is
definitely incorrect. I'm not sure how to make it right.

Any thoughts?

Thanks
Richard

Re: Unnecessary lateral dependencies implied by PHVs

From
Andy Fan
Date:
Hi Richard:

On Mon, Oct 10, 2022 at 10:35 AM Richard Guo <guofenglinux@gmail.com> wrote:
... I'm asking because
PHVs may imply lateral dependencies which may make us have to use
nestloop join.

I thought lateral join imply nestloop join,  am I missing something?  Here is my simple
testing. 

postgres=# explain (costs off) select * from r1 join lateral (select r1.a from r2) on true;
         QUERY PLAN
----------------------------
 Nested Loop
   ->  Seq Scan on r1
   ->  Materialize
         ->  Seq Scan on r2
(4 rows)

Time: 0.349 ms
postgres=# set enable_nestloop to off;
SET
Time: 0.123 ms

postgres=# explain (costs off) select * from r1 join lateral (select r1.a from r2) on true;
         QUERY PLAN
----------------------------
 Nested Loop
   ->  Seq Scan on r1
   ->  Materialize
         ->  Seq Scan on r2
(4 rows)

--
Best Regards
Andy Fan

Re: Unnecessary lateral dependencies implied by PHVs

From
Richard Guo
Date:

On Tue, Oct 18, 2022 at 9:15 AM Andy Fan <zhihui.fan1213@gmail.com> wrote:
On Mon, Oct 10, 2022 at 10:35 AM Richard Guo <guofenglinux@gmail.com> wrote:
... I'm asking because
PHVs may imply lateral dependencies which may make us have to use
nestloop join.

I thought lateral join imply nestloop join,  am I missing something?  Here is my simple
testing. 
 
I think it's true most of the time.  And that's why we should try to
avoid unnecessary lateral dependencies, as discussed in this thread.

ISTM in your example nestloop is chosen because there is no available
mergejoinable/hashjoinable clause. In your query the lateral subquery
would be pulled up into the parent query and there would be no lateral
dependencies afterwards.

Thanks
Richard