Failure to reordering in case of a lateral join in combination with aleft join (not inner join) resulting in suboptimal nested loop plan - Mailing list pgsql-performance

From Peter Billen
Subject Failure to reordering in case of a lateral join in combination with aleft join (not inner join) resulting in suboptimal nested loop plan
Date
Msg-id CAMTXbE8yTu9Yo=KFcYt2VqAQF_VBEZj4EXuM3JVut0rdFq5SxQ@mail.gmail.com
Whole thread Raw
Responses Re: Failure to reordering in case of a lateral join in combination with a left join (not inner join) resulting in suboptimal nested loop plan
List pgsql-performance
Hi,

The queries in what follows can be executed on the following fiddle: https://dbfiddle.uk/?rdbms=postgres_10&fiddle=64542f2d987d3ce0d85bbc40ddadf7d6 - Please note that the queries/functions might look silly/pointless, I extracted the performance issue I am seeing to a minimal reproducible example.

I have the following schema:

create table parent
(
    id           int    primary key
);
create table child
(
    id           int    primary key,
    parent_id    int    references parent(id)
);
create index on child(parent_id);

Let's start with the following inlinable setof-returning function which basically returns the child rows for a given parent identifier:

create function child_get_1(int) returns table(a int) as
$$
    select    child.id
    from      child
    left join parent
    on        parent.id = child.parent_id
    where     child.parent_id = $1;
$$
language sql stable;

Note that the left join branch is intentionally not used, and thus could be eliminated by the planner.

When executing the following query, I get a satisfying hash (left) join (and the left join to parent is indeed eliminated):

explain analyze
select ch.* from parent p, child_get_1(p.id) ch;

+--------------------------------------------------------------------------------------------------------------------+
| Hash Join  (cost=3.25..17163.23 rows=999900 width=4) (actual time=0.025..194.279 rows=999900 loops=1)              |
|   Hash Cond: (child.parent_id = p.id)                                                                              |
|   ->  Seq Scan on child  (cost=0.00..14424.00 rows=999900 width=8) (actual time=0.005..47.215 rows=999900 loops=1) |
|   ->  Hash  (cost=2.00..2.00 rows=100 width=4) (actual time=0.016..0.016 rows=100 loops=1)                         |
|         Buckets: 1024  Batches: 1  Memory Usage: 12kB                                                              |
|         ->  Seq Scan on parent p  (cost=0.00..2.00 rows=100 width=4) (actual time=0.001..0.007 rows=100 loops=1)   |
+--------------------------------------------------------------------------------------------------------------------+


Now, I introduce a convenience function, also inlinable, which fetches the child rows by its parent id:

create function t(int) returns setof child as
$$
    select child.* from child where child.parent_id = $1;
$$
language sql stable;

I refactor `child_get_1(int)` from above as following:

create function child_get_2(int) returns table(a int) as
$$
    select    child.id
    from      t($1) child
    left join parent
    on        parent.id = child.parent_id;
$$
language sql stable;

explain analyze
select ch.* from parent p, child_get_2(p.id) ch;

Now, I get a nested loop, which as expected performs quite badly:

+--------------------------------------------------------------------------------------------------------------------------------------------+
| Nested Loop  (cost=189.92..493990.48 rows=999900 width=4) (actual time=1.519..713.680 rows=999900 loops=1)                                 |
|   ->  Seq Scan on parent p  (cost=0.00..2.00 rows=100 width=4) (actual time=0.004..0.081 rows=100 loops=1)                                 |
|   ->  Bitmap Heap Scan on child  (cost=189.92..4739.90 rows=9999 width=4) (actual time=1.365..6.332 rows=9999 loops=100)                   |
|         Recheck Cond: (parent_id = p.id)                                                                                                   |
|         Heap Blocks: exact=442476                                                                                                          |
|         ->  Bitmap Index Scan on child_parent_id_idx  (cost=0.00..187.42 rows=9999 width=0) (actual time=0.838..0.838 rows=9999 loops=100) |
|               Index Cond: (parent_id = p.id)                                                                                               |
+--------------------------------------------------------------------------------------------------------------------------------------------+

For some reason I cannot explain we now end up with a nested loop, instead an hash join. The fairly trivial introduction of `t(int)` messes up with reordering, but I fail to see why. I manually upped the from and join collapse limit to 32 - just to be sure -, but no effect. Also, the left join branch could not be eliminated. I believe this is related to the usage of the implicit lateral join to `child_get_2(p.id)` in the main query, which somehow messes up with the reordering of `from t($1) as child` in `child_get_2(int)`, though I am not 100% sure.

Also, note that when we apply an inner join instead of a left join, the problem goes away. The planner now manages to end up with a hash join in both cases.

I am seeing this on v10 and v11.

Any ideas?

Thank you. Best regards.

pgsql-performance by date:

Previous
From: "Naik, Sameer"
Date:
Subject: RE: Re: Generic Plans for Prepared Statement are 158155 times slower thanCustom Plans
Next
From: Tom Lane
Date:
Subject: Re: Failure to reordering in case of a lateral join in combination with a left join (not inner join) resulting in suboptimal nested loop plan