Home > mailing lists

Re: planner weirdness: a join uses nestloop with checking condition when there are two subplan-or-hashed subqueries - Mailing list pgsql-bugs

From	Tom Lane
Subject	Re: planner weirdness: a join uses nestloop with checking condition when there are two subplan-or-hashed subqueries
Date	February 25, 2020 20:34:28
Msg-id	8617.1582662868@sss.pgh.pa.us Whole thread Raw
In response to	planner weirdness: a join uses nestloop with checking condition whenthere are two subplan-or-hashed subqueries (Alexey Bashtanov <bashtanov@imap.cc>)
Responses	Re: planner weirdness: a join uses nestloop with checking conditionwhen there are two subplan-or-hashed subqueries
List	pgsql-bugs

Tree view

Alexey Bashtanov <bashtanov@imap.cc> writes:
> Planner seems to make a weird choice of join algorithm -- O(M*N)
> nestloop -- under certain circumstances.

In the particular case you've got here, the estimated cost of the
"foo" scan is so large that it swamps everything else:

 Nested Loop Left Join  (cost=0.00..53543550.45 rows=1135 width=8) (actual time=714.969..1407.137 rows=1051 loops=1)
   ...
   ->  Seq Scan on foo  (cost=0.00..53045460.77 rows=149 width=4) (actual time=711.364..714.312 rows=21 loops=1)
   ...
   ->  Materialize  (cost=0.00..4100.92 rows=175728 width=8) (actual time=0.003..19.522 rows=175728 loops=21)
         ->  Seq Scan on qux  (cost=0.00..2535.28 rows=175728 width=8) (actual time=0.017..19.754 rows=175728 loops=1)

Yeah, an indexscan on qux would probably be a little cheaper,
but it could not make as much as 1% difference to the total cost,
because the "foo" scan is accounting for more than 99% already.
I don't recall the exact rules, but the planner stops worrying
about cost differences less than 1%, in order to save planning
time by cutting the number of cases considered.  So it's probably
just discarding the indexed-join alternative as not being
meaningfully better.

What seems actually out of line with reality is the cost estimate
for the "foo" scan, which probably is down to the fact that
cost_qual_eval_walker is not very bright about what to do with
the AlternativeSubPlan constructs.  It looks like it's assuming
that the non-hashed alternatives will be chosen, which they aren't
(if they were, this estimate might not be so far out of line).
But we can't just switch it to make the other assumption, because
that would skew the results for other cases.

            regards, tom lane

pgsql-bugs by date:

From: Tom Lane
Date: 25 February 2020, 19:59:53
Subject: Re: BUG #16277: xmlelement allows invalid XML characters when XML version is set to 1.0

From: Andreas Lennartsson
Date: 25 February 2020, 21:14:13
Subject: Re: BUG #16277: xmlelement allows invalid XML characters when XMLversion is set to 1.0

Re: planner weirdness: a join uses nestloop with checking condition when there are two subplan-or-hashed subqueries - Mailing list pgsql-bugs

Previous

Next