planner weirdness: a join uses nestloop with checking condition whenthere are two subplan-or-hashed subqueries - Mailing list pgsql-bugs

From Alexey Bashtanov
Subject planner weirdness: a join uses nestloop with checking condition whenthere are two subplan-or-hashed subqueries
Date
Msg-id ff42b25b-ff03-27f8-ed11-b8255d658cd5@imap.cc
Whole thread Raw
Responses Re: planner weirdness: a join uses nestloop with checking condition when there are two subplan-or-hashed subqueries  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-bugs
Hello,

Planner seems to make a weird choice of join algorithm -- O(M*N) 
nestloop -- under certain circumstances.
I'm not exactly sure what is the criteria but I have a self-contained 
example, albeit a large one.

So if you unpack the pb-dump.sql.bz2 attached and run the pb-test.sql 
you'll have some plan explained.
Despite an index present on "qux" it's accessed using a plain seq-seq 
nestloop, see pb-plan.txt

Why? Cardinalities look like predicted reasonably well, and with those 
predictions hash join or index scan would
be obviously faster: the planner thinks we are joining 149 and 175728 rows.
The distribution for "qux"."foo_id" is not too skew, the average number 
of rows per "foo_id" in "qux" is about 9.
Slight data or query variations make it use the index.
With "set enable_nestloop to off; set enable_mergejoin to off;" the plan 
generated is better. It has smaller cost of the final join, though the 
costs for the outer relation increase, probably due to the "never 
executed" path.

Playing with json/from_collapse_limit does not make any difference.

I can observe this on both master and v 10.11 .

I haven't investigated it any further yet, so for now just asking 
whether it's a known behavior?
If not, I'll try to find out what's going on.

Best, Alex

Attachment

pgsql-bugs by date:

Previous
From: eli.mach@mailbox.org
Date:
Subject: Re: pretty_bool in pg_get_constraintdef has no effect since pg >= 9
Next
From: Tomas Vondra
Date:
Subject: Re: BUG #16275: we are facing error aspsycopg2.errors.ProgramLimitExceeded: row is too big: size 24520, maximum