Re: Postgres Optimizer ignores information about foreign key relationship, severely misestimating number of returned rows in join - Mailing list pgsql-performance

From Justin Pryzby
Subject Re: Postgres Optimizer ignores information about foreign key relationship, severely misestimating number of returned rows in join
Date
Msg-id 20201026203413.GA9241@telsasoft.com
Whole thread Raw
In response to Postgres Optimizer ignores information about foreign key relationship, severly misestimating number of returned rows in join  ("Ehrenreich, Sigrid" <Ehrenreich@consist.de>)
List pgsql-performance
On Mon, Oct 26, 2020 at 03:58:05PM +0000, Ehrenreich, Sigrid wrote:
> Hi Performance Guys,
> 
> I hope you can help me. I am joining two tables, that have a foreign key relationship. So I expect the optimizer to
estimatethe number of the resulting rows to be the same as the number of the returned rows of one of the tables. But
theestimate is way too low.
 
> 
> I have built a test case, where the problem is easily to be seen.

I reproduced the problem on v14dev.

Note the different estimates between these:

postgres=# explain analyze SELECT * FROM fact INNER JOIN dim USING (low_card,anydata1,anydata2) WHERE fact.low_card=2;
 Hash Join  (cost=161.58..358.85 rows=112 width=12) (actual time=8.707..15.717 rows=3289 loops=1)

postgres=# explain analyze SELECT * FROM fact INNER JOIN dim USING (low_card,anydata1,anydata2) WHERE fact.low_card
BETWEEN2 AND 2;
 
 Hash Join  (cost=324.71..555.61 rows=3289 width=12) (actual time=15.966..23.394 rows=3289 loops=1)

I think because low_card has an equality comparison in addition to the equijoin,
it's being disqualified from the planner's mechanism to consider FKs in join
selectivity.
https://doxygen.postgresql.org/costsize_8c_source.html#l05024

I don't know enough about this to help more than that.



pgsql-performance by date:

Previous
From: Philip Semanchuk
Date:
Subject: Re: Understanding bad estimate (related to FKs?)
Next
From: David Rowley
Date:
Subject: Re: Postgres Optimizer ignores information about foreign key relationship, severly misestimating number of returned rows in join