Thread: Postgres Optimizer ignores information about foreign key relationship, severly misestimating number of returned rows in join

Hi Performance Guys,

 

I hope you can help me. I am joining two tables, that have a foreign key relationship. So I expect the optimizer to estimate the number of the resulting rows to be the same as the number of the returned rows of one of the tables. But the estimate is way too low.

 

I have built a test case, where the problem is easily to be seen.

 

Testcase:
-- create a large table with one column with only 3 possible values, the other rows are only there to increase the selectivity

create table fact (low_card integer, anydata1 integer, anydata2 integer);

insert into fact (low_card, anydata1, anydata2) select floor(random()*3+1),floor(random()*1000+1),floor(random()*100+1) from generate_series(1,10000);

 

-- create a smaller table with only unique values to be referenced by foreign key

create table dim as (select distinct low_card, anydata1, anydata2 from fact);

create unique index on dim (low_card, anydata1, anydata2);

alter table fact add constraint fk foreign key (low_card, anydata1, anydata2) references dim (low_card, anydata1, anydata2);

 

analyze fact;

analyze dim;

 

And here comes the query:

explain analyze

select count(*) from fact inner join dim on (fact.low_card=dim.low_card and fact.anydata1=dim.anydata1 and fact.anydata2=dim.anydata2)

where fact.low_card=1;

 

Aggregate  (cost=424.11..424.12 rows=1 width=8) (actual time=7.899..7.903 rows=1 loops=1)

  ->  Hash Join  (cost=226.27..423.82 rows=115 width=0) (actual time=3.150..7.511 rows=3344 loops=1)   <=========== With the FK, the estimation should be 3344, but it is 115 rows

        Hash Cond: ((fact.anydata1 = dim.anydata1) AND (fact.anydata2 = dim.anydata2))                                 

        ->  Seq Scan on fact  (cost=0.00..180.00 rows=3344 width=12) (actual time=0.025..2.289 rows=3344 loops=1)

              Filter: (low_card = 1)

              Rows Removed by Filter: 6656

        ->  Hash  (cost=176.89..176.89 rows=3292 width=12) (actual time=3.105..3.107 rows=3292 loops=1)

              Buckets: 4096  Batches: 1  Memory Usage: 174kB

              ->  Seq Scan on dim  (cost=0.00..176.89 rows=3292 width=12) (actual time=0.014..2.103 rows=3292 loops=1)

                    Filter: (low_card = 1)

                    Rows Removed by Filter: 6539

Planning Time: 0.619 ms

Execution Time: 7.973 ms

 

 

My problem is, that I am joining a lot more tables in reality and since the row estimates are so low, the optimizer goes for nested loops, leading to inacceptable execution times.

 

Question: How can I get the optimizer to use the information about the foreign key relationship and get accurate estimates?

 

Sigrid Ehrenreich

 

On Mon, Oct 26, 2020 at 03:58:05PM +0000, Ehrenreich, Sigrid wrote:
> Hi Performance Guys,
> 
> I hope you can help me. I am joining two tables, that have a foreign key relationship. So I expect the optimizer to
estimatethe number of the resulting rows to be the same as the number of the returned rows of one of the tables. But
theestimate is way too low.
 
> 
> I have built a test case, where the problem is easily to be seen.

I reproduced the problem on v14dev.

Note the different estimates between these:

postgres=# explain analyze SELECT * FROM fact INNER JOIN dim USING (low_card,anydata1,anydata2) WHERE fact.low_card=2;
 Hash Join  (cost=161.58..358.85 rows=112 width=12) (actual time=8.707..15.717 rows=3289 loops=1)

postgres=# explain analyze SELECT * FROM fact INNER JOIN dim USING (low_card,anydata1,anydata2) WHERE fact.low_card
BETWEEN2 AND 2;
 
 Hash Join  (cost=324.71..555.61 rows=3289 width=12) (actual time=15.966..23.394 rows=3289 loops=1)

I think because low_card has an equality comparison in addition to the equijoin,
it's being disqualified from the planner's mechanism to consider FKs in join
selectivity.
https://doxygen.postgresql.org/costsize_8c_source.html#l05024

I don't know enough about this to help more than that.



On Tue, 27 Oct 2020 at 06:54, Ehrenreich, Sigrid <Ehrenreich@consist.de> wrote:
>   ->  Hash Join  (cost=226.27..423.82 rows=115 width=0) (actual time=3.150..7.511 rows=3344 loops=1)   <===========
Withthe FK, the estimation should be 3344, but it is 115 rows
 

I'd have expected this to find the foreign key and have the join
selectivity of 1.0, but I see it does not due to the fact that one of
the EquivalenceClass has a constant due to the fact.low_card = 1 qual.

In build_join_rel() we call build_joinrel_restrictlist() to get the
join quals that need to be evaluated at the join level, but we only
get the fact.anydata1=dim.anydata1 and fact.anydata2=dim.anydata2
quals there.  The low_card qual gets pushed down to the scan level on
each side of the join, so no need for it to get evaluated at the join
level. Later in build_join_rel() we do set_joinrel_size_estimates().
The restrictlist with just the two quals is what we pass to
get_foreign_key_join_selectivity().  Only two of the foreign key
columns are matched there, therefore we don't class that as a match
and just leave it up to the normal selectivity functions.

I feel like we could probably do better there and perhaps somehow
count ECs with ec_has_const as matched, but there seems to be some
assumptions later in get_foreign_key_join_selectivity() where we
determine the selectivity based on the base rel's tuple count.  We'd
need to account for how many rows remainder after filtering the ECs
with ec_has_const == true, else we'd be doing the wrong thing.  That
needs more thought than I have time for right now.

Your case would work if the foreign key had been on just anydata1 and
anydata2, but there's not much chance of that working without a unique
index on those two columns.

Extended statistics won't help you here either since they're currently
not used for join estimations.

David



David Rowley <dgrowleyml@gmail.com> writes:
> On Tue, 27 Oct 2020 at 06:54, Ehrenreich, Sigrid <Ehrenreich@consist.de> wrote:
>> ->  Hash Join  (cost=226.27..423.82 rows=115 width=0) (actual time=3.150..7.511 rows=3344 loops=1)   <===========
Withthe FK, the estimation should be 3344, but it is 115 rows 

> I'd have expected this to find the foreign key and have the join
> selectivity of 1.0, but I see it does not due to the fact that one of
> the EquivalenceClass has a constant due to the fact.low_card = 1 qual.

Right.

> I feel like we could probably do better there and perhaps somehow
> count ECs with ec_has_const as matched, but there seems to be some
> assumptions later in get_foreign_key_join_selectivity() where we
> determine the selectivity based on the base rel's tuple count.  We'd
> need to account for how many rows remainder after filtering the ECs
> with ec_has_const == true, else we'd be doing the wrong thing.  That
> needs more thought than I have time for right now.

Yeah, I'm fooling with a patch for that now.  The basic problem is
that the selectivity of the x = constant clauses has already been
factored into the sizes of both join input relations, so we're
double-counting it if we just apply the existing FK-based
selectivity estimate.  I think though that we can recover the
selectivity associated with that qual on the FK side (or should
it be the PK side?) and cancel it out of the FK selectivity.

            regards, tom lane



Hi Tom,

A patch would be very much appreciated.
We are currently running on Version 12, but could upgrade to 13, if necessary.

Could you send me a notification if you managed to program a patch for that?

Regards,
Sigrid

-----Original Message-----
From: Tom Lane <tgl@sss.pgh.pa.us>
Sent: Monday, October 26, 2020 11:54 PM
To: David Rowley <dgrowleyml@gmail.com>
Cc: Ehrenreich, Sigrid <Ehrenreich@consist.de>; pgsql-performance@lists.postgresql.org
Subject: Re: Postgres Optimizer ignores information about foreign key relationship, severly misestimating number of
returnedrows in join 

David Rowley <dgrowleyml@gmail.com> writes:
> On Tue, 27 Oct 2020 at 06:54, Ehrenreich, Sigrid <Ehrenreich@consist.de> wrote:
>> ->  Hash Join  (cost=226.27..423.82 rows=115 width=0) (actual time=3.150..7.511 rows=3344 loops=1)   <===========
Withthe FK, the estimation should be 3344, but it is 115 rows 

> I'd have expected this to find the foreign key and have the join
> selectivity of 1.0, but I see it does not due to the fact that one of
> the EquivalenceClass has a constant due to the fact.low_card = 1 qual.

Right.

> I feel like we could probably do better there and perhaps somehow
> count ECs with ec_has_const as matched, but there seems to be some
> assumptions later in get_foreign_key_join_selectivity() where we
> determine the selectivity based on the base rel's tuple count.  We'd
> need to account for how many rows remainder after filtering the ECs
> with ec_has_const == true, else we'd be doing the wrong thing.  That
> needs more thought than I have time for right now.

Yeah, I'm fooling with a patch for that now.  The basic problem is
that the selectivity of the x = constant clauses has already been
factored into the sizes of both join input relations, so we're
double-counting it if we just apply the existing FK-based
selectivity estimate.  I think though that we can recover the
selectivity associated with that qual on the FK side (or should
it be the PK side?) and cancel it out of the FK selectivity.

            regards, tom lane



"Ehrenreich, Sigrid" <Ehrenreich@consist.de> writes:
> A patch would be very much appreciated.
> We are currently running on Version 12, but could upgrade to 13, if necessary.
> Could you send me a notification if you managed to program a patch for that?

I've pushed a patch for this to HEAD, but current thinking is that we
will not be back-patching it.  Still, if you're desperate you could
consider running a custom build of v13 with that patch --- a quick
check suggests that it would back-patch easily.  v12 would be a
slightly harder lift (I see one hunk doesn't apply) but probably
not by much.

https://git.postgresql.org/gitweb/?p=postgresql.git;a=patch;h=ad1c36b0709e47cdb3cc4abd6c939fe64279b63f

            regards, tom lane



Hi Tom,

Thanks a lot for your help!

If it is in the HEAD, does it mean, it will be included in v14?

I'll have to see, if we dare building our own v13 version with the patch.
(I would love to, because I am simply thrilled to pieces, having a patch made by you for us 😉)

Regards,
Sigrid

-----Original Message-----
From: Tom Lane <tgl@sss.pgh.pa.us> 
Sent: Wednesday, October 28, 2020 5:55 PM
To: Ehrenreich, Sigrid <Ehrenreich@consist.de>
Cc: David Rowley <dgrowleyml@gmail.com>; pgsql-performance@lists.postgresql.org
Subject: Re: Postgres Optimizer ignores information about foreign key relationship, severly misestimating number of
returnedrows in join
 

"Ehrenreich, Sigrid" <Ehrenreich@consist.de> writes:
> A patch would be very much appreciated.
> We are currently running on Version 12, but could upgrade to 13, if necessary.
> Could you send me a notification if you managed to program a patch for that?

I've pushed a patch for this to HEAD, but current thinking is that we
will not be back-patching it.  Still, if you're desperate you could
consider running a custom build of v13 with that patch --- a quick
check suggests that it would back-patch easily.  v12 would be a
slightly harder lift (I see one hunk doesn't apply) but probably
not by much.

https://git.postgresql.org/gitweb/?p=postgresql.git;a=patch;h=ad1c36b0709e47cdb3cc4abd6c939fe64279b63f

            regards, tom lane

On Thu, Oct 29, 2020 at 9:43 AM Ehrenreich, Sigrid
<Ehrenreich@consist.de> wrote:
>
> Hi Tom,
>
> Thanks a lot for your help!
>
> If it is in the HEAD, does it mean, it will be included in v14?


Yes, that's precisely what it means. Unless someone finds something
bad with it and it has to be removed of course, but in principle it
means that it will be in 14.

//Magnus


>
>
> I'll have to see, if we dare building our own v13 version with the patch.
> (I would love to, because I am simply thrilled to pieces, having a patch made by you for us )
>
> Regards,
> Sigrid
>
> -----Original Message-----
> From: Tom Lane <tgl@sss.pgh.pa.us>
> Sent: Wednesday, October 28, 2020 5:55 PM
> To: Ehrenreich, Sigrid <Ehrenreich@consist.de>
> Cc: David Rowley <dgrowleyml@gmail.com>; pgsql-performance@lists.postgresql.org
> Subject: Re: Postgres Optimizer ignores information about foreign key relationship, severly misestimating number of
returnedrows in join
 
>
> "Ehrenreich, Sigrid" <Ehrenreich@consist.de> writes:
> > A patch would be very much appreciated.
> > We are currently running on Version 12, but could upgrade to 13, if necessary.
> > Could you send me a notification if you managed to program a patch for that?
>
> I've pushed a patch for this to HEAD, but current thinking is that we
> will not be back-patching it.  Still, if you're desperate you could
> consider running a custom build of v13 with that patch --- a quick
> check suggests that it would back-patch easily.  v12 would be a
> slightly harder lift (I see one hunk doesn't apply) but probably
> not by much.
>
> https://git.postgresql.org/gitweb/?p=postgresql.git;a=patch;h=ad1c36b0709e47cdb3cc4abd6c939fe64279b63f
>
>                         regards, tom lane



Thanks!

Regards,
Sigrid

-----Original Message-----
From: Magnus Hagander <magnus@hagander.net> 
Sent: Thursday, October 29, 2020 10:57 AM
To: Ehrenreich, Sigrid <Ehrenreich@consist.de>
Cc: Tom Lane <tgl@sss.pgh.pa.us>; David Rowley <dgrowleyml@gmail.com>; pgsql-performance@lists.postgresql.org
Subject: Re: Postgres Optimizer ignores information about foreign key relationship, severly misestimating number of
returnedrows in join
 

On Thu, Oct 29, 2020 at 9:43 AM Ehrenreich, Sigrid
<Ehrenreich@consist.de> wrote:
>
> Hi Tom,
>
> Thanks a lot for your help!
>
> If it is in the HEAD, does it mean, it will be included in v14?


Yes, that's precisely what it means. Unless someone finds something
bad with it and it has to be removed of course, but in principle it
means that it will be in 14.

//Magnus


>
>
> I'll have to see, if we dare building our own v13 version with the patch.
> (I would love to, because I am simply thrilled to pieces, having a patch made by you for us )
>
> Regards,
> Sigrid
>
> -----Original Message-----
> From: Tom Lane <tgl@sss.pgh.pa.us>
> Sent: Wednesday, October 28, 2020 5:55 PM
> To: Ehrenreich, Sigrid <Ehrenreich@consist.de>
> Cc: David Rowley <dgrowleyml@gmail.com>; pgsql-performance@lists.postgresql.org
> Subject: Re: Postgres Optimizer ignores information about foreign key relationship, severly misestimating number of
returnedrows in join
 
>
> "Ehrenreich, Sigrid" <Ehrenreich@consist.de> writes:
> > A patch would be very much appreciated.
> > We are currently running on Version 12, but could upgrade to 13, if necessary.
> > Could you send me a notification if you managed to program a patch for that?
>
> I've pushed a patch for this to HEAD, but current thinking is that we
> will not be back-patching it.  Still, if you're desperate you could
> consider running a custom build of v13 with that patch --- a quick
> check suggests that it would back-patch easily.  v12 would be a
> slightly harder lift (I see one hunk doesn't apply) but probably
> not by much.
>
> https://git.postgresql.org/gitweb/?p=postgresql.git;a=patch;h=ad1c36b0709e47cdb3cc4abd6c939fe64279b63f
>
>                         regards, tom lane