Re: [HACKERS] Regression in join selectivity estimations when using foreign keys - Mailing list pgsql-hackers
| From | David Rowley |
|---|---|
| Subject | Re: [HACKERS] Regression in join selectivity estimations when using foreign keys |
| Date | |
| Msg-id | CAKJS1f_86xU5ht4Tw8e=jyoegZKDZVCWO-2YXRUHJLMrSskhXQ@mail.gmail.com Whole thread |
| In response to | [HACKERS] Regression in join selectivity estimations when using foreign keys (David Rowley <david.rowley@2ndquadrant.com>) |
| List | pgsql-hackers |
On 18 May 2017 at 20:28, David Rowley <david.rowley@2ndquadrant.com> wrote:
> A vastly simplified example case is:
>
> create table fkest (a int, b int, c int unique, primary key(a,b));
> create table fkest1 (a int, b int, primary key(a,b));
>
> insert into fkest select x/10,x%10, x from generate_Series(1,400) x;
> insert into fkest1 select x/10,x%10 from generate_Series(1,400) x;
>
> alter table fkest1 add constraint fkest1_a_b_fkey foreign key (a,b)
> references fkest;
>
> analyze fkest;
> analyze fkest1;
>
> explain (costs on) select * from fkest f
> left join fkest1 f1 on f.a = f1.a and f.b = f1.b
> left join fkest1 f2 on f.a = f2.a and f.b = f2.b
> left join fkest1 f3 on f.a = f3.a and f.b = f3.b
> where f.c = 1;
I should have shown the EXPLAIN ANALYZE of this instead.
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------Hash
LeftJoin (cost=24.15..41.89 rows=996 width=36) (actual
time=0.430..0.463 rows=1 loops=1) Hash Cond: ((f.a = f3.a) AND (f.b = f3.b)) -> Hash Left Join (cost=12.15..28.36
rows=100width=28) (actual
time=0.255..0.288 rows=1 loops=1) Hash Cond: ((f.a = f2.a) AND (f.b = f2.b)) -> Nested Loop Left Join
(cost=0.15..16.21rows=10
width=20) (actual time=0.046..0.079 rows=1 loops=1) -> Seq Scan on fkest f (cost=0.00..8.00 rows=1
width=12) (actual time=0.013..0.045 rows=1 loops=1) Filter: (c = 1) Rows Removed
byFilter: 399 -> Index Only Scan using fkest1_pkey on fkest1 f1
(cost=0.15..8.17 rows=1 width=8) (actual time=0.031..0.031 rows=1
loops=1) Index Cond: ((a = f.a) AND (b = f.b)) Heap Fetches: 1 -> Hash
(cost=6.00..6.00rows=400 width=8) (actual
time=0.180..0.180 rows=400 loops=1) Buckets: 1024 Batches: 1 Memory Usage: 24kB -> Seq
Scanon fkest1 f2 (cost=0.00..6.00 rows=400
width=8) (actual time=0.006..0.041 rows=400 loops=1) -> Hash (cost=6.00..6.00 rows=400 width=8) (actual
time=0.162..0.162 rows=400 loops=1) Buckets: 1024 Batches: 1 Memory Usage: 24kB -> Seq Scan on fkest1
f3 (cost=0.00..6.00 rows=400 width=8)
(actual time=0.010..0.040 rows=400 loops=1)Planning time: 0.409 msExecution time: 0.513 ms
(19 rows)
which you can obviously see the poor estimate propagating to up the plan tree.
If we add another left join the final estimate is even worse:
explain analyze select * from fkest f
left join fkest1 f1 on f.a = f1.a and f.b = f1.b
left join fkest1 f2 on f.a = f2.a and f.b = f2.b
left join fkest1 f3 on f.a = f3.a and f.b = f3.b
left join fkest1 f4 on f.a = f4.a and f.b = f4.b where f.c = 1;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------Hash
LeftJoin (cost=36.15..69.06 rows=9915 width=44) (actual
time=0.535..0.569 rows=1 loops=1) Hash Cond: ((f.a = f4.a) AND (f.b = f4.b)) -> Hash Left Join (cost=24.15..41.89
rows=996width=36) (actual
time=0.371..0.404 rows=1 loops=1) Hash Cond: ((f.a = f3.a) AND (f.b = f3.b)) -> Hash Left Join
(cost=12.15..28.36rows=100 width=28)
(actual time=0.208..0.241 rows=1 loops=1) Hash Cond: ((f.a = f2.a) AND (f.b = f2.b)) ->
NestedLoop Left Join (cost=0.15..16.21 rows=10
width=20) (actual time=0.029..0.062 rows=1 loops=1) -> Seq Scan on fkest f (cost=0.00..8.00
rows=1
width=12) (actual time=0.014..0.047 rows=1 loops=1) Filter: (c = 1)
RowsRemoved by Filter: 399 -> Index Only Scan using fkest1_pkey on fkest1
f1 (cost=0.15..8.17 rows=1 width=8) (actual time=0.012..0.012 rows=1
loops=1) Index Cond: ((a = f.a) AND (b = f.b)) Heap Fetches: 1
-> Hash (cost=6.00..6.00 rows=400 width=8) (actual
time=0.168..0.168 rows=400 loops=1) Buckets: 1024 Batches: 1 Memory Usage: 24kB
-> Seq Scan on fkest1 f2 (cost=0.00..6.00
rows=400 width=8) (actual time=0.008..0.043 rows=400 loops=1) -> Hash (cost=6.00..6.00 rows=400 width=8)
(actual
time=0.156..0.156 rows=400 loops=1) Buckets: 1024 Batches: 1 Memory Usage: 24kB -> Seq
Scanon fkest1 f3 (cost=0.00..6.00 rows=400
width=8) (actual time=0.006..0.035 rows=400 loops=1) -> Hash (cost=6.00..6.00 rows=400 width=8) (actual
time=0.155..0.155 rows=400 loops=1) Buckets: 1024 Batches: 1 Memory Usage: 24kB -> Seq Scan on fkest1
f4 (cost=0.00..6.00 rows=400 width=8)
(actual time=0.004..0.034 rows=400 loops=1)Planning time: 0.864 msExecution time: 0.698 ms
-- David Rowley http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
pgsql-hackers by date: