Home > mailing lists

Poor plan when using EXISTS in the expression list - Mailing list pgsql-hackers

From	Pierre Ducroquet
Subject	Poor plan when using EXISTS in the expression list
Date	October 4, 2018 15:11:51
Msg-id	2269606.VgtPoMEsXU@pierred-pdoc Whole thread
Responses	Re: Poor plan when using EXISTS in the expression list
List	pgsql-hackers

Tree view

Hello

Our developpers ORM (Django's) sadly can not use EXISTS in the where clauses 
without having it in the expression part of the SELECT statement.
I was expecting it to create queries performing a bit worse than queries 
without this useless expression, but it turns out this trigger an extremely 
poor planning, with an additional Seq Scan of the table referenced in EXISTS.
Thus the query select a.*, exists (select * from b where a_id = a.id) from a 
where exists (select * from b where a_id = a.id); can be orders of magnitude 
slower than select a.* from a where exists (select * from b where a_id = 
a.id);

This has been reproduced on PostgreSQL 9.6 and 11 beta4.

Example :

test=> create table a (id serial primary key, b text);       
CREATE TABLE

test=> create table b (id serial primary key, a_id integer not null references 
a(id), c text);              
CREATE TABLE

test=> explain select a.* from a where exists (select * from b  where a_id = 
a.id); 
                              QUERY PLAN                               
-----------------------------------------------------------------------
 Hash Join  (cost=29.50..62.60 rows=635 width=36)
   Hash Cond: (a.id = b.a_id)
   ->  Seq Scan on a  (cost=0.00..22.70 rows=1270 width=36)
   ->  Hash  (cost=27.00..27.00 rows=200 width=4)
         ->  HashAggregate  (cost=25.00..27.00 rows=200 width=4)
               Group Key: b.a_id
               ->  Seq Scan on b  (cost=0.00..22.00 rows=1200 width=4)
(7 rows)

test=> explain select a.*, exists (select * from b where a_id = a.id) from a;
                           QUERY PLAN                            
-----------------------------------------------------------------
 Seq Scan on a  (cost=0.00..5314.37 rows=1270 width=37)
   SubPlan 1
     ->  Seq Scan on b  (cost=0.00..25.00 rows=6 width=0)
           Filter: (a_id = a.id)
   SubPlan 2
     ->  Seq Scan on b b_1  (cost=0.00..22.00 rows=1200 width=4)
(6 rows)

test=> explain select a.*, exists (select * from b where a_id = a.id)  from a 
where exists (select * from b  where a_id = a.id);                         
                              QUERY PLAN                               
-----------------------------------------------------------------------
 Hash Join  (cost=29.50..2708.43 rows=635 width=37)
   Hash Cond: (a.id = b.a_id)
   ->  Seq Scan on a  (cost=0.00..22.70 rows=1270 width=36)
   ->  Hash  (cost=27.00..27.00 rows=200 width=4)
         ->  HashAggregate  (cost=25.00..27.00 rows=200 width=4)
               Group Key: b.a_id
               ->  Seq Scan on b  (cost=0.00..22.00 rows=1200 width=4)
   SubPlan 1
     ->  Seq Scan on b b_1  (cost=0.00..25.00 rows=6 width=0)
           Filter: (a_id = a.id)
   SubPlan 2
     ->  Seq Scan on b b_2  (cost=0.00..22.00 rows=1200 width=4)
(12 rows)


Thanks

 Pierre

pgsql-hackers by date:

From: Chris Travers
Date: 04 October 2018, 15:11:12
Subject: Possible important data point on stats collection, wondering aboutpossible improvement

From: Laurenz Albe
Date: 04 October 2018, 15:23:34
Subject: Re: pg_ls_tmpdir()

Poor plan when using EXISTS in the expression list - Mailing list pgsql-hackers

Previous

Next