Re: Slow query with join - Mailing list pgsql-general

From David G. Johnston
Subject Re: Slow query with join
Date
Msg-id CAKFQuwZDRodob2yYYjFy4L3QEruCvAer8FXmhaVnBLZ0MJAP2g@mail.gmail.com
Whole thread Raw
In response to Slow query with join  (Marc Watson <mark.watson@jurisconcept.ca>)
List pgsql-general


On Mon, Mar 16, 2015 at 11:50 AM, Marc Watson <mark.watson@jurisconcept.ca> wrote:
Hello all,
I am using PostgreSQL 9.4.1, compiled by Visual C++ build 1800, 64-bit, as downloaded from EnterpriseDB, and is running on my dev system under Win 7 64-bit.
​[...]​
 

However, when I combine the two queries into one, the result set takes 6742 ms:
explain analyze select * from v_actor where v_actor.actor_id in(select ir_actor_id from f_intervenant_ref where ir_dos_id = '5226' order by ir_actor_id);

​You might want to consider whether the following is acceptable; but it would depend on the relationship between f_intervenant_ref and v_actor:

SELECT *
FROM v_actor
JOIN f_intervenant_ref ON (actor_id = ir_actor_id)
WHERE ir_dos_id = '5226';


"IN" has issues due to necessary consideration of possible NULLs in the list.

Furthermore; even in your original query there is no value to incorporating an ORDER BY into the IN subquery.

I suspect that this second problem is preventing the planner from pushing the subquery down into the view and so is forced to perform a "Merge Semi Join" against the full (and thus expensive) view while the constants in the second query can be pushed down and the planner is able to choose the "Nested Loop Left Join" over 8 keys (4 rows) which ends up being very fast.
 
​The other option is to use EXISTS:

SELECT *
FROM v_actor
WHERE EXIST (SELECT 1 FROM f_intervenant_ref WHERE actor_id = ir_actor_id AND ir_dos_id = '5226')



"Merge Semi Join  (cost=71.79..108061.92 rows=8 width=1461) (actual time=7884.994..7927.699 rows=4 loops=1)"
"  Merge Cond: (actor.id = f_intervenant_ref.ir_actor_id)"
"  ->  Merge Left Join  (cost=0.85..554314.28 rows=20670 width=138) (actual time=2.820..7926.001 rows=3072 loops=1)"
​[...]​
 
"  ->  Materialize  (cost=17.28..17.40 rows=8 width=4) (actual time=0.024..0.027 rows=8 loops=1)"
"        ->  Sort  (cost=17.28..17.30 rows=8 width=4) (actual time=0.021..0.022 rows=8 loops=1)"
"              Sort Key: f_intervenant_ref.ir_actor_id"
"              Sort Method: quicksort  Memory: 25kB"
"              ->  Index Scan using ir_dos_id_idx on f_intervenant_ref  (cost=0.28..17.16 rows=8 width=4) (actual time=0.012..0.017 rows=8 loops=1)"
"                    Index Cond: ((ir_dos_id)::text = '5226'::text)"
"Planning time: 0.820 ms"
"Execution time: 7927.838 ms"

Any suggestions to help me speed this up will be greatly appreciated.

​David J.​

 

pgsql-general by date:

Previous
From: Marc Watson
Date:
Subject: Slow query with join
Next
From: Tomas Vondra
Date:
Subject: Re: Slow query with join