Home > mailing lists

Re: Slow query with join - Mailing list pgsql-general

From	David G. Johnston
Subject	Re: Slow query with join
Date	March 16, 2015 23:40:04
Msg-id	CAKFQuwZDRodob2yYYjFy4L3QEruCvAer8FXmhaVnBLZ0MJAP2g@mail.gmail.com Whole thread Raw
In response to	Slow query with join (Marc Watson <mark.watson@jurisconcept.ca>)
List	pgsql-general

Tree view

On Mon, Mar 16, 2015 at 11:50 AM, Marc Watson <mark.watson@jurisconcept.ca> wrote:

Hello all,
I am using PostgreSQL 9.4.1, compiled by Visual C++ build 1800, 64-bit, as downloaded from EnterpriseDB, and is running on my dev system under Win 7 64-bit.

[...]

However, when I combine the two queries into one, the result set takes 6742 ms:
explain analyze select * from v_actor where v_actor.actor_id in(select ir_actor_id from f_intervenant_ref where ir_dos_id = '5226' order by ir_actor_id);

You might want to consider whether the following is acceptable; but it would depend on the relationship between f_intervenant_ref and v_actor:

SELECT *

FROM v_actor

JOIN f_intervenant_ref ON (actor_id = ir_actor_id)

WHERE ir_dos_id = '5226';

"IN" has issues due to necessary consideration of possible NULLs in the list.

Furthermore; even in your original query there is no value to incorporating an ORDER BY into the IN subquery.

I suspect that this second problem is preventing the planner from pushing the subquery down into the view and so is forced to perform a "Merge Semi Join" against the full (and thus expensive) view while the constants in the second query can be pushed down and the planner is able to choose the "Nested Loop Left Join" over 8 keys (4 rows) which ends up being very fast.

The other option is to use EXISTS:

SELECT *

FROM v_actor

WHERE EXIST (SELECT 1 FROM f_intervenant_ref WHERE actor_id = ir_actor_id AND ir_dos_id = '5226')

"Merge Semi Join (cost=71.79..108061.92 rows=8 width=1461) (actual time=7884.994..7927.699 rows=4 loops=1)"
" Merge Cond: (actor.id = f_intervenant_ref.ir_actor_id)"
" -> Merge Left Join (cost=0.85..554314.28 rows=20670 width=138) (actual time=2.820..7926.001 rows=3072 loops=1)"

[...]

" -> Materialize (cost=17.28..17.40 rows=8 width=4) (actual time=0.024..0.027 rows=8 loops=1)"
" -> Sort (cost=17.28..17.30 rows=8 width=4) (actual time=0.021..0.022 rows=8 loops=1)"
" Sort Key: f_intervenant_ref.ir_actor_id"
" Sort Method: quicksort Memory: 25kB"
" -> Index Scan using ir_dos_id_idx on f_intervenant_ref (cost=0.28..17.16 rows=8 width=4) (actual time=0.012..0.017 rows=8 loops=1)"
" Index Cond: ((ir_dos_id)::text = '5226'::text)"
"Planning time: 0.820 ms"
"Execution time: 7927.838 ms"

Any suggestions to help me speed this up will be greatly appreciated.

David J.

pgsql-general by date:

From: Marc Watson
Date: 16 March 2015, 22:24:43
Subject: Slow query with join

From: Tomas Vondra
Date: 16 March 2015, 23:50:47
Subject: Re: Slow query with join

Re: Slow query with join - Mailing list pgsql-general

Previous

Next