Re: Slow query with join - Mailing list pgsql-general

From Marc Watson
Subject Re: Slow query with join
Date
Msg-id 02F1BBB48ABD3245A3BC519B57CFC8CA7E9E2A80@Exchange.JurisConcept.local
Whole thread Raw
In response to Re: Slow query with join  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Slow query with join  (Marc Watson <mark.watson@jurisconcept.ca>)
List pgsql-general
>-----Message d'origine-----
>De : Tom Lane [mailto:tgl@sss.pgh.pa.us]
>Envoyé : March-16-15 5:07 PM
>À : Tomas Vondra
>Cc : pgsql-general@postgresql.org; Marc Watson
>Objet : Re: [GENERAL] Slow query with join
>
>Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
>> On 16.3.2015 19:50, Marc Watson wrote:
>>> I hope someone can help me with a problem I'm having when joining a
>>> view with a table. The view is somewhat involved, but I can provide the
>>> details if necessary
>
>> First, get rid of the ORDER BY clauses in the subselects - it's
>> completely pointless, and might prevent proper optimization (e.g.
>> replacing the IN() with optimized joins.
>
>I'm suspicious that the cause may be an ORDER BY in the view.  It's
>hard to tell when we've not seen the view definition, but I see that both
>plans we've been shown are going to produce output sorted by actor.id.
>Maybe that's happenstance, or maybe not.
>
>            regards, tom lane

Thanks for replying.
First of all, there is an ORDER BY in the view - ORDER BY actor.id. Removing this, the query takes about 47 secs.,
whichis also the same as if I simply do a select * from v_actor, with or without the ORDER BY in the view. 
It is something in the view, which I will explore further. If I create a materialized view mv_actor, select * from
mv_actortakes 8 sec and my original query using the materialized view  takes 16ms. 

David G. Johnston <david.g.johnston@gmail.com> wrote:
>You might want to consider whether the following is acceptable; but it would depend on the relationship between
f_intervenant_refand v_actor: 
>
>SELECT *
>FROM v_actor
>JOIN f_intervenant_ref ON (actor_id = ir_actor_id)
>WHERE ir_dos_id = '5226'

The relationship between f_intervenant_ref and v_actor is simple. ir_actor_id is an integer that contains the value
fromthe table actor.id, a serial and the primary key.  This gives the same 7 secs, or 49 secs with out the ORDER BY in
theview. The explain analyze output (with the ORDER BY in the view) is : 
 "Merge Join  (cost=54.78..108097.04 rows=8 width=1540) (actual time=6925.158..6962.631 rows=8 loops=1)"
"  Merge Cond: (actor.id = f_intervenant_ref.ir_actor_id)"
"  ->  Merge Left Join  (cost=0.85..554262.60 rows=20670 width=138) (actual time=2.840..6960.625 rows=3072 loops=1)"
"        Merge Cond: (actor.id = physical_actor.id)"
"        ->  Merge Left Join  (cost=0.57..1665.30 rows=20670 width=114) (actual time=0.014..3.791 rows=3072 loops=1)"
"              Merge Cond: (actor.id = moral_actor.id)"
"              ->  Index Scan using actor_pkey on actor  (cost=0.29..1275.50 rows=20670 width=78) (actual
time=0.006..1.667rows=3072 loops=1)" 
"              ->  Index Scan using moral_actor_pkey on moral_actor  (cost=0.28..268.78 rows=5548 width=40) (actual
time=0.005..0.005rows=1 loops=1)" 
"        ->  Index Scan using physical_actor_pkey on physical_actor  (cost=0.29..725.98 rows=15122 width=28) (actual
time=0.003..3.072rows=3072 loops=1)" 
"        SubPlan 1"
"          ->  Limit  (cost=8.30..8.31 rows=1 width=8) (never executed)"
"                ->  Sort  (cost=8.30..8.31 rows=1 width=8) (never executed)"
"                      Sort Key: contact.rank"
"                      ->  Index Scan using contact_actor_idx on contact  (cost=0.28..8.29 rows=1 width=8) (never
executed)"
"                            Index Cond: (actor_id = actor.id)"
"        SubPlan 2"
"          ->  Limit  (cost=8.30..8.31 rows=1 width=8) (actual time=0.006..0.006 rows=0 loops=3072)"
"                ->  Sort  (cost=8.30..8.31 rows=1 width=8) (actual time=0.005..0.005 rows=0 loops=3072)"
"                      Sort Key: contact_1.rank"
"                      Sort Method: quicksort  Memory: 25kB"
"                      ->  Index Scan using contact_actor_idx on contact contact_1  (cost=0.28..8.29 rows=1 width=8)
(actualtime=0.002..0.003 rows=0 loops=3072)" 
"                            Index Cond: (actor_id = actor.id)"
"  ->  Index Scan using ir_actor_id_idx on f_intervenant_ref  (cost=0.28..62.58 rows=8 width=79) (actual
time=0.386..0.449rows=8 loops=1)" 
"        Filter: ((ir_dos_id)::text = '5226'::text)"
"        Rows Removed by Filter: 891"
"Planning time: 0.864 ms"
"Execution time: 6962.773 ms"

>The other option is to use EXISTS:
>...
Same 7 secs.

A scaled-down version of my view runs quickly, so I'll gradually start inserting the missing pieces until I find the
culprit.I'll let you know the cause when I find it. 
Thx again,
Mark Watson



pgsql-general by date:

Previous
From: "Deole, Pushkar (Pushkar)"
Date:
Subject: Re: BDR with Postgres
Next
From: Craig Ringer
Date:
Subject: Re: BDR with Postgres