Thread: left join query does not perform well

left join query does not perform well

From
Manuel Sugawara
Date:
Hi all,

I have three tables that are related in the following way:

'----------------------------------------
| epr_ord_grupo < 1 - n > epr_ord_profesor < 0 - n > epr_ord_horario
`----------------------------------------

In average there is 1.0407716514 rows in epr_ord_horario for each row
in epr_ord_profesor, but since there may be 0 I use the query
explained below:

'----------------------------------------
| test=# explain analyze SELECT grupo.grupo_id FROM (epr_vord_grupo AS grupo INNER JOIN (epr_vord_profesor AS profesor
LEFTJOIN epr_vord_horario AS horario ON (profesor.profesor_id = horario.profesor_id)) ON (grupo.grupo_id =
profesor.grupo_id))WHERE (grupo.año_semestre = cast('20022' as integer)) AND (grupo.grupo = cast('4021' as int)) ORDER
BYgrupo.grupo_id, profesor.cargo, profesor.profesor_id, horario.horario_id; 
| NOTICE:  QUERY PLAN:
|
| Sort  (cost=1165.25..1165.25 rows=2 width=24) (actual time=3113.30..3113.31 rows=3 loops=1)
|   ->  Hash Join  (cost=670.71..1165.24 rows=2 width=24) (actual time=3016.64..3113.08 rows=3 loops=1)
|         ->  Merge Join  (cost=665.46..1110.66 rows=9859 width=20) (actual time=2774.61..3058.77 rows=10157 loops=1)
|               ->  Index Scan using epr_ord_profesor_pkey on epr_ord_profesor  (cost=0.00..306.45 rows=9859 width=12)
(actualtime=124.07..275.04 rows=9859 loops=1) 
|               ->  Sort  (cost=665.46..665.46 rows=7607 width=52) (actual time=2650.49..2660.55 rows=7607 loops=1)
|                     ->  Subquery Scan horario  (cost=0.00..175.07 rows=7607 width=52) (actual time=20.62..2593.15
rows=7607loops=1) 
|                           ->  Seq Scan on epr_ord_horario h  (cost=0.00..175.07 rows=7607 width=52) (actual
time=20.61..2538.44rows=7607 loops=1) 
|         ->  Hash  (cost=5.25..5.25 rows=1 width=4) (actual time=26.26..26.26 rows=0 loops=1)
|               ->  Index Scan using llave primaria-asemg on epr_ord_grupo g  (cost=0.00..5.25 rows=1 width=4) (actual
time=26.24..26.25rows=1 loops=1) 
| Total runtime: 3117.48 msec
|
| EXPLAIN
`----------------------------------------

However the full join (which is correct almost always) of the three
tables looks like:

'----------------------------------------
| test=# explain analyze SELECT grupo.grupo_id FROM epr_vord_grupo grupo, epr_vord_profesor AS profesor,
epr_vord_horarioAS horario WHERE  (profesor.profesor_id = horario.profesor_id) AND (grupo.grupo_id = profesor.grupo_id)
AND(grupo.año_semestre = cast('20022' as integer)) AND (grupo.grupo = cast('4021' as int)) ORDER BY grupo.grupo_id,
profesor.cargo,profesor.profesor_id, horario.horario_id; 
| NOTICE:  QUERY PLAN:
|
| Sort  (cost=14.91..14.91 rows=2 width=24) (actual time=0.53..0.53 rows=3 loops=1)
|   ->  Nested Loop  (cost=0.00..14.90 rows=2 width=24) (actual time=0.19..0.30 rows=3 loops=1)
|         ->  Nested Loop  (cost=0.00..8.48 rows=2 width=16) (actual time=0.14..0.17 rows=2 loops=1)
|               ->  Index Scan using llave primaria-asemg on epr_ord_grupo g  (cost=0.00..5.25 rows=1 width=4) (actual
time=0.08..0.08rows=1 loops=1) 
|               ->  Index Scan using epr_ord_profe_gpo on epr_ord_profesor  (cost=0.00..3.20 rows=2 width=12) (actual
time=0.04..0.06rows=2 loops=1) 
|         ->  Index Scan using epr_ord_horario_prf on epr_ord_horario h  (cost=0.00..3.04 rows=1 width=8) (actual
time=0.03..0.05rows=2 loops=2) 
| Total runtime: 1.15 msec
|
| EXPLAIN
`----------------------------------------

Ouch, 3117.48 msec vs. 1.15 msec is a huge difference. I need
something else? or may be postgres optimizer can't cope with
left/right joins?

I was thinking to add a dummy row in epr_ord_horario to ensure that
the full join is always correct but may be a better solution arises
from the expertise of the list :-)

Regards,
Manuel.


Re: left join query does not perform well

From
Tom Lane
Date:
Manuel Sugawara <masm@fciencias.unam.mx> writes:
> Ouch, 3117.48 msec vs. 1.15 msec is a huge difference. I need
> something else? or may be postgres optimizer can't cope with
> left/right joins?

I think the problem is you're constraining the join order into a very
inefficient one.  See

http://www.ca.postgresql.org/users-lounge/docs/7.2/postgres/explicit-joins.html

But it's difficult to be sure, when you are showing us EXPLAIN output
that manifestly doesn't correspond to what you say the queries are.
For instance the nearest match to "epr_vord_grupo AS grupo" in the
explain output is "epr_ord_grupo g" ... I'm also wondering if any of
the tables used in the queries are really views, and if so what the
view definitions are.
        regards, tom lane


unsubscribe

From
"Gaetano Mendola"
Date:
unsubscribe