left join performance problem - Mailing list pgsql-performance

From pginfo
Subject left join performance problem
Date
Msg-id 3EE74C2D.88390062@t1.unisoftbg.com
Whole thread Raw
Responses Re: left join performance problem
List pgsql-performance
Hi,
I am using pg 7.3.1 on a dual r.h. 7.3 box.

I have a big problem with pg left join performance.

My plan is:
=# explain analyze select D.IDS AS DIDS ,D.IDS_SKLAD, D.IDS_KO AS
DIDSKO,KL.MNAME AS KLNAME, D.NOMER AS DNOMER,D.DATE_OP,  S.MED AS
MEDNAME, NOM.MNAME AS NOMNAME,S.IDS_NUM, S.KOL,
S.CENA,S.VAL,S.TOT,S.DTO,S.PTO ,M.OTN AS MOTN  FROM A_KLIENTI KL ,
A_NOMEN NOM, A_DOC D,A_SKLA
D S  left outer join A_MESKLAD M ON(S.IDS=M.IDS)   WHERE  D.OP=4 AND
D.IDS=S.IDS_DOC AND D.IDS_KO=KL.IDS AND S.IDS_NUM=NOM.IDS AND  KL.IDS_G
RUPA  = 'SOF_112' ;

QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------------------------------------

 Nested Loop  (cost=460869.55..470785.29 rows=20 width=1034) (actual
time=50139.27..57565.34 rows=12990 loops=1)
   ->  Hash Join  (cost=460869.55..470662.48 rows=20 width=862) (actual
time=50139.02..57246.35 rows=12990 loops=1)
         Hash Cond: ("outer".ids_doc = "inner".ids)
         ->  Merge Join  (cost=457324.89..463038.60 rows=815792
width=356) (actual time=48128.32..53430.02 rows=815926 loops=1)
               Merge Cond: ("outer".ids = "inner".ids)
               ->  Index Scan using a_mesklad_pkey on a_mesklad m
(cost=0.00..1395.47 rows=15952 width=72) (actual time=0.21..109.19
rows=15952 loops=1)
               ->  Sort  (cost=457324.89..459364.37 rows=815792
width=284) (actual time=48128.05..49380.06 rows=815926 loops=1)
                     Sort Key: s.ids
                     ->  Seq Scan on a_sklad s  (cost=0.00..74502.92
rows=815792 width=284) (actual time=4.32..16777.16 rows=815926 loops=1)
         ->  Hash  (cost=3544.65..3544.65 rows=3 width=506) (actual
time=1104.34..1104.34 rows=0 loops=1)
               ->  Hash Join  (cost=905.35..3544.65 rows=3 width=506)
(actual time=428.32..1098.52 rows=1966 loops=1)
                     Hash Cond: ("outer".ids_ko = "inner".ids)
                     ->  Index Scan using i_doc_op on a_doc d
(cost=0.00..2625.71 rows=677 width=244) (actual time=29.27..690.86
rows=1981 loops=1)
                           Index Cond: (op = 4)
                     ->  Hash  (cost=905.19..905.19 rows=65 width=262)
(actual time=398.97..398.97 rows=0 loops=1)
                           ->  Seq Scan on a_klienti kl
(cost=0.00..905.19 rows=65 width=262) (actual time=396.68..398.93 rows=7
loops=1)
                                 Filter: (ids_grupa = 'SOF_112'::name)
   ->  Index Scan using a_nomen_pkey on a_nomen nom  (cost=0.00..6.01
rows=1 width=172) (actual time=0.01..0.02 rows=1 loops=12990)
         Index Cond: ("outer".ids_num = nom.ids)
 Total runtime: 57749.24 msec
(20 rows)



If I remove the join ( I know it is not very correct and I receive 19
rows as answer) it is working very fast.
The plan is:

 explain analyze select D.IDS AS DIDS ,D.IDS_SKLAD, D.IDS_KO AS
DIDSKO,KL.MNAME AS KLNAME, D.NOMER AS DNOMER,D.DATE_OP,  S.MED AS
MEDNAME, NOM.MNAME AS NOMNAME,S.IDS_NUM, S.KOL,
S.CENA,S.VAL,S.TOT,S.DTO,S.PTO ,M.OTN AS MOTN  FROM A_KLIENTI KL ,
A_NOMEN NOM, A_DOC D,A_SKLAD S ,A_MESKLAD M  WHERE  S.IDS=M.IDS AND
D.OP=4 AND D.IDS=S.IDS_DOC AND D.IDS_KO=KL.IDS AND S.IDS_NUM=NOM.IDS
AND  D.NOMER  like '%0905'  AND  KL.IDS_GRUPA  = 'SOF_112'  ORDER BY
D.IDS,S.IDS_NUM,S.ORDER_NUM ;

QUERY PLAN

-----------------------------------------------------------------------------------------------------------------------------------------------------------

 Sort  (cost=18897.33..18897.33 rows=1 width=1038) (actual
time=36.33..36.35 rows=48 loops=1)
   Sort Key: d.ids, s.ids_num, s.order_num
   ->  Nested Loop  (cost=0.00..18897.32 rows=1 width=1038) (actual
time=30.90..35.93 rows=48 loops=1)
         ->  Nested Loop  (cost=0.00..18891.29 rows=1 width=866) (actual
time=30.70..33.34 rows=48 loops=1)
               ->  Nested Loop  (cost=0.00..18885.28 rows=1 width=794)
(actual time=30.44..31.98 rows=48 loops=1)
                     ->  Nested Loop  (cost=0.00..2633.93 rows=1
width=506) (actual time=30.18..30.62 rows=1 loops=1)
                           ->  Index Scan using i_doc_op on a_doc d
(cost=0.00..2627.40 rows=1 width=244) (actual time=29.93..30.36 rows=1
loops=1)
                                 Index Cond: (op = 4)
                                 Filter: (nomer ~~ '%0905'::text)
                           ->  Index Scan using a_klienti_pkey on
a_klienti kl  (cost=0.00..6.01 rows=1 width=262) (actual time=0.23..0.23
rows=1 loops=1)
                                 Index Cond: ("outer".ids_ko = kl.ids)
                                 Filter: (ids_grupa = 'SOF_112'::name)
                     ->  Index Scan using i_sklad_ids_doc on a_sklad s
(cost=0.00..16200.36 rows=4079 width=288) (actual time=0.24..0.95
rows=48 loops=1)
                           Index Cond: ("outer".ids = s.ids_doc)
               ->  Index Scan using a_mesklad_pkey on a_mesklad m
(cost=0.00..6.01 rows=1 width=72) (actual time=0.02..0.02 rows=1
loops=48)
                     Index Cond: ("outer".ids = m.ids)
         ->  Index Scan using a_nomen_pkey on a_nomen nom
(cost=0.00..6.01 rows=1 width=172) (actual time=0.04..0.04 rows=1
loops=48)
               Index Cond: ("outer".ids_num = nom.ids)
 Total runtime: 36.98 msec
(19 rows)


Also S.IDS and M.IDS are name and primary key's.
I can not find my problem.
Any idea will help.
Of cours I can make the query with two selects and will work fast, but I
think it is not good solution.
regards,
ivan.


pgsql-performance by date:

Previous
From: Bruce Momjian
Date:
Subject: Re: FW: [ADMIN] Shared_buffers and kernel parameters, tuning
Next
From: Vincent van Leeuwen
Date:
Subject: tweaking costs to favor nestloop