Big diference in response time (query plan question) - Mailing list pgsql-performance

From Luiz K. Matsumura
Subject Big diference in response time (query plan question)
Date
Msg-id 44E26939.30903@planit.com.br
Whole thread Raw
Responses Re: Big diference in response time (query plan question)
List pgsql-performance
Hi all,

I have PostgreSQL 8.1.4  running on a P 4 2.8 GHz , 512 MB with Linux
(Fedora Core 3)

The SQL comands below have a performance diference that I think is not
so much acceptable ( 1035.427 ms vs 7.209 ms ), since the tables isn´t
so much big ( contrato have 1907 rows and prog have 40.002 rows )
Can I make some optimization here ?

 EXPLAIN ANALYZE
 SELECT   Contrato.Id
      , Min( prog.dtsemeio   ) AS DtSemIni
      , Max( prog.dtsemeio   ) AS DtSemFim
      , Min( prog.dtembarque ) AS DtEmbIni
      , Max( prog.dtembarque ) AS DtEmbFim
      , Min( prog.dtentrega  ) AS DtEntIni
      , Max( prog.dtentrega  ) AS DtEntFim
      , COUNT(prog.*) AS QtSem
      , SUM( CASE   WHEN Prog.DtSemeio >= '20060814' THEN 1 ELSE 0 END )
AS QtSemAb
 FROM     bvz.Contrato
        LEFT OUTER JOIN bvz.Prog ON prog.Fk_Contrato = Contrato.Id
 WHERE    Contrato.Fk_Clifor = 243
 GROUP BY 1;
                                                                   QUERY
PLAN

------------------------------------------------------------------------------------------------------------------------------------------------
 GroupAggregate  (cost=5477.34..5706.84 rows=41 width=48) (actual
time=883.721..1031.159 rows=41 loops=1)
   ->  Merge Left Join  (cost=5477.34..5686.15 rows=860 width=48)
(actual time=868.038..1026.988 rows=1366 loops=1)
         Merge Cond: ("outer".id = "inner".fk_contrato)
         ->  Sort  (cost=50.39..50.49 rows=41 width=4) (actual
time=0.614..0.683 rows=41 loops=1)
               Sort Key: contrato.id
               ->  Bitmap Heap Scan on contrato  (cost=2.14..49.29
rows=41 width=4) (actual time=0.163..0.508 rows=41 loops=1)
                     Recheck Cond: (fk_clifor = 243)
                     ->  Bitmap Index Scan on fki_contrato_clifor
(cost=0.00..2.14 rows=41 width=0) (actual time=0.146..0.146 rows=41 loops=1)
                           Index Cond: (fk_clifor = 243)
         ->  Sort  (cost=5426.95..5526.95 rows=40002 width=48) (actual
time=862.192..956.903 rows=38914 loops=1)
               Sort Key: prog.fk_contrato
               ->  Seq Scan on prog  (cost=0.00..1548.02 rows=40002
width=48) (actual time=0.044..169.795 rows=40002 loops=1)
 Total runtime: 1035.427 ms


EXPLAIN ANALYZE
SELECT   Contrato.Id
     , Min( prog.dtsemeio   ) AS DtSemIni
     , Max( prog.dtsemeio   ) AS DtSemFim
     , Min( prog.dtembarque ) AS DtEmbIni
     , Max( prog.dtembarque ) AS DtEmbFim
     , Min( prog.dtentrega  ) AS DtEntIni
     , Max( prog.dtentrega  ) AS DtEntFim
     , COUNT(prog.*) AS QtSem
     , SUM( CASE   WHEN Prog.DtSemeio >= '20060814' THEN 1 ELSE 0 END )
AS QtSemAb
FROM     bvz.Contrato
      LEFT OUTER JOIN bvz.Prog ON prog.Fk_Contrato = Contrato.Id
WHERE    Contrato.Fk_Clifor = 352
GROUP BY 1;
                                                               QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------------------
 GroupAggregate  (cost=2.16..4588.74 rows=28 width=48) (actual
time=2.196..7.027 rows=28 loops=1)
   ->  Nested Loop Left Join  (cost=2.16..4574.63 rows=587 width=48)
(actual time=2.042..6.154 rows=223 loops=1)
         ->  Index Scan using pk_contrato on contrato
(cost=0.00..100.92 rows=28 width=4) (actual time=1.842..3.045 rows=28
loops=1)
               Filter: (fk_clifor = 352)
         ->  Bitmap Heap Scan on prog  (cost=2.16..159.19 rows=47
width=48) (actual time=0.040..0.080 rows=8 loops=28)
               Recheck Cond: (prog.fk_contrato = "outer".id)
               ->  Bitmap Index Scan on fki_prog_contrato
(cost=0.00..2.16 rows=47 width=0) (actual time=0.018..0.018 rows=8 loops=28)
                     Index Cond: (prog.fk_contrato = "outer".id)
 Total runtime: 7.209 ms



I think that the problem is in "LEFT OUTER JOIN" because when I run the
queries with a inner join I have more consistent times,
although the query plan above is a champion :


EXPLAIN ANALYZE
SELECT   Contrato.Id
     , Min( prog.dtsemeio   ) AS DtSemIni
     , Max( prog.dtsemeio   ) AS DtSemFim
     , Min( prog.dtembarque ) AS DtEmbIni
     , Max( prog.dtembarque ) AS DtEmbFim
     , Min( prog.dtentrega  ) AS DtEntIni
     , Max( prog.dtentrega  ) AS DtEntFim
     , COUNT(prog.*) AS QtSem
     , SUM( CASE   WHEN Prog.DtSemeio >= '20060814' THEN 1 ELSE 0 END )
AS QtSemAb
FROM     bvz.Contrato
       JOIN bvz.Prog ON prog.Fk_Contrato = Contrato.Id
WHERE    Contrato.Fk_Clifor = 243
GROUP BY 1;
                                                         QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------------------------
 HashAggregate  (cost=1825.38..1826.71 rows=41 width=48) (actual
time=222.671..222.788 rows=41 loops=1)
   ->  Hash Join  (cost=49.40..1806.03 rows=860 width=48) (actual
time=2.040..217.963 rows=1366 loops=1)
         Hash Cond: ("outer".fk_contrato = "inner".id)
         ->  Seq Scan on prog  (cost=0.00..1548.02 rows=40002 width=48)
(actual time=0.047..150.636 rows=40002 loops=1)
         ->  Hash  (cost=49.29..49.29 rows=41 width=4) (actual
time=0.766..0.766 rows=41 loops=1)
               ->  Bitmap Heap Scan on contrato  (cost=2.14..49.29
rows=41 width=4) (actual time=0.146..0.669 rows=41 loops=1)
                     Recheck Cond: (fk_clifor = 243)
                     ->  Bitmap Index Scan on fki_contrato_clifor
(cost=0.00..2.14 rows=41 width=0) (actual time=0.101..0.101 rows=41 loops=1)
                           Index Cond: (fk_clifor = 243)
 Total runtime: 223.230 ms


EXPLAIN ANALYZE
SELECT   Contrato.Id
     , Min( prog.dtsemeio   ) AS DtSemIni
     , Max( prog.dtsemeio   ) AS DtSemFim
     , Min( prog.dtembarque ) AS DtEmbIni
     , Max( prog.dtembarque ) AS DtEmbFim
     , Min( prog.dtentrega  ) AS DtEntIni
     , Max( prog.dtentrega  ) AS DtEntFim
     , COUNT(prog.*) AS QtSem
     , SUM( CASE   WHEN Prog.DtSemeio >= '20060814' THEN 1 ELSE 0 END )
AS QtSemAb
FROM     bvz.Contrato
       JOIN bvz.Prog ON prog.Fk_Contrato = Contrato.Id
WHERE    Contrato.Fk_Clifor = 352
GROUP BY 1;
                                                                   QUERY
PLAN

------------------------------------------------------------------------------------------------------------------------------------------------
 HashAggregate  (cost=1811.50..1812.41 rows=28 width=48) (actual
time=215.214..215.291 rows=28 loops=1)
   ->  Hash Join  (cost=44.39..1798.29 rows=587 width=48) (actual
time=3.853..214.178 rows=223 loops=1)
         Hash Cond: ("outer".fk_contrato = "inner".id)
         ->  Seq Scan on prog  (cost=0.00..1548.02 rows=40002 width=48)
(actual time=0.075..150.701 rows=40002 loops=1)
         ->  Hash  (cost=44.32..44.32 rows=28 width=4) (actual
time=0.248..0.248 rows=28 loops=1)
               ->  Bitmap Heap Scan on contrato  (cost=2.10..44.32
rows=28 width=4) (actual time=0.111..0.187 rows=28 loops=1)
                     Recheck Cond: (fk_clifor = 352)
                     ->  Bitmap Index Scan on fki_contrato_clifor
(cost=0.00..2.10 rows=28 width=0) (actual time=0.101..0.101 rows=28 loops=1)
                           Index Cond: (fk_clifor = 352)
 Total runtime: 215.483 ms

Well, in this case the queries with LEFT OUTER join and with inner join
returns the same result set. I don´t have the sufficient knowledge to
affirm , but I suspect that if the query plan used for fk_clifor = 352
and with left outer join is applied for the first query (fk_clifor = 243
with left outer join)
we will have a better total runtime.
There are some manner to make this test ?
By the way (If this is a stupid idea, ignore this), this same (or a
similar) query plan cannot be used in the queries with inner join since
the difference in times ( 215.483 ms vs 7.209 ms) still significative ?



pgsql-performance by date:

Previous
From: David Lang
Date:
Subject: Re: Postgresql Performance on an HP DL385 and
Next
From: Tom Lane
Date:
Subject: Re: Postgresql Performance on an HP DL385 and