Problem with large query - Mailing list pgsql-performance

From Marc Cousin
Subject Problem with large query
Date
Msg-id 200409081549.49537.mcousin@sigma.fr
Whole thread Raw
Responses Re: Problem with large query
List pgsql-performance
Hi. I hope I'm not asking a too trivial question here...

I'm having trouble with a (quite big) query, and can't find a way to make it
faster.

Here is the information :

Tables :
============================
sces_vte -> 2753539 rows
sces_art -> 602327
sces_fsf -> 8126
sces_frc -> 7763
sces_tps -> 38
sces_gtr -> 35


Query :
===========================
SELECT
sces_gtr_art.gtr_cod,
sces_gtr_art.gtr_lib,
sces_frc_art.fou_cod,
sces_frc_art.fou_lib,
sces_tps.tps_annee_mois,
TO_NUMBER('200401','999999'),
TO_NUMBER('200405','999999'),
sces_tps.tps_libc,
sum(sces_vte.vte_mnt),
sum(sces_vte.vte_qte),
sum(sces_vte.vte_ton),
sces_famille.fsf_codfam,
sces_famille.fsf_lib,
sces_s_famille.fsf_codsfm,
sces_s_famille.fsf_lib
FROM
sces_vte,
sces_art,
sces_fsf  sces_famille,
sces_fsf  sces_s_famille,
sces_frc  sces_frc_art,
sces_tps,
sces_gtr  sces_gtr_art
WHERE
( sces_famille.fsf_codfam=sces_s_famille.fsf_codfam  )
AND  ( sces_famille.fsf_codseg= 0 and sces_famille.fsf_codsfm = 0  )
AND  ( sces_vte.tps_annee_mois=sces_tps.tps_annee_mois  )
AND  ( sces_vte.art_cod=sces_art.art_cod and
sces_vte.dos_cod=sces_art.dos_cod  )
AND  ( sces_gtr_art.gtr_cod=sces_frc_art.gtr_cod  )
AND  ( sces_frc_art.gtr_cod=sces_art.gtr_cod and
sces_frc_art.fou_cod=sces_art.fou_cod )
AND  ( sces_s_famille.fsf_codfam=sces_art.fsf_codfam and
sces_s_famille.fsf_codsfm=sces_art.fsf_codsfm  )
AND  ( sces_s_famille.fsf_codseg = 0  )
AND  (
( ( ( sces_tps.tps_annee_mois ) >= ( TO_NUMBER('200401','999999') ) and
( sces_tps.tps_annee_mois ) <= (
TO_NUMBER('200405','999999') )
)
OR
(
( sces_tps.tps_annee_mois ) >= ( TO_NUMBER('200401','999999') )-100 and
( sces_tps.tps_annee_mois ) <= (
TO_NUMBER('200405','999999') )-100
)  )
AND  ( sces_gtr_art.gtr_cod  in  (2))
)
GROUP BY
sces_gtr_art.gtr_cod,
sces_gtr_art.gtr_lib,
sces_frc_art.fou_cod,
sces_frc_art.fou_lib,
sces_tps.tps_annee_mois,
TO_NUMBER('200401','999999'),
TO_NUMBER('200405','999999'),
sces_tps.tps_libc,
sces_famille.fsf_codfam,
sces_famille.fsf_lib,
sces_s_famille.fsf_codsfm,
sces_s_famille.fsf_lib

Explain Analyze Plan :
====================================
 GroupAggregate  (cost=27161.91..27938.72 rows=16354 width=280) (actual time=484509.210..544436.148 rows=4115 loops=1)
   ->  Sort  (cost=27161.91..27202.79 rows=16354 width=280) (actual time=484496.188..485334.151 rows=799758 loops=1)
         Sort Key: sces_gtr_art.gtr_cod, sces_gtr_art.gtr_lib, sces_frc_art.fou_cod, sces_frc_art.fou_lib,
sces_tps.tps_annee_mois,200401::numeric, 200405::numeric, sces_tps.tps_libc, sces_famille.fsf_codfam,
sces_famille.fsf_lib,sces_s_famille.fsf_codsfm, sces_s_famille.fsf_lib 
         ->  Merge Join  (cost=25727.79..26017.34 rows=16354 width=280) (actual time=58945.821..69321.146 rows=799758
loops=1)
               Merge Cond: (("outer".fsf_codfam = "inner".fsf_codfam) AND ("outer".fsf_codsfm = "inner".fsf_codsfm))
               ->  Sort  (cost=301.36..304.60 rows=1298 width=83) (actual time=27.926..28.256 rows=332 loops=1)
                     Sort Key: sces_s_famille.fsf_codfam, sces_s_famille.fsf_codsfm
                     ->  Seq Scan on sces_fsf sces_s_famille  (cost=0.00..234.24 rows=1298 width=83) (actual
time=0.042..19.124rows=1341 loops=1) 
                           Filter: (fsf_codseg = 0::numeric)
               ->  Sort  (cost=25426.43..25448.05 rows=8646 width=225) (actual time=58917.106..59693.810 rows=799758
loops=1)
                     Sort Key: sces_art.fsf_codfam, sces_art.fsf_codsfm
                     ->  Merge Join  (cost=24726.32..24861.08 rows=8646 width=225) (actual time=19036.709..29404.943
rows=799758loops=1) 
                           Merge Cond: ("outer".tps_annee_mois = "inner".tps_annee_mois)
                           ->  Sort  (cost=2.49..2.53 rows=17 width=23) (actual time=0.401..0.428 rows=20 loops=1)
                                 Sort Key: sces_tps.tps_annee_mois
                                 ->  Seq Scan on sces_tps  (cost=0.00..2.14 rows=17 width=23) (actual time=0.068..0.333
rows=20loops=1) 
                                       Filter: (((tps_annee_mois >= 200301::numeric) OR (tps_annee_mois >=
200401::numeric))AND ((tps_annee_mois <= 200305::numeric) OR (tps_annee_mois >= 200401::numeric)) AND ((tps_annee_mois
>=200301::numeric) OR (tps_annee_mois <= 200405::numeric)) AND ((tps_annee_mois <= 200305::numeric) OR (tps_annee_mois
<=200405::numeric))) 
                           ->  Sort  (cost=24723.83..24747.97 rows=9656 width=214) (actual time=19036.223..19917.214
rows=799757loops=1) 
                                 Sort Key: sces_vte.tps_annee_mois
                                 ->  Nested Loop  (cost=21825.09..24084.74 rows=9656 width=214) (actual
time=417.603..8644.294rows=399879 loops=1) 
                                       ->  Nested Loop  (cost=21825.09..21837.50 rows=373 width=195) (actual
time=417.444..672.741rows=14158 loops=1) 
                                             ->  Seq Scan on sces_gtr sces_gtr_art  (cost=0.00..1.44 rows=1 width=40)
(actualtime=0.026..0.085 rows=1 loops=1) 
                                                   Filter: (gtr_cod = 2::numeric)
                                             ->  Merge Join  (cost=21825.09..21832.34 rows=373 width=165) (actual
time=417.400..568.247rows=14158 loops=1) 
                                                   Merge Cond: ("outer".fsf_codfam = "inner".fsf_codfam)
                                                   ->  Sort  (cost=255.24..255.30 rows=24 width=74) (actual
time=16.597..16.692rows=106 loops=1) 
                                                         Sort Key: sces_famille.fsf_codfam
                                                         ->  Seq Scan on sces_fsf sces_famille  (cost=0.00..254.69
rows=24width=74) (actual time=0.029..15.971 rows=155 loops=1) 
                                                               Filter: ((fsf_codseg = 0::numeric) AND (fsf_codsfm =
0::numeric))
                                                   ->  Sort  (cost=21569.85..21571.64 rows=715 width=91) (actual
time=400.631..416.871rows=14162 loops=1) 
                                                         Sort Key: sces_art.fsf_codfam
                                                         ->  Nested Loop  (cost=0.00..21535.95 rows=715 width=91)
(actualtime=1.320..230.975 rows=14162 loops=1) 
                                                               ->  Seq Scan on sces_frc sces_frc_art
(cost=0.00..182.75rows=728 width=51) (actual time=1.195..14.316 rows=761 loops=1) 
                                                                     Filter: (2::numeric = gtr_cod)
                                                               ->  Index Scan using ind_art_02 on sces_art
(cost=0.00..29.24rows=7 width=61) (actual time=0.040..0.160 rows=19 loops=761) 
                                                                     Index Cond: ((2::numeric = sces_art.gtr_cod) AND
("outer".fou_cod= sces_art.fou_cod)) 
                                       ->  Index Scan using idx_vte_02 on sces_vte  (cost=0.00..6.01 rows=1 width=62)
(actualtime=0.037..0.259 rows=28 loops=14158) 
                                             Index Cond: ((sces_vte.art_cod = "outer".art_cod) AND (sces_vte.dos_cod =
"outer".dos_cod))
 Total runtime: 545435.989 ms



From what I understand from the plan, the worst part of it is the sort. Is there a way I can improve this query ?
(Obviously, as it has many rows, it will still be a slow query, but here it's too slow for us...).

I allready extended the sort_mem (up to 500 MB to be sure, the server has plenty of RAM),
the query has become faster, but I don't know what else to do, to speed up the sort.

BTW the query was generated, not written. We allready are trying to write something better, but we are still facing a
gigantic sort at the end (we need the group by, and there are many lines from the main table (sces_vte) to be retrieved
andaggregated)... 

Thanks in advance...

pgsql-performance by date:

Previous
From: Gaetano Mendola
Date:
Subject: Re: The usual sequential scan, but with LIMIT !
Next
From: Tom Lane
Date:
Subject: Re: Problem with large query