Re: Help optimize view - Mailing list pgsql-performance

From Relyea, Mike
Subject Re: Help optimize view
Date
Msg-id 1806D1F73FCB7F439F2C842EE0627B18065BF2C0@USA0300MS01.na.xerox.net
Whole thread Raw
In response to Re: Help optimize view  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Help optimize view  ("Kevin Grittner" <Kevin.Grittner@wicourts.gov>)
List pgsql-performance
> -----Original Message-----
> From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
> Sent: Friday, August 10, 2007 5:44 PM
> To: Relyea, Mike
> Cc: pgsql-performance@postgresql.org
> Subject: Re: [PERFORM] Help optimize view
>
> Try increasing join_collapse_limit --- you have just enough
> tables here that the planner isn't going to consider all
> possible join orders.
> And it sorta looks like it's picking a bad one.
>
>             regards, tom lane
>

I tried increasing join_collapse_limit with no significant change in run
time although a different plan was chosen.

I've included a re-send of my original post, it looks like it didn't go
through - it's not in the archives.  I've also included an explain
analyze before and after the join_collapse_limit change.

I'm have the following view as part of a larger, aggregate query that is
running slower than I'd like.  There are 4 views total, each very
similar to this one.  Each of the views is then left joined with data
from some other tables to give me the final result that I'm looking for.

I'm hoping that if I can get some insight in to how to make this view
execute faster, I can apply that learning to the other 3 views and
thereby decrease the run time for my aggregate query.

I'm running 8.2.4 on Windows XP with a single 10K rpm disk dedicated to
the data directory and 1.5 GB memory.
shared_buffers = 12288
work_mem = 262144
maintenance_work_mem = 131072
max_fsm_pages = 204800
random_page_cost = 2.0
effective_cache_size = 10000
autovacuum = on

====================================

EXPLAIN ANALYZE SELECT "PrintSamples"."MachineID",
"PrintSamples"."PrintCopyID", "tblColors"."ColorID",
avg("ParameterValues"."ParameterValue") AS "Mottle_NMF"
   FROM "AnalysisModules"
   JOIN ("tblColors"
   JOIN ("tblTPNamesAndColors"
   JOIN "PrintSamples" ON "tblTPNamesAndColors"."TestPatternName"::text
= "PrintSamples"."TestPatternName"::text
   JOIN ("DigitalImages"
   JOIN "PrintSampleAnalyses" ON "DigitalImages"."ImageID" =
"PrintSampleAnalyses"."ImageID"
   JOIN ("ParameterNames"
   JOIN ("Measurements"
   JOIN "ParameterValues" ON "Measurements"."MeasurementID" =
"ParameterValues"."MeasurementID") ON "ParameterNames"."ParameterID" =
"ParameterValues"."ParameterID") ON "PrintSampleAnalyses"."psaID" =
"Measurements"."psaID") ON "PrintSamples"."PrintSampleID" =
"DigitalImages"."PrintSampleID") ON "tblColors"."ColorID" =
"tblTPNamesAndColors"."ColorID") ON "AnalysisModules"."MetricID" =
"Measurements"."MetricID"
  GROUP BY "PrintSamples"."MachineID", "PrintSamples"."PrintCopyID",
"tblColors"."ColorID", "AnalysisModules"."AnalysisModuleName",
"ParameterNames"."ParameterName", "PrintSamples"."TestPatternName"
 HAVING "AnalysisModules"."AnalysisModuleName"::text = 'NMF'::text AND
"ParameterNames"."ParameterName"::text = 'NMF'::text AND
"tblColors"."ColorID" <> 3 AND "PrintSamples"."TestPatternName"::text ~~
'IQAF-TP8%'::text;

HashAggregate (cost=519801.96..519898.00 rows=7683 width=70) (actual
time=121101.027..121146.385 rows=14853 loops=1)
 -> Hash Join (cost=286101.76..519667.51 rows=7683 width=70) (actual
time=52752.600..120989.713 rows=15123 loops=1)
    Hash Cond: ("Measurements"."MetricID" =
"AnalysisModules"."MetricID")
    -> Hash Join (cost=286099.98..519260.45 rows=87588 width=61) (actual
time=52752.502..120933.784 rows=15123 loops=1)
       Hash Cond: ("ParameterValues"."MeasurementID" =
"Measurements"."MeasurementID")
       -> Nested Loop (cost=8054.81..238636.75 rows=454040 width=21)
(actual time=165.510..67811.086 rows=289724 loops=1)
          -> Seq Scan on "ParameterNames" (cost=0.00..1.94 rows=1
width=17) (actual time=0.012..0.026 rows=1 loops=1)
             Filter: (("ParameterName")::text = 'NMF'::text)
          -> Bitmap Heap Scan on "ParameterValues"
(cost=8054.81..231033.70 rows=608089 width=12) (actual
time=165.481..67094.656 rows=289724 loops=1)
             Recheck Cond: ("ParameterNames"."ParameterID" =
"ParameterValues"."ParameterID")
             -> Bitmap Index Scan on "PVParameterID_idx"
(cost=0.00..7902.79 rows=608089 width=0) (actual time=141.013..141.013
rows=289724 loops=1)
                Index Cond: ("ParameterNames"."ParameterID" =
"ParameterValues"."ParameterID")
       -> Hash (cost=259861.12..259861.12 rows=1454724 width=48) (actual
time=52573.270..52573.270 rows=961097 loops=1)
          -> Hash Join (cost=8139.75..259861.12 rows=1454724 width=48)
(actual time=1399.575..50896.641 rows=961097 loops=1)
             Hash Cond: ("Measurements"."psaID" =
"PrintSampleAnalyses"."psaID")
             -> Seq Scan on "Measurements" (cost=0.00..199469.09
rows=7541009 width=12) (actual time=6.697..37199.702 rows=7539838
loops=1)
             -> Hash (cost=7949.67..7949.67 rows=15206 width=44) (actual
time=1392.743..1392.743 rows=18901 loops=1)
                -> Hash Join (cost=5069.24..7949.67 rows=15206 width=44)
(actual time=986.589..1358.908 rows=18901 loops=1)
                   Hash Cond: ("PrintSampleAnalyses"."ImageID" =
"DigitalImages"."ImageID")
                   -> Seq Scan on "PrintSampleAnalyses"
(cost=0.00..2334.25 rows=78825 width=8) (actual time=13.747..158.867
rows=78859 loops=1)
                   -> Hash (cost=4879.10..4879.10 rows=15211 width=44)
(actual time=972.787..972.787 rows=18901 loops=1)
                      -> Hash Join (cost=2220.11..4879.10 rows=15211
width=44) (actual time=341.158..938.970 rows=18901 loops=1)
                         Hash Cond: ("DigitalImages"."PrintSampleID" =
"PrintSamples"."PrintSampleID")
                         -> Seq Scan on "DigitalImages"
(cost=0.00..1915.50 rows=78850 width=8) (actual time=34.028..418.113
rows=78859 loops=1)
                         -> Hash (cost=2029.98..2029.98 rows=15211
width=44) (actual time=307.073..307.073 rows=18645 loops=1)
                            -> Hash Join (cost=564.39..2029.98
rows=15211 width=44) (actual time=92.565..275.879 rows=18645 loops=1)
                               Hash Cond:
(("PrintSamples"."TestPatternName")::text =
("tblTPNamesAndColors"."TestPatternName")::text)
                               -> Bitmap Heap Scan on "PrintSamples"
(cost=561.39..1781.53 rows=24891 width=40) (actual time=92.296..208.635
rows=24914 loops=1)
                                  Filter: (("TestPatternName")::text ~~
'IQAF-TP8%'::text)
                                  -> Bitmap Index Scan on
"PSTestPatternName_idx" (cost=0.00..555.17 rows=24891 width=0) (actual
time=76.711..76.711 rows=24914 loops=1)
                                     Index Cond:
((("TestPatternName")::text >= 'IQAF-TP8'::character varying) AND
(("TestPatternName")::text < 'IQAF-TP9'::character varying))
                               -> Hash (cost=2.72..2.72 rows=22
width=30) (actual time=0.238..0.238 rows=21 loops=1)
                                  -> Hash Join (cost=1.09..2.72 rows=22
width=30) (actual time=0.097..0.196 rows=21 loops=1)
                                     Hash Cond:
("tblTPNamesAndColors"."ColorID" = "tblColors"."ColorID")
                                     -> Seq Scan on
"tblTPNamesAndColors" (cost=0.00..1.30 rows=30 width=30) (actual
time=0.046..0.080 rows=30 loops=1)
                                     -> Hash (cost=1.05..1.05 rows=3
width=4) (actual time=0.028..0.028 rows=3 loops=1)
                                        -> Seq Scan on "tblColors"
(cost=0.00..1.05 rows=3 width=4) (actual time=0.009..0.016 rows=3
loops=1)
                                           Filter: ("ColorID" <> 3)
    -> Hash (cost=1.71..1.71 rows=5 width=17) (actual time=0.072..0.072
rows=5 loops=1)
       -> Seq Scan on "AnalysisModules" (cost=0.00..1.71 rows=5
width=17) (actual time=0.036..0.054 rows=5 loops=1)
          Filter: (("AnalysisModuleName")::text = 'NMF'::text)
Total runtime: 121178.595 ms

============================================

SELECT set_config('join_collapse_limit', '20', false);

EXPLAIN ANALYZE SELECT "PrintSamples"."MachineID",
"PrintSamples"."PrintCopyID", "tblColors"."ColorID",
avg("ParameterValues"."ParameterValue") AS "Mottle_NMF"
   FROM "AnalysisModules"
   JOIN ("tblColors"
   JOIN ("tblTPNamesAndColors"
   JOIN "PrintSamples" ON "tblTPNamesAndColors"."TestPatternName"::text
= "PrintSamples"."TestPatternName"::text
   JOIN ("DigitalImages"
   JOIN "PrintSampleAnalyses" ON "DigitalImages"."ImageID" =
"PrintSampleAnalyses"."ImageID"
   JOIN ("ParameterNames"
   JOIN ("Measurements"
   JOIN "ParameterValues" ON "Measurements"."MeasurementID" =
"ParameterValues"."MeasurementID") ON "ParameterNames"."ParameterID" =
"ParameterValues"."ParameterID") ON "PrintSampleAnalyses"."psaID" =
"Measurements"."psaID") ON "PrintSamples"."PrintSampleID" =
"DigitalImages"."PrintSampleID") ON "tblColors"."ColorID" =
"tblTPNamesAndColors"."ColorID") ON "AnalysisModules"."MetricID" =
"Measurements"."MetricID"
  GROUP BY "PrintSamples"."MachineID", "PrintSamples"."PrintCopyID",
"tblColors"."ColorID", "AnalysisModules"."AnalysisModuleName",
"ParameterNames"."ParameterName", "PrintSamples"."TestPatternName"
 HAVING "AnalysisModules"."AnalysisModuleName"::text = 'NMF'::text AND
"ParameterNames"."ParameterName"::text = 'NMF'::text AND
"tblColors"."ColorID" <> 3 AND "PrintSamples"."TestPatternName"::text ~~
'IQAF-TP8%'::text;

HashAggregate (cost=489274.71..489372.94 rows=7858 width=70) (actual
time=120391.220..120420.367 rows=14853 loops=1)
 -> Hash Join (cost=256774.03..489137.20 rows=7858 width=70) (actual
time=51021.953..120276.494 rows=15123 loops=1)
    Hash Cond: ("ParameterValues"."MeasurementID" =
"Measurements"."MeasurementID")
    -> Nested Loop (cost=8054.81..238636.75 rows=454040 width=21)
(actual time=159.781..68959.258 rows=289724 loops=1)
       -> Seq Scan on "ParameterNames" (cost=0.00..1.94 rows=1 width=17)
(actual time=0.021..0.039 rows=1 loops=1)
          Filter: (("ParameterName")::text = 'NMF'::text)
       -> Bitmap Heap Scan on "ParameterValues" (cost=8054.81..231033.70
rows=608089 width=12) (actual time=159.740..68235.713 rows=289724
loops=1)
          Recheck Cond: ("ParameterNames"."ParameterID" =
"ParameterValues"."ParameterID")
          -> Bitmap Index Scan on "PVParameterID_idx"
(cost=0.00..7902.79 rows=608089 width=0) (actual time=135.166..135.166
rows=289724 loops=1)
             Index Cond: ("ParameterNames"."ParameterID" =
"ParameterValues"."ParameterID")
    -> Hash (cost=247087.84..247087.84 rows=130510 width=57) (actual
time=50844.324..50844.324 rows=15123 loops=1)
       -> Hash Join (cost=8141.52..247087.84 rows=130510 width=57)
(actual time=11034.877..50791.185 rows=15123 loops=1)
          Hash Cond: ("Measurements"."psaID" =
"PrintSampleAnalyses"."psaID")
          -> Hash Join (cost=1.77..234364.57 rows=661492 width=21)
(actual time=31.302..48949.943 rows=289724 loops=1)
             Hash Cond: ("Measurements"."MetricID" =
"AnalysisModules"."MetricID")
             -> Seq Scan on "Measurements" (cost=0.00..199469.09
rows=7541009 width=12) (actual time=10.700..37931.726 rows=7539838
loops=1)
             -> Hash (cost=1.71..1.71 rows=5 width=17) (actual
time=0.066..0.066 rows=5 loops=1)
                -> Seq Scan on "AnalysisModules" (cost=0.00..1.71 rows=5
width=17) (actual time=0.033..0.049 rows=5 loops=1)
                   Filter: (("AnalysisModuleName")::text = 'NMF'::text)
          -> Hash (cost=7949.67..7949.67 rows=15206 width=44) (actual
time=1325.797..1325.797 rows=18901 loops=1)
             -> Hash Join (cost=5069.24..7949.67 rows=15206 width=44)
(actual time=906.105..1290.289 rows=18901 loops=1)
                Hash Cond: ("PrintSampleAnalyses"."ImageID" =
"DigitalImages"."ImageID")
                -> Seq Scan on "PrintSampleAnalyses" (cost=0.00..2334.25
rows=78825 width=8) (actual time=4.456..153.999 rows=78859 loops=1)
                -> Hash (cost=4879.10..4879.10 rows=15211 width=44)
(actual time=901.596..901.596 rows=18901 loops=1)
                   -> Hash Join (cost=2220.11..4879.10 rows=15211
width=44) (actual time=293.264..866.364 rows=18901 loops=1)
                      Hash Cond: ("DigitalImages"."PrintSampleID" =
"PrintSamples"."PrintSampleID")
                      -> Seq Scan on "DigitalImages" (cost=0.00..1915.50
rows=78850 width=8) (actual time=21.967..380.287 rows=78859 loops=1)
                      -> Hash (cost=2029.98..2029.98 rows=15211
width=44) (actual time=271.232..271.232 rows=18645 loops=1)
                         -> Hash Join (cost=564.39..2029.98 rows=15211
width=44) (actual time=60.780..237.748 rows=18645 loops=1)
                            Hash Cond:
(("PrintSamples"."TestPatternName")::text =
("tblTPNamesAndColors"."TestPatternName")::text)
                            -> Bitmap Heap Scan on "PrintSamples"
(cost=561.39..1781.53 rows=24891 width=40) (actual time=60.482..168.602
rows=24914 loops=1)
                               Filter: (("TestPatternName")::text ~~
'IQAF-TP8%'::text)
                               -> Bitmap Index Scan on
"PSTestPatternName_idx" (cost=0.00..555.17 rows=24891 width=0) (actual
time=52.269..52.269 rows=24914 loops=1)
                                  Index Cond:
((("TestPatternName")::text >= 'IQAF-TP8'::character varying) AND
(("TestPatternName")::text < 'IQAF-TP9'::character varying))
                            -> Hash (cost=2.72..2.72 rows=22 width=30)
(actual time=0.266..0.266 rows=21 loops=1)
                               -> Hash Join (cost=1.09..2.72 rows=22
width=30) (actual time=0.120..0.223 rows=21 loops=1)
                                  Hash Cond:
("tblTPNamesAndColors"."ColorID" = "tblColors"."ColorID")
                                  -> Seq Scan on "tblTPNamesAndColors"
(cost=0.00..1.30 rows=30 width=30) (actual time=0.025..0.059 rows=30
loops=1)
                                  -> Hash (cost=1.05..1.05 rows=3
width=4) (actual time=0.068..0.068 rows=3 loops=1)
                                     -> Seq Scan on "tblColors"
(cost=0.00..1.05 rows=3 width=4) (actual time=0.048..0.054 rows=3
loops=1)
                                        Filter: ("ColorID" <> 3)
Total runtime: 120443.640 ms

pgsql-performance by date:

Previous
From: "Jonathan Ellis"
Date:
Subject: Re: How to ENABLE SQL capturing???
Next
From: "Joshua D. Drake"
Date:
Subject: Re: How to ENABLE SQL capturing???