Re: Help optimize view - Mailing list pgsql-performance

From Relyea, Mike
Subject Re: Help optimize view
Date
Msg-id 1806D1F73FCB7F439F2C842EE0627B18065F7A86@USA0300MS01.na.xerox.net
Whole thread Raw
In response to Re: Help optimize view  ("Kevin Grittner" <Kevin.Grittner@wicourts.gov>)
Responses Re: Help optimize view  ("Kevin Grittner" <Kevin.Grittner@wicourts.gov>)
List pgsql-performance
> >>> On Mon, Aug 13, 2007 at  1:48 PM, in message
> <1806D1F73FCB7F439F2C842EE0627B18065F78DF@USA0300MS01.na.xerox
> .net>, "Relyea, Mike" <Mike.Relyea@xerox.com> wrote:
> > I've increased shared_buffers to 128MB, and restarted the
> server.  My
> > total run time didn't really change.
>
> Please forgive me if this guess doesn't help either, but
> could you try eliminating the GROUP BY options which don't
> echo values in the select value list, and move the HAVING
> conditions to a WHERE clause?  Something like:
>
> explain analyze
> SELECT
>     "PrintSamples"."MachineID",
>     "PrintSamples"."PrintCopyID",
>     "tblColors"."ColorID",
>     avg("ParameterValues"."ParameterValue") AS "Mottle_NMF"
>   FROM "AnalysisModules"
>   JOIN
>   (
>     "tblColors"
>     JOIN
>     (
>       "tblTPNamesAndColors"
>       JOIN "PrintSamples"
>         ON ("tblTPNamesAndColors"."TestPatternName"::text =
> "PrintSamples"."TestPatternName"::text)
>       JOIN
>       (
>         "DigitalImages"
>         JOIN "PrintSampleAnalyses"
>           ON ("DigitalImages"."ImageID" =
> "PrintSampleAnalyses"."ImageID")
>         JOIN
>         (
>           "ParameterNames"
>           JOIN
>           (
>             "Measurements"
>             JOIN "ParameterValues"
>               ON "Measurements"."MeasurementID" =
> "ParameterValues"."MeasurementID"
>           ) ON "ParameterNames"."ParameterID" =
> "ParameterValues"."ParameterID"
>         ) ON "PrintSampleAnalyses"."psaID" = "Measurements"."psaID"
>       ) ON "PrintSamples"."PrintSampleID" =
> "DigitalImages"."PrintSampleID"
>     ) ON "tblColors"."ColorID" = "tblTPNamesAndColors"."ColorID"
>   ) ON "AnalysisModules"."MetricID" = "Measurements"."MetricID"
>   WHERE "AnalysisModules"."AnalysisModuleName"::text = 'NMF'::text
>     AND "ParameterNames"."ParameterName"::text = 'NMF'::text
>     AND "PrintSamples"."TestPatternName"::text ~~ 'IQAF-TP8%'::text
>     AND "tblColors"."ColorID" <> 3
>   GROUP BY
>     "PrintSamples"."MachineID",
>     "PrintSamples"."PrintCopyID",
>     "tblColors"."ColorID"
> ;
>
> I'd also be inclined to simplify the FROM clause by
> eliminating the parentheses and putting the ON conditions
> closer to where they are used, but that would be more for
> readability than any expectation that it would affect the plan.
>
> -Kevin

Thanks for your help.  Re-writing the view like this maybe bought me
something.  I've pasted the explain analyze results below.  Tough to
tell because I also increased some of the statistics.  From what Tom
says, it sounds like if I want the data returned faster I'm likely to
have to get beefier hardware.

ALTER TABLE "ParameterValues" ALTER "MeasurementID" SET STATISTICS 500;

ALTER TABLE "ParameterValues" ALTER "ParameterID" SET STATISTICS 500;

ANALYZE "ParameterValues";

ALTER TABLE "Measurements" ALTER COLUMN "MetricID" SET STATISTICS 500;

ALTER TABLE "Measurements" ALTER COLUMN "psaID" SET STATISTICS 500;

ANALYZE "Measurements";

Running the above SQL:

HashAggregate (cost=461541.53..461634.88 rows=7468 width=16) (actual
time=110002.041..110024.777 rows=14853 loops=1)
 -> Hash Join (cost=230789.57..461464.70 rows=7683 width=16) (actual
time=56847.814..109936.722 rows=15123 loops=1)
    Hash Cond: ("Measurements"."MetricID" =
"AnalysisModules"."MetricID")
    -> Hash Join (cost=230787.80..461057.64 rows=87588 width=20) (actual
time=56847.697..109884.122 rows=15123 loops=1)
       Hash Cond: ("ParameterValues"."MeasurementID" =
"Measurements"."MeasurementID")
       -> Nested Loop (cost=6353.15..234044.47 rows=454038 width=8)
(actual time=179.154..52780.680 rows=289724 loops=1)
          -> Seq Scan on "ParameterNames" (cost=0.00..1.94 rows=1
width=4) (actual time=0.012..0.027 rows=1 loops=1)
             Filter: (("ParameterName")::text = 'NMF'::text)
          -> Bitmap Heap Scan on "ParameterValues"
(cost=6353.15..228047.32 rows=479617 width=12) (actual
time=179.123..52102.572 rows=289724 loops=1)
             Recheck Cond: ("ParameterNames"."ParameterID" =
"ParameterValues"."ParameterID")
             -> Bitmap Index Scan on "PVParameterID_idx"
(cost=0.00..6233.25 rows=479617 width=0) (actual time=152.752..152.752
rows=289724 loops=1)
                Index Cond: ("ParameterNames"."ParameterID" =
"ParameterValues"."ParameterID")
       -> Hash (cost=206253.42..206253.42 rows=1454498 width=20) (actual
time=56657.022..56657.022 rows=961097 loops=1)
          -> Nested Loop (cost=5069.24..206253.42 rows=1454498 width=20)
(actual time=932.249..55176.315 rows=961097 loops=1)
             -> Hash Join (cost=5069.24..7949.67 rows=15206 width=16)
(actual time=908.275..1257.120 rows=18901 loops=1)
                Hash Cond: ("PrintSampleAnalyses"."ImageID" =
"DigitalImages"."ImageID")
                -> Seq Scan on "PrintSampleAnalyses" (cost=0.00..2334.25
rows=78825 width=8) (actual time=10.440..139.945 rows=78859 loops=1)
                -> Hash (cost=4879.10..4879.10 rows=15211 width=16)
(actual time=897.776..897.776 rows=18901 loops=1)
                   -> Hash Join (cost=2220.11..4879.10 rows=15211
width=16) (actual time=297.330..868.632 rows=18901 loops=1)
                      Hash Cond: ("DigitalImages"."PrintSampleID" =
"PrintSamples"."PrintSampleID")
                      -> Seq Scan on "DigitalImages" (cost=0.00..1915.50
rows=78850 width=8) (actual time=15.859..408.784 rows=78859 loops=1)
                      -> Hash (cost=2029.98..2029.98 rows=15211
width=16) (actual time=281.413..281.413 rows=18645 loops=1)
                         -> Hash Join (cost=564.39..2029.98 rows=15211
width=16) (actual time=84.182..251.833 rows=18645 loops=1)
                            Hash Cond:
(("PrintSamples"."TestPatternName")::text =
("tblTPNamesAndColors"."TestPatternName")::text)
                            -> Bitmap Heap Scan on "PrintSamples"
(cost=561.39..1781.53 rows=24891 width=40) (actual time=83.925..184.775
rows=24914 loops=1)
                               Filter: (("TestPatternName")::text ~~
'IQAF-TP8%'::text)
                               -> Bitmap Index Scan on
"PSTestPatternName_idx" (cost=0.00..555.17 rows=24891 width=0) (actual
time=74.198..74.198 rows=24914 loops=1)
                                  Index Cond:
((("TestPatternName")::text >= 'IQAF-TP8'::character varying) AND
(("TestPatternName")::text < 'IQAF-TP9'::character varying))
                            -> Hash (cost=2.72..2.72 rows=22 width=30)
(actual time=0.225..0.225 rows=21 loops=1)
                               -> Hash Join (cost=1.09..2.72 rows=22
width=30) (actual time=0.086..0.184 rows=21 loops=1)
                                  Hash Cond:
("tblTPNamesAndColors"."ColorID" = "tblColors"."ColorID")
                                  -> Seq Scan on "tblTPNamesAndColors"
(cost=0.00..1.30 rows=30 width=30) (actual time=0.025..0.060 rows=30
loops=1)
                                  -> Hash (cost=1.05..1.05 rows=3
width=4) (actual time=0.040..0.040 rows=3 loops=1)
                                     -> Seq Scan on "tblColors"
(cost=0.00..1.05 rows=3 width=4) (actual time=0.021..0.027 rows=3
loops=1)
                                        Filter: ("ColorID" <> 3)
             -> Index Scan using "MpsaID_idx" on "Measurements"
(cost=0.00..11.13 rows=153 width=12) (actual time=1.615..2.728 rows=51
loops=18901)
                Index Cond: ("PrintSampleAnalyses"."psaID" =
"Measurements"."psaID")
    -> Hash (cost=1.71..1.71 rows=5 width=4) (actual time=0.092..0.092
rows=5 loops=1)
       -> Seq Scan on "AnalysisModules" (cost=0.00..1.71 rows=5 width=4)
(actual time=0.060..0.077 rows=5 loops=1)
          Filter: (("AnalysisModuleName")::text = 'NMF'::text)
Total runtime: 110047.601 ms


pgsql-performance by date:

Previous
From: Philipp Specht
Date:
Subject: Stable function optimisation
Next
From: Tom Lane
Date:
Subject: Re: Stable function optimisation