Poor query plan chosen in 9.0.3 vs 8.3.7 - Mailing list pgsql-performance

From Brian Connolly
Subject Poor query plan chosen in 9.0.3 vs 8.3.7
Date
Msg-id BANLkTin2HYy2m_Db354XMf=gDxxf7R1eCg@mail.gmail.com
Whole thread Raw
Responses Re: Poor query plan chosen in 9.0.3 vs 8.3.7  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-performance
We have a generated query from our web application which takes far longer to complete in 9.0.4, than in 8.3.7 (>60sec in 9.0.4 ~10sec in 8.3.7)

The query plan generated in 9.0, includes a Materialize step which takes the bulk of the time for the query.  If I disable materialize (by running set enable_material='off';) the query takes ~6 seconds to run (NOTE: the query plan chosen in this case is very similar to the plan chosen in 8.3.7).   From reading http://rhaas.blogspot.com/2010/04/materialization-in-postgresql-90.html, I realize that the planner will be more aggressive in choosing Materialize in 9.0.  Is there a way to modify the planner cost settings to minimize it's use in cases like this?  

Below, I have included the query in question and the "explain analyze" output for the case where enable_material is on. .   

NOTE: I will send explain analyze output for when enable_material is 'off' and information on the postgresql version, settings and server configuration in a follow up email due the length of email restrictions on the mailing list. 

(A vacuum analyze was run prior to running the queries for this email)  

Any help that you can provide would be greatly appreciated.



Query 
------------------------------------------------------------------------------------------------
 SELECT *
FROM (
 SELECT
 Peter_SizeByType.RowId AS RowId,
 Peter_SizeByType.Name AS Name,
 Peter_SizeByType.Type AS Type,
 Peter_SizeByType.ZAve AS ZAve,
 Peter_SizeByType.TimeLabel AS TimeLabel,
 Peter_SizeByType.StorageTemperature AS StorageTemperature,
 Peter_SizeByType.Pdl AS Pdl,
 Peter_SizeByType.Cumulants AS Cumulants,
 Peter_SizeByType.AnalysisTool AS AnalysisTool,
 Peter_SizeByType.meanCountRate AS meanCountRate,
 Peter_SizeByType.ExtractionNumber AS ExtractionNumber,
 Peter_SizeByType.TestNumber AS TestNumber,
 Peter_SizeByType.Sort AS Sort
 FROM (SELECT PS_2.rowid AS RowId,
 F_3.Name AS Name,
 F_3.Type AS Type,
 PS_2.zave AS ZAve,
 PS_2.timelabel AS TimeLabel,
 PS_2.storagetemperature AS StorageTemperature,
 PS_2.pdl AS Pdl,
 PS_2.cumulants AS Cumulants,
 PS_2.analysistool AS AnalysisTool,
 PS_2.meancountrate AS meanCountRate,
 PS_2.extractionnumber AS ExtractionNumber,
 PS_2.testnumber AS TestNumber,
 T_4.sort AS Sort
 FROM (SELECT c69d129_particle_size_result_fields_5.analysistool AS analysistool, c69d129_particle_size_result_fields_5.cumulants AS cumulants, c69d129_particle_size_result_fields_5.extractionnumber AS extractionnumber, c69d129_particle_size_result_fields_5.pdl AS pdl, c69d129_particle_size_result_fields_5.rowid AS rowid,
 (SELECT RunId FROM exp.Data WHERE RowId = c69d129_particle_size_result_fields_5.DataId)
  AS Run, c69d129_particle_size_result_fields_5$Run$.container AS Run_Folder, c69d129_particle_size_result_fields_5$Run$Folder$.entityid AS Run_Folder_EntityId, c69d129_particle_size_result_fields_5$Run$.name AS Run_Name, c69d129_particle_size_result_fields_5.storagetemperature AS storagetemperature, c69d129_particle_size_result_fields_5.testnumber AS testnumber, c69d129_particle_size_result_fields_5.timelabel AS timelabel, c69d129_particle_size_result_fields_5.zave AS zave, c69d129_particle_size_result_fields_5.meancountrate AS meancountrate, c69d129_particle_size_result_fields_5.rowid AS rowid1
 FROM (SELECT * FROM assayresult."c69d129_particle_size_result_fields"
 WHERE (((SELECT Container FROM exp.Data WHERE RowId = DataId) IN ('d938da12-1b43-102d-a8a2-78911b79dd1c')))) c69d129_particle_size_result_fields_5
     LEFT OUTER JOIN (SELECT * FROM exp.experimentrun
 WHERE (((protocollsid LIKE 'urn:lsid:labkey.com:Particle+SizeProtocol.Folder-69:Particle+Size'))) AND (container IN ('d938da12-1b43-102d-a8a2-78911b79dd1c'))) c69d129_particle_size_result_fields_5$Run$ ON ((SELECT RunId FROM exp.Data WHERE RowId = c69d129_particle_size_result_fields_5.DataId) = c69d129_particle_size_result_fields_5$Run$.rowid)
     LEFT OUTER JOIN core.containers c69d129_particle_size_result_fields_5$Run$Folder$ ON (c69d129_particle_size_result_fields_5$Run$.container = c69d129_particle_size_result_fields_5$Run$Folder$.entityid)) PS_2
 INNER JOIN (SELECT Formulations_6.container AS Folder, Formulations_6$Folder$.entityid AS Folder_EntityId, Formulations_6.name AS Name, Formulations_6.rowid AS RowId,
 CAST((SELECT StringValue FROM exp.ObjectProperty WHERE exp.ObjectProperty.PropertyId = 560 AND exp.ObjectProperty.ObjectId = Formulations_6$LSID$_C.ObjectId) AS VARCHAR(4000))
  AS Type
 FROM (SELECT * FROM exp.material
 WHERE (container IN ('d938da12-1b43-102d-a8a2-78911b79dd1c')) AND ((cpastype = 'urn:lsid:labkey.com:SampleSet.Folder-69:Formulations'))) Formulations_6
     LEFT OUTER JOIN core.containers Formulations_6$Folder$ ON (Formulations_6.container = Formulations_6$Folder$.entityid)
     LEFT OUTER JOIN exp.object Formulations_6$LSID$_C ON (Formulations_6.lsid = Formulations_6$LSID$_C.objecturi) AND Formulations_6.Container = 'd938da12-1b43-102d-a8a2-78911b79dd1c') F_3 ON PS_2.Run_Name=F_3.Name || '.xls' AND PS_2.Run_Folder_EntityId=F_3.Folder_EntityId
 INNER JOIN (SELECT
 CAST((SELECT FloatValue FROM exp.ObjectProperty WHERE exp.ObjectProperty.PropertyId = 1288 AND exp.ObjectProperty.ObjectId = Timepoints_7.objectid) AS INT)
  AS sort, Timepoints_7.key AS time
 FROM (SELECT * FROM exp.indexvarchar
 WHERE ((listid = 17))) Timepoints_7) T_4 ON PS_2.timelabel=T_4.time) Peter_SizeByType ) x
 ORDER BY Name ASC
 LIMIT 101


Explain Analyze output with enable_material='on'
------------------------------------------------------------------------------------------------
Limit  (cost=233190.18..233190.43 rows=101 width=71) (actual time=194078.460..194078.645 rows=101 loops=1)
   ->  Sort  (cost=233190.18..233190.44 rows=104 width=71) (actual time=194078.457..194078.520 rows=101 loops=1)
         Sort Key: material.name
         Sort Method:  top-N heapsort  Memory: 39kB
         ->  Nested Loop Left Join  (cost=3.27..233186.69 rows=104 width=71) (actual time=3996.558..193952.126 rows=67044 loops=1)
               Join Filter: ((material.container)::text = 'd938da12-1b43-102d-a8a2-78911b79dd1c'::text)
               ->  Nested Loop  (cost=3.27..232216.91 rows=104 width=155) (actual time=3996.433..190230.691 rows=67044 loops=1)
                     Join Filter: ((experimentrun.name)::text = ((material.name)::text || '.xls'::text))
                     ->  Index Scan using idx_material_lsid on material  (cost=0.00..444.90 rows=774 width=96) (actual time=0.251..14.974 rows=1303 loops=1)
                           Filter: (((container)::text = 'd938da12-1b43-102d-a8a2-78911b79dd1c'::text) AND ((cpastype)::text = 'urn:lsid:labkey.com:SampleSet.Folder-69:Formulations'::text))
                     ->  Materialize  (cost=3.27..230512.55 rows=93 width=129) (actual time=0.023..70.942 rows=67044 loops=1303)
                           ->  Nested Loop  (cost=3.27..230512.08 rows=93 width=129) (actual time=0.065..3355.484 rows=67044 loops=1)
                                 ->  Nested Loop  (cost=0.00..229803.99 rows=93 width=137) (actual time=0.048..2335.540 rows=67044 loops=1)
                                       ->  Nested Loop  (cost=0.00..2.33 rows=1 width=74) (actual time=0.013..0.023 rows=1 loops=1)
                                             ->  Seq Scan on containers "c69d129_particle_size_result_fields_5$run$folder$"  (cost=0.00..1.16 rows=1 width=37) (actual time=0.006..0.009 rows=1 loops=1)
                                                   Filter: ((entityid)::text = 'd938da12-1b43-102d-a8a2-78911b79dd1c'::text)
                                             ->  Seq Scan on containers "formulations_6$folder$"  (cost=0.00..1.16 rows=1 width=37) (actual time=0.002..0.006 rows=1 loops=1)
                                                   Filter: (("formulations_6$folder$".entityid)::text = 'd938da12-1b43-102d-a8a2-78911b79dd1c'::text)
                                       ->  Nested Loop  (cost=0.00..229800.73 rows=93 width=63) (actual time=0.032..2169.448 rows=67044 loops=1)
                                             Join Filter: ((c69d129_particle_size_result_fields.timelabel)::text = (indexvarchar.key)::text)
                                             ->  Seq Scan on c69d129_particle_size_result_fields  (cost=0.00..229742.02 rows=348 width=59) (actual time=0.016..527.225 rows=69654 loops=1)
                                                   Filter: (((SubPlan 3))::text = 'd938da12-1b43-102d-a8a2-78911b79dd1c'::text)
                                                   SubPlan 3
                                                     ->  Index Scan using pk_data on data  (cost=0.00..3.27 rows=1 width=37) (actual time=0.003..0.004 rows=1 loops=69654)
                                                           Index Cond: (rowid = $2)
                                             ->  Materialize  (cost=0.00..1.32 rows=11 width=10) (actual time=0.001..0.010 rows=11 loops=69654)
                                                   ->  Seq Scan on indexvarchar  (cost=0.00..1.26 rows=11 width=10) (actual time=0.003..0.013 rows=11 loops=1)
                                                         Filter: (listid = 17)
                                 ->  Index Scan using pk_experimentrun on experimentrun  (cost=3.27..4.33 rows=1 width=74) (actual time=0.004..0.005 rows=1 loops=67044)
                                       Index Cond: (experimentrun.rowid = (SubPlan 4))
                                       Filter: (((experimentrun.protocollsid)::text ~~ 'urn:lsid:labkey.com:Particle+SizeProtocol.Folder-69:Particle+Size'::text) AND ((experimentrun.container)::text = 'd938da12-1b43-102d-a8a
2-78911b79dd1c'::text))
                                       SubPlan 4
                                         ->  Index Scan using pk_data on data  (cost=0.00..3.27 rows=1 width=4) (actual time=0.003..0.004 rows=1 loops=67044)
                                               Index Cond: (rowid = $2)
                                       SubPlan 4
                                         ->  Index Scan using pk_data on data  (cost=0.00..3.27 rows=1 width=4) (actual time=0.003..0.004 rows=1 loops=67044)
                                               Index Cond: (rowid = $2)
               ->  Index Scan using uq_object on object "formulations_6$lsid$_c"  (cost=0.00..2.69 rows=1 width=64) (actual time=0.033..0.034 rows=1 loops=67044)
                     Index Cond: ((material.lsid)::text = ("formulations_6$lsid$_c".objecturi)::text)
               SubPlan 1
                 ->  Index Scan using pk_objectproperty on objectproperty  (cost=0.00..3.31 rows=1 width=13) (actual time=0.005..0.007 rows=1 loops=67044)
                       Index Cond: ((objectid = $0) AND (propertyid = 560))
               SubPlan 2
                 ->  Index Scan using pk_objectproperty on objectproperty  (cost=0.00..3.31 rows=1 width=8) (actual time=0.004..0.005 rows=1 loops=67044)
                       Index Cond: ((objectid = $1) AND (propertyid = 1288))
 Total runtime: 194080.893 ms


Thank you, 

Brian Connolly 

pgsql-performance by date:

Previous
From: "Kevin Grittner"
Date:
Subject: Re: Explicit joins
Next
From: Robert Haas
Date:
Subject: Re: VX_CONCURRENT flag on vxfs( 5.1 or later) for performance for postgresql?