Thread: Improve performance of query
I'm trying to port our application from MS-SQL to Postgres. We have
implemented all of our rather complicated application security in the
database. The query that follows takes half a second or less on MS-SQL
Server and around 5 seconds on Postgres. My concern is that this data set
is rather "small" by our application's standards: it is not unusual for
the da_answer table to have 2-4 million records. I'm worried that if this
very small data set takes 5 seconds, then a "regular sized" data set will
take far too long.

I originally thought the NOT EXISTS on the "da_data_restrict_except_open"
table was killing performance, but the query took exactly the same amount
of time after I deleted all rows from that table. Note that the hard-coded
9999999999.0 and 4000 parameters, as well as the parameter to
svp_getparentproviderids, are the three variables that change from one run
of this query to the next.

I'm using Postgres 7.4.5 as packaged in Debian. shared_buffers is set to
57344 and sort_mem=4096. The machine has an AMD 1.8+ and 1 GB of RAM.
Here are some relevant performance statistics:

richard:/usr/share/cups/model# cat /proc/sys/kernel/shmmax
536870912
richard:/usr/share/cups/model# cat /proc/sys/kernel/shmall
536870912
richard:/home/richard# hdparm -tT /dev/hda
 Timing cached reads:        1112 MB in 2.00 seconds = 556.00 MB/sec
 Timing buffered disk reads:  176 MB in 3.02 seconds =  58.28 MB/sec

I have included an EXPLAIN ANALYZE, relevant table counts, and relevant
indexing information. If anyone has any suggestions on how to improve
performance.... TIA!

SELECT tab.answer_id, client_id, question_id, recordset_id,
       date_effective, virt_field_name
FROM
(
    SELECT a.uid AS answer_id, a.client_id, a.question_id, recordset_id,
           date_effective
    FROM da_answer a
    WHERE a.date_effective <= 9999999999.0
      AND a.inactive != 1
      AND
      (
        5000 = 4000
        OR
        (EXISTS (SELECT * FROM svp_getparentproviderids(1)
                 WHERE svp_getparentproviderids = a.provider_id))
      )
    UNION
    SELECT a.uid AS answer_id, a.client_id, a.question_id, recordset_id,
           date_effective
    FROM da_answer a,
    (
        SELECT main_id
        FROM da_data_restrict
        WHERE type_id = 2
          AND (provider_id IN (SELECT * FROM svp_getparentproviderids(1)))
        UNION
        SELECT sa.uid AS main_id
        FROM da_answer sa
        JOIN da_data_restrict_except_closed dr
          ON dr.main_id = sa.uid
         AND dr.type_id = 2
         AND dr.except_provider_id = 1
        WHERE (restricted = 1)
          AND (restricted_closed_except = 1)
          AND sa.covered_by_roi = 1
        UNION
        SELECT sa.uid AS main_id
        FROM da_answer sa
        WHERE (restricted = 0)
          AND (restricted_open_except = 1)
          AND (NOT EXISTS (SELECT dr.main_id
                           FROM da_data_restrict_except_open dr
                           WHERE (dr.main_id = sa.uid)
                             AND (dr.type_id = 2)
                             AND (dr.except_provider_id IN
                                  (SELECT * FROM svp_getparentproviderids(1)))))
          AND sa.covered_by_roi = 1
        UNION
        SELECT sa.uid AS main_id
        FROM da_answer sa
        WHERE (restricted = 0)
          AND (restricted_open_except = 0)
          AND sa.covered_by_roi = 1
    ) sec
    WHERE a.covered_by_roi = 1
      AND a.date_effective <= 9999999999.0
      AND a.inactive != 1
      AND a.uid = sec.main_id
      AND 5000 > 4000
) tab, da_question q
WHERE tab.question_id = q.uid
  AND (min_access_level <= 4000 OR min_access_level IS NULL)

Table counts from relevant tables:

da_question                        1095
da_answer                         21117
da_data_restrict_except_closed     3087
da_data_restrict_except_open      13391
svp_getparentproviderids(1)           1

Relevant indexes:

create index in_da_data_restrict_provider_id on da_data_restrict(provider_id);
create index in_da_data_restrict_main_id on da_data_restrict(main_id);
create index in_da_data_restrict_type_id on da_data_restrict(type_id);
create index in_da_data_restrict_client_id on da_data_restrict(client_id);
create index in_da_dr_type_provider on da_data_restrict(type_id,provider_id);
create index in_da_data_rec_provider_id ON da_data_restrict_except_closed(provider_id);
create index in_da_data_rec_type_id ON da_data_restrict_except_closed(type_id);
create index in_da_data_rec_main_id ON da_data_restrict_except_closed(main_id);
create index in_da_data_rec_except_provider_id ON da_data_restrict_except_closed(except_provider_id);
create index in_da_data_reo_provider_id ON da_data_restrict_except_open(provider_id);
create index in_da_data_reo_type_id ON da_data_restrict_except_open(type_id);
create index in_da_data_reo_main_id ON da_data_restrict_except_open(main_id);
create index in_da_data_reo_except_provider_id ON da_data_restrict_except_open(except_provider_id);
create index in_da_answer_client_id ON da_answer(client_id);
create index in_da_answer_provider_id ON da_answer(provider_id);
create index in_da_answer_question_id ON da_answer(question_id);
create index in_da_answer_recordset_id ON da_answer(recordset_id);
create index in_da_answer_restricted ON da_answer(restricted);
create index in_da_answer_restricted_open_except ON da_answer(restricted_open_except);
create index in_da_answer_restricted_closed_except ON da_answer(restricted_closed_except);
create index in_da_answer_date_effective ON da_answer(date_effective);
create index in_da_answer_inactive ON da_answer(inactive);
create index in_da_answer_covered_by_roi ON da_answer(covered_by_roi);
create index in_da_ed_inactive_roi ON da_answer(date_effective,inactive,covered_by_roi);
create index in_da_question_mal ON da_question(min_access_level);
* Richard Rowell (richard@bowmansystems.com) wrote:
> I have included an EXPLAIN ANALYZE, relevant table counts, and relevant
> indexing information. If anyone has any suggestions on how to improve
> performance.... TIA!

Just a thought: do the UNIONs actually have to be UNIONs, or would
UNION ALL work? A sketch of what I mean is below.
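UNION has to sort and de-duplicate its inputs, while UNION ALL simply
concatenates them. Assuming the four arms of your inner "sec" subquery
can never return the same main_id twice (something only you can verify
from the restricted/restricted_*_except semantics), the subquery could
be written as:

    SELECT main_id
    FROM da_data_restrict
    WHERE type_id = 2
      AND provider_id IN (SELECT * FROM svp_getparentproviderids(1))
    UNION ALL
    SELECT sa.uid AS main_id
    FROM da_answer sa
    JOIN da_data_restrict_except_closed dr
      ON dr.main_id = sa.uid
     AND dr.type_id = 2
     AND dr.except_provider_id = 1
    WHERE restricted = 1
      AND restricted_closed_except = 1
      AND sa.covered_by_roi = 1
    UNION ALL
    SELECT sa.uid AS main_id
    FROM da_answer sa
    WHERE restricted = 0
      AND restricted_open_except = 1
      AND NOT EXISTS (SELECT dr.main_id
                      FROM da_data_restrict_except_open dr
                      WHERE dr.main_id = sa.uid
                        AND dr.type_id = 2
                        AND dr.except_provider_id IN
                            (SELECT * FROM svp_getparentproviderids(1)))
      AND sa.covered_by_roi = 1
    UNION ALL
    SELECT sa.uid AS main_id
    FROM da_answer sa
    WHERE restricted = 0
      AND restricted_open_except = 0
      AND sa.covered_by_roi = 1

If the arms can overlap, leaving the outermost UNION as a plain UNION
would still de-duplicate the final result after the join.

Stephen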
Richard Rowell wrote:
> I'm trying to port our application from MS-SQL to Postgres. We have
> implemented all of our rather complicated application security in the
> database. The query that follows takes half a second or less on
> MS-SQL Server and around 5 seconds on Postgres. My concern is that this
> data set is rather "small" by our application's standards. It is not
> unusual for the da_answer table to have 2-4 million records. I'm
> worried that if this very small data set is taking 5 seconds, then a
> "regular sized" data set will take far too long.
>
> I originally thought the NOT EXISTS on the
> "da_data_restrict_except_open" table was killing performance, but the
> query took exactly the same amount of time after I deleted all rows
> from this table. Note that the hard-coded 9999999999.0 and 4000
> parameters, as well as the parameter to svp_getparentproviderids, are
> the three variables that change from one run of this query to the next.
>
> I'm using Postgres 7.4.5 as packaged in Debian. shared_buffers is set
> to 57344 and sort_mem=4096.

That shared_buffers value sounds too large for 1 GB of RAM - wind it
back to, say, 10000. Also make sure you've read the "performance
tuning" article at:

http://www.varlena.com/varlena/GeneralBits/Tidbits/index.php

> I have included an EXPLAIN ANALYZE, relevant table counts, and relevant
> indexing information. If anyone has any suggestions on how to improve
> performance.... TIA!

I think it's the function call(s).

> SELECT tab.answer_id, client_id, question_id, recordset_id,
>        date_effective, virt_field_name
> FROM
> (
>     SELECT a.uid AS answer_id, a.client_id, a.question_id,
>            recordset_id, date_effective
>     FROM da_answer a
>     WHERE a.date_effective <= 9999999999.0
>       AND a.inactive != 1
>       AND
>       (
>         5000 = 4000
>         OR
>         (EXISTS (SELECT * FROM svp_getparentproviderids(1)
>                  WHERE svp_getparentproviderids = a.provider_id))
>       )
...
> SubPlan
>   -> Function Scan on svp_getparentproviderids (cost=0.00..15.00
>      rows=5 width=4) (actual time=0.203..0.203 rows=0 loops=21089)
>        Filter: (svp_getparentproviderids = $1)

Here it's running 21,089 loops around your function. Each one isn't
costing much, but it's the total that's killing you, I think. It might
be possible to mark the function STABLE or similar, depending on what
it does - see

http://www.postgresql.org/docs/7.4/static/sql-createfunction.html
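If the function does nothing but read from the database (no inserts,
updates, or other side effects), the declaration might look something
like the following. This is only a sketch: I don't know your actual
return type or function body, so the body here is a stand-in that
assumes a hypothetical "provider" table - the point is the STABLE
keyword at the end.

    CREATE OR REPLACE FUNCTION svp_getparentproviderids(integer)
    RETURNS SETOF integer AS '
        -- stand-in body: presumably the real function walks a
        -- provider hierarchy upward from the given provider id
        SELECT parent_provider_id FROM provider WHERE uid = $1;
    ' LANGUAGE sql STABLE;

(On 7.4 the body has to be a quoted string; dollar-quoting arrived in
a later release.)

-- 
  Richard Huxton
  Archonet Ltd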
Richard Rowell <richard@bowmansystems.com> writes:
> I'm trying to port our application from MS-SQL to Postgres. We have
> implemented all of our rather complicated application security in the
> database. The query that follows takes half a second or less on
> MS-SQL server and around 5 seconds on Postgres.

The EXPLAIN shows that most of the time is going into repeated
executions of svp_getparentproviderids() in the first UNION arm:

> -> Seq Scan on da_answer a (cost=0.00..63928.75 rows=10540 width=24)
>    (actual time=279.080..4418.808 rows=161 loops=1)
>      Filter: ((date_effective <= 9999999999::double precision) AND
>        (inactive <> 1) AND (subplan))
>      SubPlan
>        -> Function Scan on svp_getparentproviderids (cost=0.00..15.00
>           rows=5 width=4) (actual time=0.203..0.203 rows=0 loops=21089)
>             Filter: (svp_getparentproviderids = $1)

I'd suggest replacing the EXISTS coding with IN: change

    (EXISTS (SELECT * FROM svp_getparentproviderids(1)
             WHERE svp_getparentproviderids = a.provider_id))

to

    (a.provider_id IN (SELECT * FROM svp_getparentproviderids(1)))

The latter form is likely to be significantly faster in PG 7.4.

It's also possible that the speed loss compared to MSSQL is really
inside the svp_getparentproviderids function; you should look into that
rather than assuming this query per se is at fault.

Also, do you actually need UNION as opposed to UNION ALL? The
duplicate-elimination behavior of UNION is a bit expensive if not
needed, and it looks from the EXPLAIN output that some of the unions
aren't actually eliminating any rows.
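Applied to the first arm of the query, the IN form would read like
this (a sketch - the 9999999999.0, 4000/5000, and
svp_getparentproviderids(1) literals are whatever the application
substitutes on a given run):

    SELECT a.uid AS answer_id, a.client_id, a.question_id, recordset_id,
           date_effective
    FROM da_answer a
    WHERE a.date_effective <= 9999999999.0
      AND a.inactive != 1
      AND (5000 = 4000
           OR a.provider_id IN (SELECT * FROM svp_getparentproviderids(1)))

Written that way, 7.4 can evaluate the function once, hash its result
set, and probe the hash for each da_answer row, instead of rescanning
the function output once per candidate row (the loops=21089 above).

regards, tom lane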
The first thing to check: did you do a recent VACUUM ANALYZE? This
updates all the statistics. There are a number of places where the plan
says "rows=1000", which is usually the planner's "I have no idea, let
me guess 1000".

Also, there are a number of places where the estimates are pretty far
off. For instance:

Richard Rowell wrote:
> -> Subquery Scan "*SELECT* 1" (cost=0.00..64034.15 rows=10540
>    width=24) (actual time=279.089..4419.371 rows=161 loops=1)

Estimating 10,540 rows when only 161 come back is a little bit
different.

> -> Seq Scan on da_answer a (cost=0.00..63928.75 rows=10540 width=24)
>    (actual time=279.080..4418.808 rows=161 loops=1)
>      Filter: ((date_effective <= 9999999999::double precision) AND
>        (inactive <> 1) AND (subplan))

Though this could be a lack of cross-column statistics: if two columns
are correlated, the planner isn't as accurate as it could be. Also,
date_effective <= 9999999999 doesn't seem very restrictive; could you
use a BETWEEN instead (date_effective BETWEEN 0 AND 9999999999.0)? I
know that for timestamps, giving a BETWEEN is usually better than a
single-sided comparison.

This one was underestimated:

> -> Subquery Scan "*SELECT* 2" (cost=988627.58..989175.52 rows=2799
>    width=24) (actual time=290.730..417.720 rows=7556 loops=1)
>      -> Hash Join (cost=988627.58..989147.53 rows=2799 width=24)
>         (actual time=290.722..395.739 rows=7556 loops=1)
>           Hash Cond: ("outer".main_id = "inner".uid)

This is one of the ones where the planner looks like it had no idea. It
could be because of the function. You might consider adding a
functional index, though I think there are some caveats there.

> -> Function Scan on svp_getparentproviderids (cost=0.00..12.50
>    rows=1000 width=4) (actual time=0.473..0.474 rows=1 loops=1)

Another very poor estimate. You might need to increase the statistics
for the relevant columns (ALTER TABLE ... ALTER COLUMN ... SET
STATISTICS). IIRC, compared with other databases Postgres defaults to a
much lower statistics target; try changing it from 10 (the 7.4 default)
to 100 or so. There was a discussion here that every column with an
index should use higher statistics.

> -> Index Scan using in_da_dr_type_provider on da_data_restrict
>    (cost=0.00..145.50 rows=46 width=8) (actual time=0.041..26.627
>    rows=7280 loops=1)

I'm not a great optimizer; these are just some first things to look at.
Your sort_mem seems pretty low to me considering you have 1 GB of RAM -
perhaps you could bump that up to 40MB instead of 4MB. Also, if you run
this query twice in a row, is it still slow? (Sometimes it takes a bit
of work to get the right indexes loaded into RAM, but then it is
faster.) Concretely, the sort of commands I mean are sketched below.
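The column names here are guesses - apply the statistics change to
whichever indexed or correlated columns actually matter in your schema:

    -- refresh the planner's statistics on the tables involved
    VACUUM ANALYZE da_answer;
    VACUUM ANALYZE da_data_restrict;

    -- raise the per-column statistics target from the 7.4 default of 10,
    -- then re-analyze so the new target takes effect
    ALTER TABLE da_answer ALTER COLUMN provider_id SET STATISTICS 100;
    ANALYZE da_answer;

    -- per-session bump of sort memory to 40 MB (the value is in KB);
    -- set it in postgresql.conf to make it permanent
    SET sort_mem = 40960;

Just some guesses,
John
=:->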