Thread: Pointers needed on optimizing slow SQL statements
I've been Googling for SQL tuning help for Postgres but the pickings have been rather slim. Maybe I'm using the wrong search terms. I'm trying to improve the performance of the following query and would be grateful for any hints, either directly on the problem at hand, or to resources I can read to find out more about how to do this. In the past I have fixed most problems by adding indexes to get rid of sequential scans, but in this case it appears to be the hash join and the nested loops that are taking up all the time and I don't really know what to do about that. In Google I found mostly references from people wanting to use a hash join to *fix* a performance problem, not deal with it creating one...

My Postgres version is 8.3.3, on Linux.

Thanks in advance,

janine

iso=# explain analyze select a.item_id,
iso-#
iso-#    content_item__get_best_revision(a.item_id) as revision_id,
iso-#    content_item__get_latest_revision(a.item_id) as last_revision_id,
iso-#    content_revision__get_number(a.article_id) as revision_no,
iso-#    (select count(*) from cr_revisions where item_id=a.item_id) as revision_count,
iso-#
iso-#    -- Language support
iso-#    b.lang_id,
iso-#    b.lang_key,
iso-#    (case when b.lang_key = 'big5' then '#D7D7D7' else '#ffffff' end) as tr_bgcolor,
iso-#    coalesce(dg21_item_langs__rel_lang (b.lang_id,'gb2312'),'0') as gb_item_id,
iso-#    coalesce(dg21_item_langs__rel_lang (b.lang_id,'iso-8859-1'),'0') as eng_item_id,
iso-#
iso-#    -- user defined data
iso-#    a.article_id,
iso-#    a.region_id,
iso-#    a.author,
iso-#    a.archive_status,
iso-#    a.article_status,
iso-#    case when a.archive_status='t'
iso-#         then '<font color=#808080>never expire</font>'
iso-#         else to_char(a.archive_date, 'YYYY年MM月DD日')
iso-#    end as archive_date,
iso-#
iso-#    -- Standard data
iso-#    a.article_title,
iso-#    a.article_desc,
iso-#    a.creation_user,
iso-#    a.creation_ip,
iso-#    a.modifying_user,
iso-#
iso-#    -- Pretty format data
iso-#    a.item_creator,
iso-#
iso-#    -- Other data
iso-#    a.live_revision,
iso-#    to_char(a.publish_date, 'YYYY年MM月 DD日') as publish_date,
iso-#    to_char(a.creation_date, 'DD/MM/YYYY HH:MI AM') as creation_date,
iso-#
iso-#    case when article_status='approved'
iso-#         then 'admin content, auto approved'
iso-#         when article_status='unapproved'
iso-#         then (select approval_text
iso(#                 from dg21_approval
iso(#                where revision_id=a.article_id
iso(#                  and approval_status='f' order by approval_date desc limit 1)
iso-#         else ''
iso-#    end as approval_text
iso-#
iso-# from dg21_article_items a, dg21_item_langs b
iso-# where a.item_id = b.item_id
iso-#
iso-# order by b.lang_id desc, a.item_id
iso-# limit 21 offset 0;

                                                                  QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=3516.97..3516.98 rows=1 width=1245) (actual time=195948.132..195948.250 rows=21 loops=1)
   ->  Sort  (cost=3516.97..3516.98 rows=1 width=1245) (actual time=195948.122..195948.165 rows=21 loops=1)
         Sort Key: b.lang_id, ci.item_id
         Sort Method:  top-N heapsort  Memory: 24kB
         ->  Nested Loop  (cost=719.67..3516.96 rows=1 width=1245) (actual time=346.687..195852.741 rows=4159 loops=1)
               ->  Nested Loop  (cost=719.67..3199.40 rows=1 width=413) (actual time=311.422..119467.334 rows=4159 loops=1)
                     ->  Nested Loop  (cost=719.67..3198.86 rows=1 width=400) (actual time=292.951..1811.051 rows=4159 loops=1)
                           ->  Hash Join  (cost=719.67..3197.98 rows=1 width=352) (actual time=292.832..777.290 rows=4159 loops=1)
                                 Hash Cond: (cr.item_id = ci.item_id)
                                 Join Filter: ((ci.live_revision = cr.revision_id) OR ((ci.live_revision IS NULL) AND (cr.revision_id = content_item__get_latest_revision(ci.item_id))))
                                 ->  Hash Join  (cost=154.38..1265.24 rows=4950 width=348) (actual time=74.789..375.580 rows=4950 loops=1)
                                       Hash Cond: (cr.revision_id = ox.article_id)
                                       ->  Seq Scan on cr_revisions cr  (cost=0.00..913.73 rows=16873 width=321) (actual time=0.058..71.539 rows=16873 loops=1)
                                       ->  Hash  (cost=92.50..92.50 rows=4950 width=27) (actual time=74.607..74.607 rows=4950 loops=1)
                                             ->  Seq Scan on dg21_articles ox  (cost=0.00..92.50 rows=4950 width=27) (actual time=0.071..18.604 rows=4950 loops=1)
                                 ->  Hash  (cost=384.02..384.02 rows=14502 width=8) (actual time=217.789..217.789 rows=14502 loops=1)
                                       ->  Seq Scan on cr_items ci  (cost=0.00..384.02 rows=14502 width=8) (actual time=0.051..137.988 rows=14502 loops=1)
                           ->  Index Scan using acs_objects_pk on acs_objects ao  (cost=0.00..0.88 rows=1 width=56) (actual time=0.223..0.229 rows=1 loops=4159)
                                 Index Cond: (ao.object_id = cr.revision_id)
                     ->  Index Scan using persons_pk on persons ps  (cost=0.00..0.27 rows=1 width=17) (actual time=0.017..0.023 rows=1 loops=4159)
                           Index Cond: (ps.person_id = ao.creation_user)
               ->  Index Scan using dg21_item_langs_id_key on dg21_item_langs b  (cost=0.00..8.27 rows=1 width=15) (actual time=0.526..0.537 rows=1 loops=4159)
                     Index Cond: (b.item_id = ci.item_id)
               SubPlan
                 ->  Limit  (cost=297.21..297.22 rows=1 width=29) (never executed)
                       ->  Sort  (cost=297.21..297.22 rows=1 width=29) (never executed)
                             Sort Key: dg21_approval.approval_date
                             ->  Seq Scan on dg21_approval  (cost=0.00..297.20 rows=1 width=29) (never executed)
                                   Filter: ((revision_id = $2) AND ((approval_status)::text = 'f'::text))
                 ->  Aggregate  (cost=10.77..10.78 rows=1 width=0) (actual time=0.051..0.053 rows=1 loops=4159)
                       ->  Index Scan using cr_revisions_item_id_idx on cr_revisions  (cost=0.00..10.77 rows=2 width=0) (actual time=0.019..0.024 rows=1 loops=4159)
                             Index Cond: (item_id = $0)
 Total runtime: 195949.928 ms
(33 rows)

---
Janine Sisk
President/CEO of furfly, LLC
503-693-6407
Janine Sisk <janine@furfly.net> writes:
> I've been Googling for SQL tuning help for Postgres but the pickings
> have been rather slim. Maybe I'm using the wrong search terms. I'm
> trying to improve the performance of the following query and would be
> grateful for any hints, either directly on the problem at hand, or to
> resources I can read to find out more about how to do this. In the
> past I have fixed most problems by adding indexes to get rid of
> sequential scans, but in this case it appears to be the hash join and
> the nested loops that are taking up all the time and I don't really
> know what to do about that. In Google I found mostly references from
> people wanting to use a hash join to *fix* a performance problem, not
> deal with it creating one...

The hashjoin isn't creating any problem that I can see. What's hurting you is the nestloops above it, which need to be replaced with some other join technique. The planner is going for a nestloop because it expects only one row out of the hashjoin, which is off by more than three orders of magnitude :-(. So in short, your problem is poor estimation of the selectivity of this condition:

> Join Filter: ((ci.live_revision =
> cr.revision_id) OR ((ci.live_revision IS NULL) AND (cr.revision_id =
> content_item__get_latest_revision(ci.item_id))))

It's hard to tell why the estimate is so bad, though, since you didn't provide any additional information. Perhaps increasing the statistics target for these columns (or the whole database) would help.

			regards, tom lane
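For anyone following along, the per-column form of what Tom suggests is the standard ALTER TABLE ... SET STATISTICS knob. The sketch below guesses at the relevant columns from the Join Filter in the plan (cr_items.live_revision and cr_revisions.revision_id); it only takes effect after a fresh ANALYZE:

    -- raise the statistics target only for the columns behind the bad estimate
    -- (column choice is a guess based on the Join Filter shown in the plan)
    ALTER TABLE cr_items     ALTER COLUMN live_revision SET STATISTICS 1000;
    ALTER TABLE cr_revisions ALTER COLUMN revision_id   SET STATISTICS 1000;
    ANALYZE cr_items;
    ANALYZE cr_revisions;

The other route Tom mentions, raising it for the whole database, is the default_statistics_target setting discussed further down the thread.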
Ok, I will look into gathering better statistics. This is the first time I've had a significant problem with a PG database, so this is uncharted territory for me.

If there is more info I could give that would help, please be more specific about what you need and I will attempt to do so.

Thanks!

janine

On Jun 3, 2009, at 2:42 PM, Tom Lane wrote:

> Janine Sisk <janine@furfly.net> writes:
>> I've been Googling for SQL tuning help for Postgres but the pickings
>> have been rather slim. Maybe I'm using the wrong search terms. I'm
>> trying to improve the performance of the following query and would be
>> grateful for any hints, either directly on the problem at hand, or to
>> resources I can read to find out more about how to do this. In the
>> past I have fixed most problems by adding indexes to get rid of
>> sequential scans, but in this case it appears to be the hash join and
>> the nested loops that are taking up all the time and I don't really
>> know what to do about that. In Google I found mostly references from
>> people wanting to use a hash join to *fix* a performance problem, not
>> deal with it creating one...
>
> The hashjoin isn't creating any problem that I can see. What's
> hurting you is the nestloops above it, which need to be replaced with
> some other join technique. The planner is going for a nestloop because
> it expects only one row out of the hashjoin, which is off by more than
> three orders of magnitude :-(. So in short, your problem is poor
> estimation of the selectivity of this condition:
>
>> Join Filter: ((ci.live_revision =
>> cr.revision_id) OR ((ci.live_revision IS NULL) AND (cr.revision_id =
>> content_item__get_latest_revision(ci.item_id))))
>
> It's hard to tell why the estimate is so bad, though, since you didn't
> provide any additional information. Perhaps increasing the statistics
> target for these columns (or the whole database) would help.
>
> 			regards, tom lane

---
Janine Sisk
President/CEO of furfly, LLC
503-693-6407
On Wed, Jun 3, 2009 at 6:04 PM, Janine Sisk <janine@furfly.net> wrote:
> Ok, I will look into gathering better statistics. This is the first time
> I've had a significant problem with a PG database, so this is uncharted
> territory for me.
>
> If there is more info I could give that would help, please be more specific
> about what you need and I will attempt to do so.
>
> Thanks!
>
> janine

You might find it helpful to try to inline the content_item__get_latest_revision function call. I'm not sure whether that's a SQL function or what, but the planner isn't always real clever about things like that. If you can redesign things so that all the logic is in the actual query, you may get better results.

But, we're not always real clever about selectivity. Sometimes you have to fake the planner out, as discussed here.

http://archives.postgresql.org/pgsql-performance/2009-06/msg00023.php

Actually, I had to do this today on a production application. In my case, the planner thought that a big OR clause was not very selective, so it figured it wouldn't have to scan very far through the outer side before it found enough rows to satisfy the LIMIT clause. Therefore it materialized the inner side instead of hashing it, and when the selectivity estimate turned out to be wrong, it took 220 seconds to execute. I added a fake join condition of the form a || b = a || b, where a and b were on different sides of the join, and now it hashes the inner side and takes < 100 ms.

Fortunately, these kinds of problems are fairly rare, but they can be extremely frustrating to debug. With any kind of query debugging, the first question to ask yourself is "Are any of my selectivity estimates way off?". If the answer to that question is no, you should then ask "Where is all the time going in this plan?". If the answer to the first question is yes, though, your time is usually better spent fixing that problem, because once you do, the plan will most likely change to something a lot better.

...Robert
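A minimal sketch of the "fake join condition" trick Robert describes, with entirely made-up table and column names (nothing below comes from Janine's schema): the extra predicate is always true, but the planner cannot prove that, so it applies a default selectivity that pulls the join's row estimate down and steers it away from the "LIMIT will be satisfied early" plan.

    -- big_outer / big_inner are placeholders for the two sides of the join
    SELECT o.id, i.payload
      FROM big_outer o
      JOIN big_inner i
        ON o.id = i.outer_id
           -- always true, but opaque to the planner, so it gets a default
           -- (low) selectivity estimate and changes the chosen plan
       AND (o.code || i.code) = (o.code || i.code)
     ORDER BY o.id
     LIMIT 20;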
I'm sorry if this is a stupid question, but... I changed default_statistics_target from the default of 10 to 100, restarted PG, and then ran "vacuumdb -z" on the database. The plan is exactly the same as before. Was I supposed to do something else? Do I need to increase it even further? This is an overloaded system to start with, so I'm being fairly conservative with what I change.

thanks,

janine

On Jun 3, 2009, at 2:42 PM, Tom Lane wrote:

> Janine Sisk <janine@furfly.net> writes:
>> I've been Googling for SQL tuning help for Postgres but the pickings
>> have been rather slim. Maybe I'm using the wrong search terms. I'm
>> trying to improve the performance of the following query and would be
>> grateful for any hints, either directly on the problem at hand, or to
>> resources I can read to find out more about how to do this. In the
>> past I have fixed most problems by adding indexes to get rid of
>> sequential scans, but in this case it appears to be the hash join and
>> the nested loops that are taking up all the time and I don't really
>> know what to do about that. In Google I found mostly references from
>> people wanting to use a hash join to *fix* a performance problem, not
>> deal with it creating one...
>
> The hashjoin isn't creating any problem that I can see. What's
> hurting you is the nestloops above it, which need to be replaced with
> some other join technique. The planner is going for a nestloop because
> it expects only one row out of the hashjoin, which is off by more than
> three orders of magnitude :-(. So in short, your problem is poor
> estimation of the selectivity of this condition:
>
>> Join Filter: ((ci.live_revision =
>> cr.revision_id) OR ((ci.live_revision IS NULL) AND (cr.revision_id =
>> content_item__get_latest_revision(ci.item_id))))
>
> It's hard to tell why the estimate is so bad, though, since you didn't
> provide any additional information. Perhaps increasing the statistics
> target for these columns (or the whole database) would help.
>
> 			regards, tom lane

---
Janine Sisk
President/CEO of furfly, LLC
503-693-6407
On Wed, Jun 3, 2009 at 8:32 PM, Janine Sisk <janine@furfly.net> wrote:
> I'm sorry if this is a stupid question, but... I changed
> default_statistics_target from the default of 10 to 100, restarted PG, and
> then ran "vacuumdb -z" on the database. The plan is exactly the same as
> before. Was I supposed to do something else? Do I need to increase it even
> further? This is an overloaded system to start with, so I'm being fairly
> conservative with what I change.

No need to restart pg, just analyze is good enough (vacuumdb -z will do). After that, compare your explain analyze output and see if the estimates are any better. If they're better but not good enough, try increasing stats target to something like 500 or 1000 (max is 1000) and reanalyze and see if that helps. If not, post the new explain analyze and we'll take another whack at it.
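One quick way to confirm the re-ANALYZE actually changed anything, before re-running the big query, is to look at pg_stats for the suspect columns; if most_common_vals stays empty or tiny, the higher target has not taken effect. Table and column names here are guesses based on the plan posted earlier:

    SELECT tablename, attname, n_distinct, most_common_vals
      FROM pg_stats
     WHERE tablename IN ('cr_items', 'cr_revisions')
       AND attname  IN ('live_revision', 'revision_id', 'item_id');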
On 6/3/09 7:32 PM, Janine Sisk wrote:
> I'm sorry if this is a stupid question, but... I changed
> default_statistics_target from the default of 10 to 100, restarted PG,
> and then ran "vacuumdb -z" on the database. The plan is exactly the same
> as before. Was I supposed to do something else? Do I need to increase it
> even further? This is an overloaded system to start with, so I'm being
> fairly conservative with what I change.

It's possible that it won't change the plan; 100 is often not enough to change the statistics. Try changing, in a superuser session, default_statistics_target to 400 and just ANALYZing the one table, and see if that changes the plan. If so, you'll want to increase the statistics setting on the filtered columns on that table.

--
Josh Berkus
PostgreSQL Experts Inc.
www.pgexperts.com
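Spelled out, Josh's experiment is just a session-local override plus a one-table ANALYZE; nothing is restarted and nothing in postgresql.conf changes until you decide to keep it. The table name below is a guess, pick whichever table carries the badly estimated columns:

    -- in a superuser session
    SET default_statistics_target = 400;
    ANALYZE cr_items;
    -- then re-run the EXPLAIN ANALYZE and compare estimated vs. actual row counts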
On Wed, 2009-06-03 at 21:21 -0400, Robert Haas wrote:

> But, we're not always real clever about selectivity. Sometimes you
> have to fake the planner out, as discussed here.
>
> http://archives.postgresql.org/pgsql-performance/2009-06/msg00023.php
>
> Actually, I had to do this today on a production application. In my
> case, the planner thought that a big OR clause was not very selective,
> so it figured it wouldn't have to scan very far through the outer side
> before it found enough rows to satisfy the LIMIT clause. Therefore it
> materialized the inner side instead of hashing it, and when the
> selectivity estimate turned out to be wrong, it took 220 seconds to
> execute. I added a fake join condition of the form a || b = a || b,
> where a and b were on different sides of the join, and now it hashes
> the inner side and takes < 100 ms.
>
> Fortunately, these kinds of problems are fairly rare, but they can be
> extremely frustrating to debug. With any kind of query debugging, the
> first question to ask yourself is "Are any of my selectivity estimates
> way off?". If the answer to that question is no, you should then ask
> "Where is all the time going in this plan?". If the answer to the
> first question is yes, though, your time is usually better spent
> fixing that problem, because once you do, the plan will most likely
> change to something a lot better.

The Function Index solution works, but it would be much better if we could get the planner to remember certain selectivities.

I'm thinking a command like

ANALYZE foo [WHERE .... ]

which would specifically analyze the selectivity of the given WHERE clause for use in queries.

--
Simon Riggs
www.2ndQuadrant.com
PostgreSQL Training, Services and Support
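For readers who have not seen what Simon calls the Function Index solution: ANALYZE also gathers statistics on the values of expression indexes, so indexing the troublesome expression hands the planner a distribution for it. A rough sketch in terms of this thread's tables, with no claim that this particular expression is the right one to index here:

    -- after ANALYZE, the planner has statistics for the indexed expression,
    -- which can sharpen estimates for predicates that use the same expression
    CREATE INDEX cr_items_live_is_null_idx ON cr_items ((live_revision IS NULL));
    ANALYZE cr_items;

The proposed ANALYZE foo [WHERE ...] syntax itself does not exist; it is the feature being discussed.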
Hi,

On 6 June 2009, at 10:50, Simon Riggs wrote:
> On Wed, 2009-06-03 at 21:21 -0400, Robert Haas wrote:
>> But, we're not always real clever about selectivity. Sometimes you
>> have to fake the planner out, as discussed here.
[...]
>
>> Fortunately, these kinds of problems are fairly rare, but they can be
>> extremely frustrating to debug. With any kind of query debugging, the
>> first question to ask yourself is "Are any of my selectivity estimates
>> way off?". If the answer to that question is no, you should then ask
>> "Where is all the time going in this plan?". If the answer to the
>> first question is yes, though, your time is usually better spent
>> fixing that problem, because once you do, the plan will most likely
>> change to something a lot better.
>
> The Function Index solution works, but it would be much better if we
> could get the planner to remember certain selectivities.
>
> I'm thinking a command like
>
> ANALYZE foo [WHERE .... ]
>
> which would specifically analyze the selectivity of the given WHERE
> clause for use in queries.

I don't know the stats subsystem well enough to judge by myself how good this idea is, but I have some remarks about it:
- it looks good :)
- where to store the clauses to analyze?
- do we want to tackle JOIN selectivity patterns too (more than one table)?

An extension to the ANALYZE foo WHERE ... idea would be to be able to analyze arbitrary SQL, which could lead to allowing maintained VIEW stats. Is this already done, and if not, is it feasible and a good idea?

This way one could define a view and have the system analyze the clauses and selectivity of its joins etc.; the hard part is then for the planner to be able to use those in user queries... mmm... maybe this isn't going to help much?

Regards,
--
dim
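To make the VIEW idea concrete, the kind of definition whose join selectivity one would want the system to learn, phrased against this thread's tables, might look like the sketch below. It is purely illustrative; no released PostgreSQL version gathers statistics from view definitions this way.

    CREATE VIEW live_article_revisions AS
    SELECT ci.item_id, cr.revision_id
      FROM cr_items ci
      JOIN cr_revisions cr
        ON ci.live_revision = cr.revision_id
        OR (ci.live_revision IS NULL
            AND cr.revision_id = content_item__get_latest_revision(ci.item_id));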
On Sat, Jun 6, 2009 at 4:50 AM, Simon Riggs<simon@2ndquadrant.com> wrote:
> The Function Index solution works, but it would be much better if we
> could get the planner to remember certain selectivities.

I agree.

> I'm thinking a command like
>
> ANALYZE foo [WHERE .... ]
>
> which would specifically analyze the selectivity of the given WHERE
> clause for use in queries.

I think that's probably not the best syntax, because we don't want to just do it once; we want to make it a persistent property of the table so that every future ANALYZE run picks it up. Maybe something like:

ALTER TABLE <table> ADD ANALYZE <name> (<clause>)
ALTER TABLE <table> DROP ANALYZE <name>

(I'm not in love with this so feel free to suggest improvements.)

One possible problem with this kind of thing is that it could be inconvenient if the number of clauses that you need to analyze is large. For example, suppose you have a table called "object" with a column called "type_id". It's not unlikely that the histograms and MCVs for many of the columns in that table will be totally different depending on the value of type_id. There might be enough different WHERE clauses that capturing their selectivity individually wouldn't be practical, or at least not convenient.

One possible alternative would be to change the meaning of the <clause>, so that instead of just asking the planner to gather selectivity on that one clause, it asks the planner to gather a whole separate set of statistics for the case where that clause holds. Then when we plan a query, we set the theorem-prover to work on the clauses (a la constraint exclusion) and see if any of them are implied by the query. If so, we can use that set of statistics in lieu of the global table statistics. There is the small matter of figuring out what to do if we added multiple clauses and more than one is provable, but <insert hand-waving here>.

It would also be good to do this automatically whenever a partial index is present.

...Robert
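Robert's closing point about partial indexes, illustrated with the hypothetical "object"/"type_id" example from his own message (the indexed column is invented too): the predicate of a partial index already names an interesting subset of rows, so it is a natural hook for a second set of statistics. Creating such an index today does not make ANALYZE keep per-subset histograms; that is the proposal.

    -- the WHERE clause marks the subset whose histograms and MCVs differ
    -- from the table as a whole; the idea is that ANALYZE could keep a
    -- separate set of statistics for rows matching this predicate
    CREATE INDEX object_type42_idx ON object (created_at) WHERE type_id = 42;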
I'd prefer ALTER VIEW <name> SET ANALYZE=true; or CREATE/DROP ANALYZE <SQL>;

Also it should be possible to change the statistics target for analyzed columns.

Such a statement would allow analyzing multi-table correlations. Note that for a view, the planner should be able to use the correlation information even for queries that do not use the view but may benefit from the information.
Віталій Тимчишин <tivv00@gmail.com> writes:
> I'd prefer ALTER VIEW <name> SET ANALYZE=true; or CREATE/DROP ANALYZE <SQL>;
> Also it should be possible to change the statistics target for analyzed
> columns.

Yeah, my idea was ALTER VIEW <name> ENABLE ANALYZE; but that's an easy point to settle if the idea proves helpful.

> Such a statement would allow analyzing multi-table correlations. Note
> that for a view, the planner should be able to use the correlation
> information even for queries that do not use the view but may benefit
> from the information.

That sounds like the hard part of it, but maybe our lovely geniuses will come back and tell us: "oh, you can do it this way, easy enough". :)

Regards,
--
dim