Thread: Redundant sub query triggers slow nested loop left join
In investigating a slow query, I distiled the code below from a larger query: SELECT * FROM /* SUBQUERY banners */ ( SELECT * FROM /* SUBQUERY banners_links */ ( SELECT * FROM banners_links WHERE merchant_id = 5631 ) as banners_links WHERE merchant_id = 5631 AND banners_links.status = 0 ) AS banners LEFT OUTER JOIN /* SUBQUERY types */ ( SELECT banner_types.id AS type_id, banner_types.type AS type, banners_banner_types.banner_id AS id FROM banner_types,banners_banner_types WHERE banners_banner_types.banner_id IN /* SUBQUERY */ ( SELECT id FROM banners_links WHERE merchant_id = 5631 ) AND banners_banner_types.type_id = banner_types.id ) AS types USING (id) Obviously, the subquery "banners_links" is redundant. The query however is a generated one, and this redundancy probably wasn't noted before. Logically you would say it shouldn't hurt, but in fact it does. The above query executes painfully slow. The left outer join is killing the performance, as witnessed by the plan: "Nested Loop Left Join (cost=964.12..1480.67 rows=1 width=714) (actual time=20.801..8233.410 rows=553 loops=1)" " Join Filter: (public.banners_links.id = banners_banner_types.banner_id)" " -> Bitmap Heap Scan on banners_links (cost=4.35..42.12 rows=1 width=671) (actual time=0.127..0.690 rows=359 loops=1)" " Recheck Cond: ((merchant_id = 5631) AND (merchant_id = 5631))" " Filter: ((status)::text = '0'::text)" " -> Bitmap Index Scan on banners_links_merchant_id_idx (cost=0.00..4.35 rows=10 width=0) (actual time=0.092..0.092 rows=424 loops=1)" " Index Cond: ((merchant_id = 5631) AND (merchant_id = 5631))" " -> Hash Join (cost=959.77..1432.13 rows=514 width=51) (actual time=0.896..22.611 rows=658 loops=359)" " Hash Cond: (banners_banner_types.type_id = banner_types.id)" " -> Hash IN Join (cost=957.32..1422.52 rows=540 width=16) (actual time=0.894..21.878 rows=658 loops=359)" " Hash Cond: (banners_banner_types.banner_id = public.banners_links.id)" " -> Seq Scan on banners_banner_types (cost=0.00..376.40 rows=22240 width=16) (actual time=0.003..10.149 rows=22240 loops=359)" " -> Hash (cost=952.02..952.02 rows=424 width=8) (actual time=0.779..0.779 rows=424 loops=1)" " -> Bitmap Heap Scan on banners_links (cost=11.54..952.02 rows=424 width=8) (actual time=0.108..0.513 rows=424 loops=1)" " Recheck Cond: (merchant_id = 5631)" " -> Bitmap Index Scan on banners_links_merchant_id_idx (cost=0.00..11.43 rows=424 width=0) (actual time=0.078..0.078 rows=424 loops=1)" " Index Cond: (merchant_id = 5631)" " -> Hash (cost=2.20..2.20 rows=20 width=43) (actual time=0.033..0.033 rows=20 loops=1)" " -> Seq Scan on banner_types (cost=0.00..2.20 rows=20 width=43) (actual time=0.004..0.017 rows=20 loops=1)" "Total runtime: 8233.710 ms" I noticed that the recheck condition looks a bit weird: Recheck Cond: ((merchant_id = 5631) AND (merchant_id = 5631)) You would think that PG (8.2.3) would be smart enough to optimize this away. Also the estimate of the nested loop left join and the actual results are way off. I tried increasing the statistics of both public.banners_links.id and banners_banner_types.banner_id (to the highest value 1000), analyzed, vacuum analyzed and did a vacuum full, but without any improvements. Anyway, when I remove the redundant sub query the code becomes: SELECT * FROM /* SUBQUERY banners */ ( SELECT * FROM banners_links WHERE merchant_id = 5631 AND banners_links.status = 0 ) AS banners LEFT OUTER JOIN /* SUBQUERY types */ ( SELECT banner_types.id AS type_id, banner_types.type AS type, banners_banner_types.banner_id AS id FROM banner_types,banners_banner_types WHERE banners_banner_types.banner_id IN /* SUBQUERY */ ( SELECT id FROM banners_links WHERE merchant_id = 5631 ) AND banners_banner_types.type_id = banner_types.id ) AS types USING (id) With this query, the execution time drops from 8 seconds to a mere 297 ms! The plan now looks as follows: "Hash Left Join (cost=1449.99..2392.68 rows=2 width=714) (actual time=24.257..25.292 rows=553 loops=1)" " Hash Cond: (public.banners_links.id = banners_banner_types.banner_id)" " -> Bitmap Heap Scan on banners_links (cost=11.43..954.03 rows=2 width=671) (actual time=0.122..0.563 rows=359 loops=1)" " Recheck Cond: (merchant_id = 5631)" " Filter: ((status)::text = '0'::text)" " -> Bitmap Index Scan on banners_links_merchant_id_idx (cost=0.00..11.43 rows=424 width=0) (actual time=0.086..0.086 rows=424 loops=1)" " Index Cond: (merchant_id = 5631)" " -> Hash (cost=1432.13..1432.13 rows=514 width=51) (actual time=24.128..24.128 rows=658 loops=1)" " -> Hash Join (cost=959.77..1432.13 rows=514 width=51) (actual time=1.714..23.606 rows=658 loops=1)" " Hash Cond: (banners_banner_types.type_id = banner_types.id)" " -> Hash IN Join (cost=957.32..1422.52 rows=540 width=16) (actual time=1.677..22.811 rows=658 loops=1)" " Hash Cond: (banners_banner_types.banner_id = public.banners_links.id)" " -> Seq Scan on banners_banner_types (cost=0.00..376.40 rows=22240 width=16) (actual time=0.005..10.306 rows=22240 loops=1)" " -> Hash (cost=952.02..952.02 rows=424 width=8) (actual time=0.772..0.772 rows=424 loops=1)" " -> Bitmap Heap Scan on banners_links (cost=11.54..952.02 rows=424 width=8) (actual time=0.105..0.510 rows=424 loops=1)" " Recheck Cond: (merchant_id = 5631)" " -> Bitmap Index Scan on banners_links_merchant_id_idx (cost=0.00..11.43 rows=424 width=0) (actual time=0.077..0.077 rows=424 loops=1)" " Index Cond: (merchant_id = 5631)" " -> Hash (cost=2.20..2.20 rows=20 width=43) (actual time=0.032..0.032 rows=20 loops=1)" " -> Seq Scan on banner_types (cost=0.00..2.20 rows=20 width=43) (actual time=0.004..0.018 rows=20 loops=1)" "Total runtime: 25.602 ms" We see that instead of a nested loop left join the planner now chooses a Hash Left Join. I'm not really an expert in this matter and would like some more insight into what's happening here. Naively I would say that a planner would have to be smart enough to see this by itself? Thanks in advance for all hints. _________________________________________________________________ Talk with your online friends with Messenger http://www.join.msn.com/messenger/overview
"henk de wit" <henk53602@hotmail.com> writes: > Naively I would say that a planner > would have to be smart enough to see this by itself? We got rid of direct tests for redundant WHERE clauses a long time ago (in 7.4, according to some quick tests I just made). They took a lot of cycles and almost never accomplished anything. Since you have two redundant tests, the selectivity is being double-counted, leading to a too-small rows estimate and a not very appropriate choice of join plan. FWIW, CVS HEAD does get rid of the duplicate conditions for the common case of mergejoinable equality operators --- but it's not explicitly looking for duplicate conditions, rather this is falling out of a new method for making transitive equality deductions. regards, tom lane
>Since you have two redundant tests, the selectivity is being >double-counted, leading to a too-small rows estimate and a not very >appropriate choice of join plan. I see, thanks for the explanation. I did notice though that in the second case, with 1 redundant test removed, the estimate is still low: "Hash Left Join (cost=1449.99..2392.68 rows=2 width=714) (actual time=24.257..25.292 rows=553 loops=1)" In that case the prediction is 2 rows, which is only 1 row more than in the previous case. Yet the plan is much better and performance improved dramatically. Is there a reason/explanation for that? >FWIW, CVS HEAD does get rid of the duplicate conditions for the common >case of mergejoinable equality operators --- but it's not explicitly >looking for duplicate conditions, rather this is falling out of a new >method for making transitive equality deductions. This sounds very interesting Tom. Is there some documentation somewhere where I can read about this new method? _________________________________________________________________ Live Search, for accurate results! http://www.live.nl
"henk de wit" <henk53602@hotmail.com> writes: > In that case the prediction is 2 rows, which is only 1 row more than in the > previous case. Yet the plan is much better and performance improved > dramatically. Is there a reason/explanation for that? Well, it's just an estimated-cost comparison. If there's only one row then a nestloop join looks like the best way since it requires no extra overhead. But as soon as you get to two rows, the other side of the join would have to be executed twice, and that's more expensive than doing it once and setting up a hash table. In the actual event, with 359 rows out of the scan, the nestloop way is just horrid because it repeats the other side 359 times :-( It strikes me that it might be interesting to use a minimum rowcount estimate of two rows, not one, for any case where we can't actually prove there is at most one row (ie, the query conditions match a unique index). That is probably enough to discourage this sort of brittle behavior ... though no doubt there'd still be cases where it's the wrong thing. We do not actually have any code right now to make such proofs, but there's been some discussion recently about adding such logic in support of removing useless outer joins. >> FWIW, CVS HEAD does get rid of the duplicate conditions for the common >> case of mergejoinable equality operators --- but it's not explicitly >> looking for duplicate conditions, rather this is falling out of a new >> method for making transitive equality deductions. > This sounds very interesting Tom. Is there some documentation somewhere > where I can read about this new method? Check the archives for mention of equivalence classes, notably these two threads: http://archives.postgresql.org/pgsql-hackers/2007-01/msg00568.php http://archives.postgresql.org/pgsql-hackers/2007-01/msg00826.php regards, tom lane
>In the actual event, with >359 rows out of the scan, the nestloop way is just horrid because it >repeats the other side 359 times :-( Indeed. :( Btw, I tried to apply the removal of the redundant check in the larger query (the one from which I extracted the part shown earlier) but it only performs worse after that. The more redundant checks I remove, the slower the query gets. I figure the original designer of the query inserted those checks to quickly limit the number of rows involved in the nested loop. Of course, the problem is probably not the number of rows involved, but the unfortunate choice of the nested loop. I spend a few hours today in trying to figure it all out, but I'm pretty stuck at the moment. For what its worth, this is the plan PG 8.2 comes up with right after I remove the same check that made the isolated query in the openings post so much faster: Sort (cost=6006.54..6006.55 rows=1 width=597) (actual time=14561.499..14561.722 rows=553 loops=1) Sort Key: public.banners_links.id -> Nested Loop Left Join (cost=3917.68..6006.53 rows=1 width=597) (actual time=64.723..14559.811 rows=553 loops=1) Join Filter: (public.banners_links.id = public.fetch_banners.banners_links_id) -> Nested Loop Left Join (cost=3917.68..6002.54 rows=1 width=527) (actual time=64.607..14509.291 rows=553 loops=1) Join Filter: (public.banners_links.id = reward_ratings.banner_id) -> Nested Loop Left Join (cost=2960.36..4395.12 rows=1 width=519) (actual time=52.761..8562.575 rows=553 loops=1) Join Filter: (public.banners_links.id = banners_banner_types.banner_id) -> Nested Loop Left Join (cost=2000.60..2956.57 rows=1 width=484) (actual time=32.026..304.700 rows=359 loops=1) Join Filter: (public.banners_links.id = ecpc_per_banner_link.banners_links_id) -> Nested Loop (cost=124.58..1075.70 rows=1 width=468) (actual time=9.793..187.724 rows=359 loops=1) -> Nested Loop Left Join (cost=124.58..1067.42 rows=1 width=89) (actual time=9.786..184.671 rows=359 loops=1) Join Filter: (public.banners_links.id = users_banners_tot_sub.banner_id) -> Hash Left Join (cost=107.97..1050.78 rows=1 width=81) (actual time=6.119..7.605 rows=359 loops=1) Hash Cond: (public.banners_links.id = special_deals.id) Filter: (special_deals.special_deal IS NULL) -> Bitmap Heap Scan on banners_links (cost=11.43..954.03 rows=2 width=73) (actual time=0.128..1.069 rows=359 loops=1) Recheck Cond: (merchant_id = 5631) Filter: ((status)::text = '0'::text) -> Bitmap Index Scan on banners_links_merchant_id_idx (cost=0.00..11.43 rows=424 width=0) (actual time=0.089..0.089 rows=424 loops=1) Index Cond: (merchant_id = 5631) -> Hash (cost=86.93..86.93 rows=769 width=16) (actual time=5.982..5.982 rows=780 loops=1) -> Subquery Scan special_deals (cost=69.62..86.93 rows=769 width=16) (actual time=4.179..5.414 rows=780 loops=1) -> HashAggregate (cost=69.62..79.24 rows=769 width=16) (actual time=4.179..4.702 rows=780 loops=1) -> Seq Scan on banner_deals (cost=0.00..53.75 rows=3175 width=16) (actual time=0.006..1.480 rows=3175 loops=1) -> HashAggregate (cost=16.61..16.62 rows=1 width=24) (actual time=0.011..0.292 rows=424 loops=359) -> Nested Loop (cost=0.00..16.60 rows=1 width=24) (actual time=0.029..3.096 rows=424 loops=1) -> Index Scan using users_banners_affiliate_id_idx on users_banners (cost=0.00..8.30 rows=1 width=16) (actual time=0.021..0.523 rows=424 loops=1) Index Cond: ((affiliate_id = 5631) AND (affiliate_id = 5631)) Filter: ((status)::text = '3'::text) -> Index Scan using users_banners_id_idx on users_banners_rotation (cost=0.00..8.29 rows=1 width=16) (actual time=0.003..0.004 rows=1 loops=424) Index Cond: (users_banners_rotation.users_banners_id = users_banners.id) -> Index Scan using banners_org_id_banner.idx on banners_org (cost=0.00..8.27 rows=1 width=387) (actual time=0.005..0.006 rows=1 loops=359) Index Cond: (public.banners_links.id = banners_org.id_banner) -> Sort (cost=1876.01..1876.50 rows=194 width=30) (actual time=0.062..0.183 rows=290 loops=359) Sort Key: CASE WHEN (precalculated_stats_banners_links.clicks_total > 0) THEN (((precalculated_stats_banners_links.revenue_total_affiliate / (precalculated_stats_banners_links.clicks_total)::numeric))::double precision / 1000::double precision) ELSE 0::double precision END -> Merge IN Join (cost=1819.78..1868.64 rows=194 width=30) (actual time=16.701..21.797 rows=290 loops=1) Merge Cond: (precalculated_stats_banners_links.banners_links_id = public.banners_links.id) -> Sort (cost=849.26..869.24 rows=7993 width=30) (actual time=12.416..15.717 rows=7923 loops=1) Sort Key: precalculated_stats_banners_links.banners_links_id -> Index Scan using pre_calc_banners_status on precalculated_stats_banners_links (cost=0.00..331.13 rows=7993 width=30) (actual time=0.006..6.209 rows=7923 loops=1) Index Cond: (status = 4) -> Sort (cost=970.52..971.58 rows=424 width=8) (actual time=0.885..1.049 rows=366 loops=1) Sort Key: public.banners_links.id -> Bitmap Heap Scan on banners_links (cost=11.54..952.02 rows=424 width=8) (actual time=0.121..0.549 rows=424 loops=1) Recheck Cond: (merchant_id = 5631) -> Bitmap Index Scan on banners_links_merchant_id_idx (cost=0.00..11.43 rows=424 width=0) (actual time=0.087..0.087 rows=424 loops=1) Index Cond: (merchant_id = 5631) -> Hash Join (cost=959.77..1432.13 rows=514 width=43) (actual time=0.900..22.684 rows=658 loops=359) Hash Cond: (banners_banner_types.type_id = banner_types.id) -> Hash IN Join (cost=957.32..1422.52 rows=540 width=16) (actual time=0.898..21.944 rows=658 loops=359) Hash Cond: (banners_banner_types.banner_id = public.banners_links.id) -> Seq Scan on banners_banner_types (cost=0.00..376.40 rows=22240 width=16) (actual time=0.004..10.184 rows=22240 loops=359) -> Hash (cost=952.02..952.02 rows=424 width=8) (actual time=0.751..0.751 rows=424 loops=1) -> Bitmap Heap Scan on banners_links (cost=11.54..952.02 rows=424 width=8) (actual time=0.127..0.470 rows=424 loops=1) Recheck Cond: (merchant_id = 5631) -> Bitmap Index Scan on banners_links_merchant_id_idx (cost=0.00..11.43 rows=424 width=0) (actual time=0.086..0.086 rows=424 loops=1) Index Cond: (merchant_id = 5631) -> Hash (cost=2.20..2.20 rows=20 width=43) (actual time=0.037..0.037 rows=20 loops=1) -> Seq Scan on banner_types (cost=0.00..2.20 rows=20 width=43) (actual time=0.004..0.015 rows=20 loops=1) -> Hash IN Join (cost=957.32..1606.26 rows=93 width=16) (actual time=10.751..10.751 rows=0 loops=553) Hash Cond: (reward_ratings.banner_id = public.banners_links.id) -> Seq Scan on reward_ratings (cost=0.00..633.66 rows=3827 width=16) (actual time=0.007..8.770 rows=4067 loops=553) Filter: ((now() >= period_start) AND (now() <= period_end)) -> Hash (cost=952.02..952.02 rows=424 width=8) (actual time=0.747..0.747 rows=424 loops=1) -> Bitmap Heap Scan on banners_links (cost=11.54..952.02 rows=424 width=8) (actual time=0.120..0.472 rows=424 loops=1) Recheck Cond: (merchant_id = 5631) -> Bitmap Index Scan on banners_links_merchant_id_idx (cost=0.00..11.43 rows=424 width=0) (actual time=0.088..0.088 rows=424 loops=1) Index Cond: (merchant_id = 5631) -> Seq Scan on fetch_banners (cost=0.00..2.88 rows=88 width=78) (actual time=0.003..0.042 rows=88 loops=553) Total runtime: 14562.251 ms The same check (merchant_id = 5631) still appears at 5 other places in the query. If I remove one other, the query goes to 20 seconds, if I then remove one other again it goes to 28 seconds, etc all the way to more than 40 seconds. I understand the above looks like a complicated mess, but would you have any pointers of what I could possibly do next to force a better plan? >Check the archives for mention of equivalence classes, notably these >two threads: >http://archives.postgresql.org/pgsql-hackers/2007-01/msg00568.php >http://archives.postgresql.org/pgsql-hackers/2007-01/msg00826.php I'm going to read those. Thanks for the references. _________________________________________________________________ Play online games with your friends with Messenger http://www.join.msn.com/messenger/overview
"henk de wit" <henk53602@hotmail.com> writes: > I understand the above looks like a complicated mess, but would you > have any pointers of what I could possibly do next to force a better plan? Taking a closer look, it seems the problem is the underestimation of the number of rows resulting from this relation scan: > -> Bitmap Heap Scan on > banners_links (cost=11.43..954.03 rows=2 width=73) (actual > time=0.128..1.069 rows=359 loops=1) > Recheck Cond: (merchant_id = 5631) > Filter: ((status)::text = '0'::text) > -> Bitmap Index Scan on > banners_links_merchant_id_idx (cost=0.00..11.43 rows=424 width=0) (actual > time=0.089..0.089 rows=424 loops=1) > Index Cond: (merchant_id = 5631) You might be able to improve matters by increasing the statistics target for this table. I have a bad feeling though that the problem may be lack of cross-column statistics --- the thing is evidently assuming that only about 1 in 200 rows have status = '0', which might be accurate as a global average but not for this particular merchant. What exactly is the relationship between status and merchant_id, anyway? regards, tom lane
>You might be able to improve matters by increasing the statistics target >for this table. I have tried to increase the statistics for the status column to the maximum of 1000. After that I performed an analyze, vacuum analyze and vacuum full analyze on the table. Unfortunately this didn't seem to make any difference. >I have a bad feeling though that the problem may be >lack of cross-column statistics I assume this isn't a thing that can be tweaked/increased in PG 8.2? >--- the thing is evidently assuming >that only about 1 in 200 rows have status = '0', which might be accurate >as a global average but not for this particular merchant. What exactly >is the relationship between status and merchant_id, anyway? The meaning is that a "banners_link" belongs to a merchant with the id merchant_id. A "banners_link" can be disabled (status 1) or enabled (status 0). Globally about 1/3 of the banners_links have status 0 and 2/3 have status 1. The 1 in 200 estimate is indeed way off. For the merchant in question the numbers are a bit different though, but not that much. Out of 424 rows total, 359 have status 0 and 65 have status 1. _________________________________________________________________ FREE pop-up blocking with the new Windows Live Toolbar - get it now! http://toolbar.msn.click-url.com/go/onm00200415ave/direct/01/
One interesting other thing to note; if I remove the banners_links.status = 0 condition from the query altogether the execution times improve dramatically again. The results are not correct right now, but if worse comes to worst I can always remove the unwanted rows in a procedural language (it's a simple case of iterating a resultset and omitting rows with status 1). Of course this would not really be a neat solution. Anyway, the plan without the status = 0 condition now looks like this: Sort (cost=6058.87..6058.88 rows=2 width=597) (actual time=305.869..306.138 rows=658 loops=1) Sort Key: public.banners_links.id -> Nested Loop Left Join (cost=5051.23..6058.86 rows=2 width=597) (actual time=69.956..304.259 rows=658 loops=1) Join Filter: (public.banners_links.id = public.fetch_banners.banners_links_id) -> Nested Loop Left Join (cost=5048.26..6051.92 rows=2 width=527) (actual time=69.715..249.122 rows=658 loops=1) Join Filter: (public.banners_links.id = reward_ratings.banner_id) -> Nested Loop Left Join (cost=3441.91..4441.39 rows=2 width=519) (actual time=57.795..235.954 rows=658 loops=1) Join Filter: (public.banners_links.id = ecpc_per_banner_link.banners_links_id) -> Nested Loop (cost=1563.28..2554.02 rows=2 width=503) (actual time=35.359..42.018 rows=658 loops=1) -> Hash Left Join (cost=1563.28..2545.93 rows=2 width=124) (actual time=35.351..37.987 rows=658 loops=1) Hash Cond: (public.banners_links.id = users_banners_tot_sub.banner_id) -> Hash Left Join (cost=1546.63..2529.27 rows=2 width=116) (actual time=30.757..32.552 rows=658 loops=1) Hash Cond: (public.banners_links.id = banners_banner_types.banner_id) -> Hash Left Join (cost=108.08..1090.62 rows=2 width=81) (actual time=6.087..7.085 rows=424 loops=1) Hash Cond: (public.banners_links.id = special_deals.id) Filter: (special_deals.special_deal IS NULL) -> Bitmap Heap Scan on banners_links (cost=11.54..952.02 rows=424 width=73) (actual time=0.125..0.514 rows=424 loops=1) Recheck Cond: (merchant_id = 5631) -> Bitmap Index Scan on banners_links_merchant_id_idx (cost=0.00..11.43 rows=424 width=0) (actual time=0.089..0.089 rows=424 loops=1) Index Cond: (merchant_id = 5631) -> Hash (cost=86.93..86.93 rows=769 width=16) (actual time=5.951..5.951 rows=780 loops=1) -> Subquery Scan special_deals (cost=69.62..86.93 rows=769 width=16) (actual time=4.164..5.389 rows=780 loops=1) -> HashAggregate (cost=69.62..79.24 rows=769 width=16) (actual time=4.164..4.670 rows=780 loops=1) -> Seq Scan on banner_deals (cost=0.00..53.75 rows=3175 width=16) (actual time=0.005..1.496 rows=3175 loops=1) -> Hash (cost=1432.13..1432.13 rows=514 width=43) (actual time=24.661..24.661 rows=658 loops=1) -> Hash Join (cost=959.77..1432.13 rows=514 width=43) (actual time=1.780..24.147 rows=658 loops=1) Hash Cond: (banners_banner_types.type_id = banner_types.id) -> Hash IN Join (cost=957.32..1422.52 rows=540 width=16) (actual time=1.738..23.332 rows=658 loops=1) Hash Cond: (banners_banner_types.banner_id = public.banners_links.id) -> Seq Scan on banners_banner_types (cost=0.00..376.40 rows=22240 width=16) (actual time=0.005..10.355 rows=22240 loops=1) -> Hash (cost=952.02..952.02 rows=424 width=8) (actual time=0.808..0.808 rows=424 loops=1) -> Bitmap Heap Scan on banners_links (cost=11.54..952.02 rows=424 width=8) (actual time=0.114..0.515 rows=424 loops=1) Recheck Cond: (merchant_id = 5631) -> Bitmap Index Scan on banners_links_merchant_id_idx (cost=0.00..11.43 rows=424 width=0) (actual time=0.085..0.085 rows=424 loops=1) Index Cond: (merchant_id = 5631) -> Hash (cost=2.20..2.20 rows=20 width=43) (actual time=0.034..0.034 rows=20 loops=1) -> Seq Scan on banner_types (cost=0.00..2.20 rows=20 width=43) (actual time=0.004..0.016 rows=20 loops=1) -> Hash (cost=16.63..16.63 rows=1 width=24) (actual time=4.582..4.582 rows=424 loops=1) -> Subquery Scan users_banners_tot_sub (cost=16.61..16.63 rows=1 width=24) (actual time=3.548..4.235 rows=424 loops=1) -> HashAggregate (cost=16.61..16.62 rows=1 width=24) (actual time=3.547..3.850 rows=424 loops=1) -> Nested Loop (cost=0.00..16.60 rows=1 width=24) (actual time=0.031..3.085 rows=424 loops=1) -> Index Scan using users_banners_affiliate_id_idx on users_banners (cost=0.00..8.30 rows=1 width=16) (actual time=0.021..0.516 rows=424 loops=1) Index Cond: ((affiliate_id = 5631) AND (affiliate_id = 5631)) Filter: ((status)::text = '3'::text) -> Index Scan using users_banners_id_idx on users_banners_rotation (cost=0.00..8.29 rows=1 width=16) (actual time=0.003..0.004 rows=1 loops=424) Index Cond: (users_banners_rotation.users_banners_id = users_banners.id) -> Index Scan using banners_org_id_banner.idx on banners_org (cost=0.00..4.03 rows=1 width=387) (actual time=0.003..0.004 rows=1 loops=658) Index Cond: (public.banners_links.id = banners_org.id_banner) -> Materialize (cost=1878.63..1880.57 rows=194 width=20) (actual time=0.034..0.153 rows=290 loops=658) -> Sort (cost=1876.01..1876.50 rows=194 width=30) (actual time=22.105..22.230 rows=290 loops=1) Sort Key: CASE WHEN (precalculated_stats_banners_links.clicks_total > 0) THEN (((precalculated_stats_banners_links.revenue_total_affiliate / (precalculated_stats_banners_links.clicks_total)::numeric))::double precision / 1000::double precision) ELSE 0::double precision END -> Merge IN Join (cost=1819.78..1868.64 rows=194 width=30) (actual time=16.723..21.832 rows=290 loops=1) Merge Cond: (precalculated_stats_banners_links.banners_links_id = public.banners_links.id) -> Sort (cost=849.26..869.24 rows=7993 width=30) (actual time=12.474..15.725 rows=7923 loops=1) Sort Key: precalculated_stats_banners_links.banners_links_id -> Index Scan using pre_calc_banners_status on precalculated_stats_banners_links (cost=0.00..331.13 rows=7993 width=30) (actual time=0.007..6.220 rows=7923 loops=1) Index Cond: (status = 4) -> Sort (cost=970.52..971.58 rows=424 width=8) (actual time=0.862..1.012 rows=366 loops=1) Sort Key: public.banners_links.id -> Bitmap Heap Scan on banners_links (cost=11.54..952.02 rows=424 width=8) (actual time=0.121..0.490 rows=424 loops=1) Recheck Cond: (merchant_id = 5631) -> Bitmap Index Scan on banners_links_merchant_id_idx (cost=0.00..11.43 rows=424 width=0) (actual time=0.087..0.087 rows=424 loops=1) Index Cond: (merchant_id = 5631) -> Materialize (cost=1606.35..1607.28 rows=93 width=16) (actual time=0.019..0.019 rows=0 loops=658) -> Hash IN Join (cost=957.32..1606.25 rows=93 width=16) (actual time=11.916..11.916 rows=0 loops=1) Hash Cond: (reward_ratings.banner_id = public.banners_links.id) -> Seq Scan on reward_ratings (cost=0.00..633.66 rows=3826 width=16) (actual time=0.016..9.190 rows=4067 loops=1) Filter: ((now() >= period_start) AND (now() <= period_end)) -> Hash (cost=952.02..952.02 rows=424 width=8) (actual time=0.738..0.738 rows=424 loops=1) -> Bitmap Heap Scan on banners_links (cost=11.54..952.02 rows=424 width=8) (actual time=0.118..0.459 rows=424 loops=1) Recheck Cond: (merchant_id = 5631) -> Bitmap Index Scan on banners_links_merchant_id_idx (cost=0.00..11.43 rows=424 width=0) (actual time=0.086..0.086 rows=424 loops=1) Index Cond: (merchant_id = 5631) -> Materialize (cost=2.97..3.85 rows=88 width=78) (actual time=0.000..0.037 rows=88 loops=658) -> Seq Scan on fetch_banners (cost=0.00..2.88 rows=88 width=78) (actual time=0.005..0.052 rows=88 loops=1) Total runtime: 306.734 ms _________________________________________________________________ Play online games with your friends with Messenger http://www.join.msn.com/messenger/overview
"henk de wit" <henk53602@hotmail.com> writes: >> --- the thing is evidently assuming >> that only about 1 in 200 rows have status = '0', which might be accurate >> as a global average but not for this particular merchant. What exactly >> is the relationship between status and merchant_id, anyway? > The meaning is that a "banners_link" belongs to a merchant with the id > merchant_id. A "banners_link" can be disabled (status 1) or enabled (status > 0). Globally about 1/3 of the banners_links have status 0 and 2/3 have > status 1. The 1 in 200 estimate is indeed way off. Well, that's darn odd. It should not be getting that so far wrong. What's the datatype of the status column exactly (I'm guessing varchar but maybe not)? Would you show us the pg_stats row for the status column? > One interesting other thing to note; if I remove the banners_links.status = > 0 condition from the query altogether the execution times improve > dramatically again. Right, because it was dead on about how many merchant_id = 5631 rows there are. The estimation error is creeping in where it guesses how selective the status filter is. It should be using the global fraction of status = 0 rows for that, but it seems to be using a default estimate instead (1/200 is actually the default eqsel estimate now that I think about it). I'm not sure why, but I think it must have something to do with the subquery structure of your query. Were you showing us the whole truth about your query, or were there details you left out? regards, tom lane
>Well, that's darn odd. It should not be getting that so far wrong. >What's the datatype of the status column exactly (I'm guessing varchar >but maybe not)? Would you show us the pg_stats row for the status column? It has been created as a char(1) in fact. The pg_stats row for the status column is: public|banners_links|status|0|5|2|{0,1}|{0.626397,0.373603}||0.560611 >I'm not sure why, but I think it must have something to do >with the subquery structure of your query. Were you showing us the >whole truth about your query, or were there details you left out? The query I gave in the opening post was just a small part, the part that I initially identified as the 'slow path'. The last plan I gave was from the whole query, without any details left out. I didn't gave the SQL of that yet, so here it is: SELECT id, status, merchant_id, description, org_text, users_banners_id, banner_url, cookie_redirect, type, CASE WHEN special_deal IS null THEN '' ELSE 'special deal' END AS special_deal, CASE WHEN url_of_banner IS null THEN '' ELSE url_of_banner END AS url_of_banner, CASE WHEN period_end IS NULL THEN 'not_active' ELSE 'active' END AS active_not_active, CASE WHEN ecpc IS NULL THEN 0.00 ELSE ROUND(ecpc::numeric,2) END AS ecpc, CASE WHEN ecpc_merchant IS NULL THEN 0.00 ELSE ROUND(ecpc_merchant::numeric,2) END AS ecpc_merchant FROM /* SUBQUERY grand_total_fetch_banners */ ( /* SUBQUERY grand_total */( /* SUBQUERY banners_special_deals */ ( /* SUBQUERY banners */ ( SELECT * FROM /* SUBQUERY banners_links */ ( SELECT banners_links.id, merchant_id, banners_org.banner_text AS org_text, description, status, banner_url, ecpc, ecpc_merchant, COALESCE(cookie_redirect,0) AS cookie_redirect FROM /* SUBQUERY banners_links */ ( /* subselect tot join ecpc_per_banner_links on banners_links*/ /* SUBQUERY banners_links */ ( SELECT * FROM banners_links WHERE merchant_id = 5631 ) AS banners_links LEFT OUTER JOIN /* SUBQUERY ecpc_per_banner_link */ ( SELECT CASE WHEN clicks_total > 0 THEN (revenue_total_affiliate/clicks_total)::float/1000.0 ELSE 0.0 END AS ecpc, CASE WHEN clicks_total > 0 THEN (revenue_total/clicks_total)::float/1000.0 ELSE 0.0 END AS ecpc_merchant, banners_links_id FROM precalculated_stats_banners_links WHERE status = 4 AND banners_links_id IN /* SUBQUERY */ ( SELECT id FROM banners_links WHERE merchant_id = 5631 ) ORDER BY ecpc DESC ) AS ecpc_per_banner_link ON (banners_links.id = ecpc_per_banner_link.banners_links_id) ) AS banners_links , banners_org WHERE merchant_id = 5631 AND banners_links.id = banners_org.id_banner AND (banners_links.id = -1 OR -1 = -1) AND (banners_links.status = 0 OR 0 = -1) ) AS banners_links LEFT OUTER JOIN /* SUBQUERY users_banners_tot_sub */( SELECT MAX (users_banners_id) AS users_banners_id, merchant_users_banners_id, banner_id FROM /* SUBQUERY users_banners_rotations_sub */( SELECT affiliate_id AS merchant_users_banners_id, users_banners.id AS users_banners_id, users_banners_rotation.banner_id FROM users_banners, users_banners_rotation WHERE affiliate_id = 5631 AND users_banners_rotation.users_banners_id = users_banners.id AND users_banners.status = 3 ) AS users_banners_rotations_sub GROUP BY merchant_users_banners_id,banner_id ) AS users_banners_tot_sub ON ( banners_links.id = users_banners_tot_sub.banner_id AND banners_links.merchant_id = users_banners_tot_sub.merchant_users_banners_id ) ) AS banners LEFT OUTER JOIN /* SUBQUERY special_deals */( SELECT banner_deals.banner_id AS id, MAX(affiliate_id) AS special_deal FROM banner_deals GROUP BY banner_deals.banner_id ) AS special_deals USING (id) ) AS banners_special_deals LEFT OUTER JOIN /* SUBQUERY types */ ( SELECT banner_types.id AS type_id, banner_types.type AS type, banners_banner_types.banner_id AS id FROM banner_types,banners_banner_types WHERE banners_banner_types.banner_id IN /* SUBQUERY */ ( SELECT id FROM banners_links WHERE merchant_id = 5631 ) AND banners_banner_types.type_id = banner_types.id ) AS types USING (id) ) as grand_total LEFT OUTER JOIN /* SUBQUERY fetch_banners */ ( SELECT banners_links_id AS id, url_of_banner FROM fetch_banners ) AS fetch_banners USING (id) ) AS grand_total_fetch_banners LEFT OUTER JOIN /* SUBQUERY active_banners */ ( SELECT banner_id AS id, period_end FROM reward_ratings WHERE now() BETWEEN period_start AND period_end AND banner_id IN /* SUBQUERY */ ( SELECT id FROM banners_links WHERE merchant_id = 5631 ) ) AS active_banners USING (id) WHERE (type_id = -1 OR -1 = -1 ) AND (special_deal IS null) ORDER BY id DESC This is the original query without even the earlier mentioned redundant check removed. For this query, PG 8.2 creates the following plan: Sort (cost=5094.40..5094.41 rows=1 width=597) (actual time=15282.503..15282.734 rows=553 loops=1) Sort Key: public.banners_links.id -> Nested Loop Left Join (cost=3883.68..5094.39 rows=1 width=597) (actual time=64.066..15280.773 rows=553 loops=1) Join Filter: (public.banners_links.id = reward_ratings.banner_id) -> Nested Loop Left Join (cost=2926.37..3486.98 rows=1 width=589) (actual time=51.992..9231.245 rows=553 loops=1) Join Filter: (public.banners_links.id = public.fetch_banners.banners_links_id) -> Nested Loop Left Join (cost=2926.37..3483.00 rows=1 width=519) (actual time=51.898..9183.007 rows=553 loops=1) Join Filter: (public.banners_links.id = ecpc_per_banner_link.banners_links_id) -> Nested Loop (cost=1050.35..1602.14 rows=1 width=503) (actual time=29.585..9015.077 rows=553 loops=1) -> Nested Loop Left Join (cost=1050.35..1593.86 rows=1 width=124) (actual time=29.577..9010.273 rows=553 loops=1) Join Filter: (public.banners_links.id = users_banners_tot_sub.banner_id) -> Nested Loop Left Join (cost=1033.74..1577.21 rows=1 width=116) (actual time=25.904..8738.006 rows=553 loops=1) Join Filter: (public.banners_links.id = special_deals.id) Filter: (special_deals.special_deal IS NULL) -> Nested Loop Left Join (cost=964.12..1480.67 rows=1 width=108) (actual time=20.905..8259.497 rows=553 loops=1) Join Filter: (public.banners_links.id = banners_banner_types.banner_id) -> Bitmap Heap Scan on banners_links (cost=4.35..42.12 rows=1 width=73) (actual time=0.160..1.122 rows=359 loops=1) Recheck Cond: ((merchant_id = 5631) AND (merchant_id = 5631)) Filter: ((status)::text = '0'::text) -> Bitmap Index Scan on banners_links_merchant_id_idx (cost=0.00..4.35 rows=10 width=0) (actual time=0.123..0.123 rows=424 loops=1) Index Cond: ((merchant_id = 5631) AND (merchant_id = 5631)) -> Hash Join (cost=959.77..1432.13 rows=514 width=43) (actual time=0.899..22.685 rows=658 loops=359) Hash Cond: (banners_banner_types.type_id = banner_types.id) -> Hash IN Join (cost=957.32..1422.52 rows=540 width=16) (actual time=0.897..21.946 rows=658 loops=359) Hash Cond: (banners_banner_types.banner_id = public.banners_links.id) -> Seq Scan on banners_banner_types (cost=0.00..376.40 rows=22240 width=16) (actual time=0.004..10.164 rows=22240 loops=359) -> Hash (cost=952.02..952.02 rows=424 width=8) (actual time=0.790..0.790 rows=424 loops=1) -> Bitmap Heap Scan on banners_links (cost=11.54..952.02 rows=424 width=8) (actual time=0.108..0.503 rows=424 loops=1) Recheck Cond: (merchant_id = 5631) -> Bitmap Index Scan on banners_links_merchant_id_idx (cost=0.00..11.43 rows=424 width=0) (actual time=0.078..0.078 rows=424 loops=1) Index Cond: (merchant_id = 5631) -> Hash (cost=2.20..2.20 rows=20 width=43) (actual time=0.033..0.033 rows=20 loops=1) -> Seq Scan on banner_types (cost=0.00..2.20 rows=20 width=43) (actual time=0.004..0.017 rows=20 loops=1) -> HashAggregate (cost=69.62..79.24 rows=769 width=16) (actual time=0.008..0.498 rows=780 loops=553) -> Seq Scan on banner_deals (cost=0.00..53.75 rows=3175 width=16) (actual time=0.004..1.454 rows=3175 loops=1) -> HashAggregate (cost=16.61..16.62 rows=1 width=24) (actual time=0.007..0.291 rows=424 loops=553) -> Nested Loop (cost=0.00..16.60 rows=1 width=24) (actual time=0.056..3.123 rows=424 loops=1) -> Index Scan using users_banners_affiliate_id_idx on users_banners (cost=0.00..8.30 rows=1 width=16) (actual time=0.046..0.555 rows=424 loops=1) Index Cond: ((affiliate_id = 5631) AND (affiliate_id = 5631)) Filter: ((status)::text = '3'::text) -> Index Scan using users_banners_id_idx on users_banners_rotation (cost=0.00..8.29 rows=1 width=16) (actual time=0.003..0.004 rows=1 loops=424) Index Cond: (users_banners_rotation.users_banners_id = users_banners.id) -> Index Scan using "banners_org_id_banner.idx" on banners_org (cost=0.00..8.27 rows=1 width=387) (actual time=0.005..0.006 rows=1 loops=553) Index Cond: (public.banners_links.id = banners_org.id_banner) -> Sort (cost=1876.01..1876.50 rows=194 width=30) (actual time=0.041..0.161 rows=290 loops=553) Sort Key: CASE WHEN (precalculated_stats_banners_links.clicks_total > 0) THEN (((precalculated_stats_banners_links.revenue_total_affiliate / (precalculated_stats_banners_links.clicks_total)::numeric))::double precision / 1000::double precision) ELSE 0::double precision END -> Merge IN Join (cost=1819.78..1868.64 rows=194 width=30) (actual time=16.769..21.879 rows=290 loops=1) Merge Cond: (precalculated_stats_banners_links.banners_links_id = public.banners_links.id) -> Sort (cost=849.26..869.24 rows=7993 width=30) (actual time=12.486..15.740 rows=7923 loops=1) Sort Key: precalculated_stats_banners_links.banners_links_id -> Index Scan using pre_calc_banners_status on precalculated_stats_banners_links (cost=0.00..331.13 rows=7993 width=30) (actual time=0.007..6.291 rows=7923 loops=1) Index Cond: (status = 4) -> Sort (cost=970.52..971.58 rows=424 width=8) (actual time=0.879..1.023 rows=366 loops=1) Sort Key: public.banners_links.id -> Bitmap Heap Scan on banners_links (cost=11.54..952.02 rows=424 width=8) (actual time=0.123..0.509 rows=424 loops=1) Recheck Cond: (merchant_id = 5631) -> Bitmap Index Scan on banners_links_merchant_id_idx (cost=0.00..11.43 rows=424 width=0) (actual time=0.089..0.089 rows=424 loops=1) Index Cond: (merchant_id = 5631) -> Seq Scan on fetch_banners (cost=0.00..2.88 rows=88 width=78) (actual time=0.003..0.042 rows=88 loops=553) -> Hash IN Join (cost=957.32..1606.24 rows=93 width=16) (actual time=10.933..10.933 rows=0 loops=553) Hash Cond: (reward_ratings.banner_id = public.banners_links.id) -> Seq Scan on reward_ratings (cost=0.00..633.66 rows=3822 width=16) (actual time=0.007..8.955 rows=4067 loops=553) Filter: ((now() >= period_start) AND (now() <= period_end)) -> Hash (cost=952.02..952.02 rows=424 width=8) (actual time=0.738..0.738 rows=424 loops=1) -> Bitmap Heap Scan on banners_links (cost=11.54..952.02 rows=424 width=8) (actual time=0.118..0.475 rows=424 loops=1) Recheck Cond: (merchant_id = 5631) -> Bitmap Index Scan on banners_links_merchant_id_idx (cost=0.00..11.43 rows=424 width=0) (actual time=0.087..0.087 rows=424 loops=1) Index Cond: (merchant_id = 5631) Total runtime: 15283.225 ms If I change 1 of the redundant checks: /* SUBQUERY banners_links */ ( SELECT * FROM banners_links WHERE merchant_id = 5631 ) AS banners_links into just banner_links, PG comes up with the (large) plan I posted earlier. _________________________________________________________________ Live Search, for accurate results! http://www.live.nl
>Well, that's darn odd. It should not be getting that so far wrong. I've been puzzling on this for over a week now, but can't seem to find a solution. Would you have some more hints of what I could possibly try next? As far as I can see, the mentioned status column is just a simple column. Or could it be related to the fact that its type is var(1)? Would that confuse the planner? _________________________________________________________________ Talk with your online friends with Messenger http://www.join.msn.com/messenger/overview
Perhaps one other interesting observation; when I earlier removed the status check for which the rows got so wrongly estimated, the query got dramatically faster. However, once I also remove all redundant checks the query gets slower again. This is the query with both status and redundant check removed: SELECT id, status, merchant_id, description, org_text, users_banners_id, banner_url, cookie_redirect, type, CASE WHEN special_deal IS null THEN '' ELSE 'special deal' END AS special_deal, CASE WHEN url_of_banner IS null THEN '' ELSE url_of_banner END AS url_of_banner, CASE WHEN period_end IS NULL THEN 'not_active' ELSE 'active' END AS active_not_active, CASE WHEN ecpc IS NULL THEN 0.00 ELSE ROUND(ecpc::numeric,2) END AS ecpc, CASE WHEN ecpc_merchant IS NULL THEN 0.00 ELSE ROUND(ecpc_merchant::numeric,2) END AS ecpc_merchant FROM /* SUBQUERY grand_total_fetch_banners */ ( /* SUBQUERY grand_total */( /* SUBQUERY banners_special_deals */ ( /* SUBQUERY banners */ ( SELECT * FROM /* SUBQUERY banners_links */ ( SELECT banners_links.id, merchant_id, banners_org.banner_text AS org_text, description, status, banner_url, ecpc, ecpc_merchant, COALESCE(cookie_redirect,0) AS cookie_redirect FROM /* SUBQUERY banners_links */ ( /* subselect tot join ecpc_per_banner_links on banners_links*/ /* SUBQUERY banners_links */ ( SELECT * FROM banners_links WHERE merchant_id = 217 ) AS banners_links LEFT OUTER JOIN /* SUBQUERY ecpc_per_banner_link */ ( SELECT CASE WHEN clicks_total > 0 THEN (revenue_total_affiliate/clicks_total)::float/1000.0 ELSE 0.0 END AS ecpc, CASE WHEN clicks_total > 0 THEN (revenue_total/clicks_total)::float/1000.0 ELSE 0.0 END AS ecpc_merchant, banners_links_id FROM precalculated_stats_banners_links WHERE status = 4 ) AS ecpc_per_banner_link ON (banners_links.id = ecpc_per_banner_link.banners_links_id) ) AS banners_links , banners_org WHERE banners_links.id = banners_org.id_banner AND (banners_links.id = -1 OR -1 = -1) ) AS banners_links LEFT OUTER JOIN /* SUBQUERY users_banners_tot_sub */( SELECT MAX (users_banners_id) AS users_banners_id, merchant_users_banners_id, banner_id FROM /* SUBQUERY users_banners_rotations_sub */( SELECT affiliate_id AS merchant_users_banners_id, users_banners.id AS users_banners_id, users_banners_rotation.banner_id FROM users_banners, users_banners_rotation WHERE users_banners_rotation.users_banners_id = users_banners.id AND users_banners.status = 3 ) AS users_banners_rotations_sub GROUP BY merchant_users_banners_id,banner_id ) AS users_banners_tot_sub ON ( banners_links.id = users_banners_tot_sub.banner_id AND banners_links.merchant_id = users_banners_tot_sub.merchant_users_banners_id ) ) AS banners LEFT OUTER JOIN /* SUBQUERY special_deals */( SELECT banner_deals.banner_id AS id, MAX(affiliate_id) AS special_deal FROM banner_deals GROUP BY banner_deals.banner_id ) AS special_deals USING (id) ) AS banners_special_deals LEFT OUTER JOIN /* SUBQUERY types */ ( SELECT banner_types.id AS type_id, banner_types.type AS type, banners_banner_types.banner_id AS id FROM banner_types,banners_banner_types WHERE banners_banner_types.type_id = banner_types.id ) AS types USING (id) ) as grand_total LEFT OUTER JOIN /* SUBQUERY fetch_banners */ ( SELECT banners_links_id AS id, url_of_banner FROM fetch_banners ) AS fetch_banners USING (id) ) AS grand_total_fetch_banners LEFT OUTER JOIN /* SUBQUERY active_banners */ ( SELECT banner_id AS id, period_end FROM reward_ratings WHERE now() BETWEEN period_start AND period_end ) AS active_banners USING (id) WHERE (type_id = -1 OR -1 = -1 ) AND (special_deal IS null) ORDER BY id DESC For this query, PG comes up with the following plan: Sort (cost=3772.80..3772.81 rows=2 width=597) (actual time=3203.143..3203.315 rows=436 loops=1) Sort Key: public.banners_links.id -> Nested Loop Left Join (cost=2345.33..3772.79 rows=2 width=597) (actual time=108.926..3201.931 rows=436 loops=1) -> Nested Loop Left Join (cost=2341.06..3742.03 rows=2 width=589) (actual time=108.902..3197.302 rows=436 loops=1) Join Filter: (public.banners_links.id = ecpc_per_banner_link.banners_links_id) -> Nested Loop (cost=1722.18..2763.47 rows=2 width=573) (actual time=68.228..78.611 rows=436 loops=1) -> Hash Left Join (cost=1722.18..2754.88 rows=2 width=194) (actual time=68.219..75.916 rows=436 loops=1) Hash Cond: (public.banners_links.id = users_banners_tot_sub.banner_id) -> Nested Loop Left Join (cost=1227.70..2260.38 rows=2 width=186) (actual time=61.822..68.891 rows=436 loops=1) -> Hash Left Join (cost=1227.70..2259.73 rows=2 width=116) (actual time=61.811..67.321 rows=436 loops=1) Hash Cond: (public.banners_links.id = banners_banner_types.banner_id) -> Hash Left Join (cost=103.40..946.54 rows=2 width=81) (actual time=6.135..7.009 rows=331 loops=1) Hash Cond: (public.banners_links.id = special_deals.id) Filter: (special_deals.special_deal IS NULL) -> Bitmap Heap Scan on banners_links (cost=6.86..816.67 rows=336 width=73) (actual time=0.111..0.496 rows=336 loops=1) Recheck Cond: (merchant_id = 217) -> Bitmap Index Scan on banners_links_merchant_id_idx (cost=0.00..6.77 rows=336 width=0) (actual time=0.079..0.079 rows=336 loops=1) Index Cond: (merchant_id = 217) -> Hash (cost=86.93..86.93 rows=769 width=16) (actual time=6.012..6.012 rows=780 loops=1) -> Subquery Scan special_deals (cost=69.62..86.93 rows=769 width=16) (actual time=4.240..5.451 rows=780 loops=1) -> HashAggregate (cost=69.62..79.24 rows=769 width=16) (actual time=4.239..4.748 rows=780 loops=1) -> Seq Scan on banner_deals (cost=0.00..53.75 rows=3175 width=16) (actual time=0.006..1.485 rows=3175 loops=1) -> Hash (cost=673.83..673.83 rows=21158 width=43) (actual time=55.659..55.659 rows=22112 loops=1) -> Hash Join (cost=2.45..673.83 rows=21158 width=43) (actual time=0.047..36.885 rows=22112 loops=1) Hash Cond: (banners_banner_types.type_id = banner_types.id) -> Seq Scan on banners_banner_types (cost=0.00..376.40 rows=22240 width=16) (actual time=0.005..10.653 rows=22240 loops=1) -> Hash (cost=2.20..2.20 rows=20 width=43) (actual time=0.034..0.034 rows=20 loops=1) -> Seq Scan on banner_types (cost=0.00..2.20 rows=20 width=43) (actual time=0.003..0.016 rows=20 loops=1) -> Index Scan using fetch_banners_banners_links_id_idx on fetch_banners (cost=0.00..0.32 rows=1 width=78) (actual time=0.002..0.002 rows=0 loops=436) Index Cond: (public.banners_links.id = public.fetch_banners.banners_links_id) -> Hash (cost=494.34..494.34 rows=11 width=24) (actual time=6.378..6.378 rows=336 loops=1) -> Subquery Scan users_banners_tot_sub (cost=494.09..494.34 rows=11 width=24) (actual time=5.588..6.124 rows=336 loops=1) -> HashAggregate (cost=494.09..494.23 rows=11 width=24) (actual time=5.586..5.810 rows=336 loops=1) -> Nested Loop (cost=360.46..494.01 rows=11 width=24) (actual time=2.876..5.232 rows=336 loops=1) -> Bitmap Heap Scan on users_banners (cost=360.46..402.65 rows=11 width=16) (actual time=2.863..3.133 rows=336 loops=1) Recheck Cond: ((affiliate_id = 217) AND ((status)::text = '3'::text)) -> BitmapAnd (cost=360.46..360.46 rows=11 width=0) (actual time=2.842..2.842 rows=0 loops=1) -> Bitmap Index Scan on users_banners_affiliate_id_idx (cost=0.00..5.31 rows=138 width=0) (actual time=0.072..0.072 rows=350 loops=1) Index Cond: (affiliate_id = 217) -> Bitmap Index Scan on users_banners_status_idx (cost=0.00..354.90 rows=19016 width=0) (actual time=2.741..2.741 rows=17406 loops=1) Index Cond: ((status)::text = '3'::text) -> Index Scan using users_banners_id_idx on users_banners_rotation (cost=0.00..8.29 rows=1 width=16) (actual time=0.004..0.004 rows=1 loops=336) Index Cond: (users_banners_rotation.users_banners_id = users_banners.id) -> Index Scan using banners_org_id_banner.idx on banners_org (cost=0.00..4.28 rows=1 width=387) (actual time=0.003..0.004 rows=1 loops=436) Index Cond: (public.banners_links.id = banners_org.id_banner) -> Materialize (cost=618.88..698.81 rows=7993 width=20) (actual time=0.000..3.299 rows=7923 loops=436) -> Index Scan using pre_calc_banners_status on precalculated_stats_banners_links (cost=0.00..530.96 rows=7993 width=30) (actual time=0.025..26.349 rows=7923 loops=1) Index Cond: (status = 4) -> Bitmap Heap Scan on reward_ratings (cost=4.27..15.33 rows=3 width=16) (actual time=0.005..0.005 rows=0 loops=436) Recheck Cond: (public.banners_links.id = reward_ratings.banner_id) Filter: ((now() >= period_start) AND (now() <= period_end)) -> Bitmap Index Scan on reward_ratings_banner_id_idx (cost=0.00..4.27 rows=3 width=0) (actual time=0.003..0.003 rows=0 loops=436) Index Cond: (public.banners_links.id = reward_ratings.banner_id) Total runtime: 3204.016 ms For the "banners_links.id = ecpc_per_banner_link.banners_links_id" join, it chooses the dreaded Nested loop left join again, which takes up the bulk of the query execution time. After some fiddling and experimentation with the query, I found that if I only removed both case statements in the "ecpc_per_banner_link" subquery, it becomes fast again: Sort (cost=2875.63..2875.64 rows=6 width=599) (actual time=107.824..109.456 rows=1780 loops=1) Sort Key: public.banners_links.id -> Nested Loop Left Join (cost=1726.45..2875.55 rows=6 width=599) (actual time=68.243..98.013 rows=1780 loops=1) -> Nested Loop Left Join (cost=1722.18..2783.30 rows=6 width=591) (actual time=68.220..84.351 rows=1780 loops=1) -> Nested Loop (cost=1722.18..2763.47 rows=2 width=573) (actual time=68.210..78.427 rows=436 loops=1) -> Hash Left Join (cost=1722.18..2754.88 rows=2 width=194) (actual time=68.196..75.592 rows=436 loops=1) Hash Cond: (public.banners_links.id = users_banners_tot_sub.banner_id) -> Nested Loop Left Join (cost=1227.70..2260.38 rows=2 width=186) (actual time=61.870..68.654 rows=436 loops=1) -> Hash Left Join (cost=1227.70..2259.73 rows=2 width=116) (actual time=61.859..67.140 rows=436 loops=1) Hash Cond: (public.banners_links.id = banners_banner_types.banner_id) -> Hash Left Join (cost=103.40..946.54 rows=2 width=81) (actual time=6.099..6.944 rows=331 loops=1) Hash Cond: (public.banners_links.id = special_deals.id) Filter: (special_deals.special_deal IS NULL) -> Bitmap Heap Scan on banners_links (cost=6.86..816.67 rows=336 width=73) (actual time=0.105..0.451 rows=336 loops=1) Recheck Cond: (merchant_id = 217) -> Bitmap Index Scan on banners_links_merchant_id_idx (cost=0.00..6.77 rows=336 width=0) (actual time=0.073..0.073 rows=336 loops=1) Index Cond: (merchant_id = 217) -> Hash (cost=86.93..86.93 rows=769 width=16) (actual time=5.989..5.989 rows=780 loops=1) -> Subquery Scan special_deals (cost=69.62..86.93 rows=769 width=16) (actual time=4.225..5.445 rows=780 loops=1) -> HashAggregate (cost=69.62..79.24 rows=769 width=16) (actual time=4.223..4.742 rows=780 loops=1) -> Seq Scan on banner_deals (cost=0.00..53.75 rows=3175 width=16) (actual time=0.006..1.484 rows=3175 loops=1) -> Hash (cost=673.83..673.83 rows=21158 width=43) (actual time=55.750..55.750 rows=22112 loops=1) -> Hash Join (cost=2.45..673.83 rows=21158 width=43) (actual time=0.042..36.943 rows=22112 loops=1) Hash Cond: (banners_banner_types.type_id = banner_types.id) -> Seq Scan on banners_banner_types (cost=0.00..376.40 rows=22240 width=16) (actual time=0.005..10.791 rows=22240 loops=1) -> Hash (cost=2.20..2.20 rows=20 width=43) (actual time=0.032..0.032 rows=20 loops=1) -> Seq Scan on banner_types (cost=0.00..2.20 rows=20 width=43) (actual time=0.004..0.016 rows=20 loops=1) -> Index Scan using fetch_banners_banners_links_id_idx on fetch_banners (cost=0.00..0.32 rows=1 width=78) (actual time=0.002..0.002 rows=0 loops=436) Index Cond: (public.banners_links.id = public.fetch_banners.banners_links_id) -> Hash (cost=494.34..494.34 rows=11 width=24) (actual time=6.312..6.312 rows=336 loops=1) -> Subquery Scan users_banners_tot_sub (cost=494.09..494.34 rows=11 width=24) (actual time=5.519..6.056 rows=336 loops=1) -> HashAggregate (cost=494.09..494.23 rows=11 width=24) (actual time=5.518..5.747 rows=336 loops=1) -> Nested Loop (cost=360.46..494.01 rows=11 width=24) (actual time=2.814..5.166 rows=336 loops=1) -> Bitmap Heap Scan on users_banners (cost=360.46..402.65 rows=11 width=16) (actual time=2.801..3.079 rows=336 loops=1) Recheck Cond: ((affiliate_id = 217) AND ((status)::text = '3'::text)) -> BitmapAnd (cost=360.46..360.46 rows=11 width=0) (actual time=2.781..2.781 rows=0 loops=1) -> Bitmap Index Scan on users_banners_affiliate_id_idx (cost=0.00..5.31 rows=138 width=0) (actual time=0.088..0.088 rows=350 loops=1) Index Cond: (affiliate_id = 217) -> Bitmap Index Scan on users_banners_status_idx (cost=0.00..354.90 rows=19016 width=0) (actual time=2.673..2.673 rows=17406 loops=1) Index Cond: ((status)::text = '3'::text) -> Index Scan using users_banners_id_idx on users_banners_rotation (cost=0.00..8.29 rows=1 width=16) (actual time=0.004..0.004 rows=1 loops=336) Index Cond: (users_banners_rotation.users_banners_id = users_banners.id) -> Index Scan using banners_org_id_banner.idx on banners_org (cost=0.00..4.28 rows=1 width=387) (actual time=0.004..0.004 rows=1 loops=436) Index Cond: (public.banners_links.id = banners_org.id_banner) -> Index Scan using pre_calc_banners_id on precalculated_stats_banners_links (cost=0.00..9.87 rows=4 width=22) (actual time=0.004..0.008 rows=4 loops=436) Index Cond: (public.banners_links.id = precalculated_stats_banners_links.banners_links_id) -> Bitmap Heap Scan on reward_ratings (cost=4.27..15.33 rows=3 width=16) (actual time=0.004..0.004 rows=0 loops=1780) Recheck Cond: (public.banners_links.id = reward_ratings.banner_id) Filter: ((now() >= period_start) AND (now() <= period_end)) -> Bitmap Index Scan on reward_ratings_banner_id_idx (cost=0.00..4.27 rows=3 width=0) (actual time=0.003..0.003 rows=1 loops=1780) Index Cond: (public.banners_links.id = reward_ratings.banner_id) Total runtime: 111.046 ms I'm really having a hard time with this query. Every time when I change the smallest of things, the query time dramatically jumps up or down. I understand that some calculations just take time, but it seems to me that its not the particular things I actually do that makes the difference, but the fact that some actions or structure of the query 'just happen' to force a bad plan while others just happen to force a good plan. With my limited knowledge I absolutely see no connection between what actions influence what and why. Could someone shed some light on this issue? _________________________________________________________________ Play online games with your friends with Messenger http://www.join.msn.com/messenger/overview