Re: Redundant sub query triggers slow nested loop left join - Mailing list pgsql-performance
From | henk de wit |
---|---|
Subject | Re: Redundant sub query triggers slow nested loop left join |
Date | |
Msg-id | BAY106-F2367BB5AD345502ACBA1BCF54E0@phx.gbl Whole thread Raw |
In response to | Redundant sub query triggers slow nested loop left join ("henk de wit" <henk53602@hotmail.com>) |
List | pgsql-performance |
Perhaps one other interesting observation; when I earlier removed the status check for which the rows got so wrongly estimated, the query got dramatically faster. However, once I also remove all redundant checks the query gets slower again. This is the query with both status and redundant check removed: SELECT id, status, merchant_id, description, org_text, users_banners_id, banner_url, cookie_redirect, type, CASE WHEN special_deal IS null THEN '' ELSE 'special deal' END AS special_deal, CASE WHEN url_of_banner IS null THEN '' ELSE url_of_banner END AS url_of_banner, CASE WHEN period_end IS NULL THEN 'not_active' ELSE 'active' END AS active_not_active, CASE WHEN ecpc IS NULL THEN 0.00 ELSE ROUND(ecpc::numeric,2) END AS ecpc, CASE WHEN ecpc_merchant IS NULL THEN 0.00 ELSE ROUND(ecpc_merchant::numeric,2) END AS ecpc_merchant FROM /* SUBQUERY grand_total_fetch_banners */ ( /* SUBQUERY grand_total */( /* SUBQUERY banners_special_deals */ ( /* SUBQUERY banners */ ( SELECT * FROM /* SUBQUERY banners_links */ ( SELECT banners_links.id, merchant_id, banners_org.banner_text AS org_text, description, status, banner_url, ecpc, ecpc_merchant, COALESCE(cookie_redirect,0) AS cookie_redirect FROM /* SUBQUERY banners_links */ ( /* subselect tot join ecpc_per_banner_links on banners_links*/ /* SUBQUERY banners_links */ ( SELECT * FROM banners_links WHERE merchant_id = 217 ) AS banners_links LEFT OUTER JOIN /* SUBQUERY ecpc_per_banner_link */ ( SELECT CASE WHEN clicks_total > 0 THEN (revenue_total_affiliate/clicks_total)::float/1000.0 ELSE 0.0 END AS ecpc, CASE WHEN clicks_total > 0 THEN (revenue_total/clicks_total)::float/1000.0 ELSE 0.0 END AS ecpc_merchant, banners_links_id FROM precalculated_stats_banners_links WHERE status = 4 ) AS ecpc_per_banner_link ON (banners_links.id = ecpc_per_banner_link.banners_links_id) ) AS banners_links , banners_org WHERE banners_links.id = banners_org.id_banner AND (banners_links.id = -1 OR -1 = -1) ) AS banners_links LEFT OUTER JOIN /* SUBQUERY users_banners_tot_sub */( SELECT MAX (users_banners_id) AS users_banners_id, merchant_users_banners_id, banner_id FROM /* SUBQUERY users_banners_rotations_sub */( SELECT affiliate_id AS merchant_users_banners_id, users_banners.id AS users_banners_id, users_banners_rotation.banner_id FROM users_banners, users_banners_rotation WHERE users_banners_rotation.users_banners_id = users_banners.id AND users_banners.status = 3 ) AS users_banners_rotations_sub GROUP BY merchant_users_banners_id,banner_id ) AS users_banners_tot_sub ON ( banners_links.id = users_banners_tot_sub.banner_id AND banners_links.merchant_id = users_banners_tot_sub.merchant_users_banners_id ) ) AS banners LEFT OUTER JOIN /* SUBQUERY special_deals */( SELECT banner_deals.banner_id AS id, MAX(affiliate_id) AS special_deal FROM banner_deals GROUP BY banner_deals.banner_id ) AS special_deals USING (id) ) AS banners_special_deals LEFT OUTER JOIN /* SUBQUERY types */ ( SELECT banner_types.id AS type_id, banner_types.type AS type, banners_banner_types.banner_id AS id FROM banner_types,banners_banner_types WHERE banners_banner_types.type_id = banner_types.id ) AS types USING (id) ) as grand_total LEFT OUTER JOIN /* SUBQUERY fetch_banners */ ( SELECT banners_links_id AS id, url_of_banner FROM fetch_banners ) AS fetch_banners USING (id) ) AS grand_total_fetch_banners LEFT OUTER JOIN /* SUBQUERY active_banners */ ( SELECT banner_id AS id, period_end FROM reward_ratings WHERE now() BETWEEN period_start AND period_end ) AS active_banners USING (id) WHERE (type_id = -1 OR -1 = -1 ) AND (special_deal IS null) ORDER BY id DESC For this query, PG comes up with the following plan: Sort (cost=3772.80..3772.81 rows=2 width=597) (actual time=3203.143..3203.315 rows=436 loops=1) Sort Key: public.banners_links.id -> Nested Loop Left Join (cost=2345.33..3772.79 rows=2 width=597) (actual time=108.926..3201.931 rows=436 loops=1) -> Nested Loop Left Join (cost=2341.06..3742.03 rows=2 width=589) (actual time=108.902..3197.302 rows=436 loops=1) Join Filter: (public.banners_links.id = ecpc_per_banner_link.banners_links_id) -> Nested Loop (cost=1722.18..2763.47 rows=2 width=573) (actual time=68.228..78.611 rows=436 loops=1) -> Hash Left Join (cost=1722.18..2754.88 rows=2 width=194) (actual time=68.219..75.916 rows=436 loops=1) Hash Cond: (public.banners_links.id = users_banners_tot_sub.banner_id) -> Nested Loop Left Join (cost=1227.70..2260.38 rows=2 width=186) (actual time=61.822..68.891 rows=436 loops=1) -> Hash Left Join (cost=1227.70..2259.73 rows=2 width=116) (actual time=61.811..67.321 rows=436 loops=1) Hash Cond: (public.banners_links.id = banners_banner_types.banner_id) -> Hash Left Join (cost=103.40..946.54 rows=2 width=81) (actual time=6.135..7.009 rows=331 loops=1) Hash Cond: (public.banners_links.id = special_deals.id) Filter: (special_deals.special_deal IS NULL) -> Bitmap Heap Scan on banners_links (cost=6.86..816.67 rows=336 width=73) (actual time=0.111..0.496 rows=336 loops=1) Recheck Cond: (merchant_id = 217) -> Bitmap Index Scan on banners_links_merchant_id_idx (cost=0.00..6.77 rows=336 width=0) (actual time=0.079..0.079 rows=336 loops=1) Index Cond: (merchant_id = 217) -> Hash (cost=86.93..86.93 rows=769 width=16) (actual time=6.012..6.012 rows=780 loops=1) -> Subquery Scan special_deals (cost=69.62..86.93 rows=769 width=16) (actual time=4.240..5.451 rows=780 loops=1) -> HashAggregate (cost=69.62..79.24 rows=769 width=16) (actual time=4.239..4.748 rows=780 loops=1) -> Seq Scan on banner_deals (cost=0.00..53.75 rows=3175 width=16) (actual time=0.006..1.485 rows=3175 loops=1) -> Hash (cost=673.83..673.83 rows=21158 width=43) (actual time=55.659..55.659 rows=22112 loops=1) -> Hash Join (cost=2.45..673.83 rows=21158 width=43) (actual time=0.047..36.885 rows=22112 loops=1) Hash Cond: (banners_banner_types.type_id = banner_types.id) -> Seq Scan on banners_banner_types (cost=0.00..376.40 rows=22240 width=16) (actual time=0.005..10.653 rows=22240 loops=1) -> Hash (cost=2.20..2.20 rows=20 width=43) (actual time=0.034..0.034 rows=20 loops=1) -> Seq Scan on banner_types (cost=0.00..2.20 rows=20 width=43) (actual time=0.003..0.016 rows=20 loops=1) -> Index Scan using fetch_banners_banners_links_id_idx on fetch_banners (cost=0.00..0.32 rows=1 width=78) (actual time=0.002..0.002 rows=0 loops=436) Index Cond: (public.banners_links.id = public.fetch_banners.banners_links_id) -> Hash (cost=494.34..494.34 rows=11 width=24) (actual time=6.378..6.378 rows=336 loops=1) -> Subquery Scan users_banners_tot_sub (cost=494.09..494.34 rows=11 width=24) (actual time=5.588..6.124 rows=336 loops=1) -> HashAggregate (cost=494.09..494.23 rows=11 width=24) (actual time=5.586..5.810 rows=336 loops=1) -> Nested Loop (cost=360.46..494.01 rows=11 width=24) (actual time=2.876..5.232 rows=336 loops=1) -> Bitmap Heap Scan on users_banners (cost=360.46..402.65 rows=11 width=16) (actual time=2.863..3.133 rows=336 loops=1) Recheck Cond: ((affiliate_id = 217) AND ((status)::text = '3'::text)) -> BitmapAnd (cost=360.46..360.46 rows=11 width=0) (actual time=2.842..2.842 rows=0 loops=1) -> Bitmap Index Scan on users_banners_affiliate_id_idx (cost=0.00..5.31 rows=138 width=0) (actual time=0.072..0.072 rows=350 loops=1) Index Cond: (affiliate_id = 217) -> Bitmap Index Scan on users_banners_status_idx (cost=0.00..354.90 rows=19016 width=0) (actual time=2.741..2.741 rows=17406 loops=1) Index Cond: ((status)::text = '3'::text) -> Index Scan using users_banners_id_idx on users_banners_rotation (cost=0.00..8.29 rows=1 width=16) (actual time=0.004..0.004 rows=1 loops=336) Index Cond: (users_banners_rotation.users_banners_id = users_banners.id) -> Index Scan using banners_org_id_banner.idx on banners_org (cost=0.00..4.28 rows=1 width=387) (actual time=0.003..0.004 rows=1 loops=436) Index Cond: (public.banners_links.id = banners_org.id_banner) -> Materialize (cost=618.88..698.81 rows=7993 width=20) (actual time=0.000..3.299 rows=7923 loops=436) -> Index Scan using pre_calc_banners_status on precalculated_stats_banners_links (cost=0.00..530.96 rows=7993 width=30) (actual time=0.025..26.349 rows=7923 loops=1) Index Cond: (status = 4) -> Bitmap Heap Scan on reward_ratings (cost=4.27..15.33 rows=3 width=16) (actual time=0.005..0.005 rows=0 loops=436) Recheck Cond: (public.banners_links.id = reward_ratings.banner_id) Filter: ((now() >= period_start) AND (now() <= period_end)) -> Bitmap Index Scan on reward_ratings_banner_id_idx (cost=0.00..4.27 rows=3 width=0) (actual time=0.003..0.003 rows=0 loops=436) Index Cond: (public.banners_links.id = reward_ratings.banner_id) Total runtime: 3204.016 ms For the "banners_links.id = ecpc_per_banner_link.banners_links_id" join, it chooses the dreaded Nested loop left join again, which takes up the bulk of the query execution time. After some fiddling and experimentation with the query, I found that if I only removed both case statements in the "ecpc_per_banner_link" subquery, it becomes fast again: Sort (cost=2875.63..2875.64 rows=6 width=599) (actual time=107.824..109.456 rows=1780 loops=1) Sort Key: public.banners_links.id -> Nested Loop Left Join (cost=1726.45..2875.55 rows=6 width=599) (actual time=68.243..98.013 rows=1780 loops=1) -> Nested Loop Left Join (cost=1722.18..2783.30 rows=6 width=591) (actual time=68.220..84.351 rows=1780 loops=1) -> Nested Loop (cost=1722.18..2763.47 rows=2 width=573) (actual time=68.210..78.427 rows=436 loops=1) -> Hash Left Join (cost=1722.18..2754.88 rows=2 width=194) (actual time=68.196..75.592 rows=436 loops=1) Hash Cond: (public.banners_links.id = users_banners_tot_sub.banner_id) -> Nested Loop Left Join (cost=1227.70..2260.38 rows=2 width=186) (actual time=61.870..68.654 rows=436 loops=1) -> Hash Left Join (cost=1227.70..2259.73 rows=2 width=116) (actual time=61.859..67.140 rows=436 loops=1) Hash Cond: (public.banners_links.id = banners_banner_types.banner_id) -> Hash Left Join (cost=103.40..946.54 rows=2 width=81) (actual time=6.099..6.944 rows=331 loops=1) Hash Cond: (public.banners_links.id = special_deals.id) Filter: (special_deals.special_deal IS NULL) -> Bitmap Heap Scan on banners_links (cost=6.86..816.67 rows=336 width=73) (actual time=0.105..0.451 rows=336 loops=1) Recheck Cond: (merchant_id = 217) -> Bitmap Index Scan on banners_links_merchant_id_idx (cost=0.00..6.77 rows=336 width=0) (actual time=0.073..0.073 rows=336 loops=1) Index Cond: (merchant_id = 217) -> Hash (cost=86.93..86.93 rows=769 width=16) (actual time=5.989..5.989 rows=780 loops=1) -> Subquery Scan special_deals (cost=69.62..86.93 rows=769 width=16) (actual time=4.225..5.445 rows=780 loops=1) -> HashAggregate (cost=69.62..79.24 rows=769 width=16) (actual time=4.223..4.742 rows=780 loops=1) -> Seq Scan on banner_deals (cost=0.00..53.75 rows=3175 width=16) (actual time=0.006..1.484 rows=3175 loops=1) -> Hash (cost=673.83..673.83 rows=21158 width=43) (actual time=55.750..55.750 rows=22112 loops=1) -> Hash Join (cost=2.45..673.83 rows=21158 width=43) (actual time=0.042..36.943 rows=22112 loops=1) Hash Cond: (banners_banner_types.type_id = banner_types.id) -> Seq Scan on banners_banner_types (cost=0.00..376.40 rows=22240 width=16) (actual time=0.005..10.791 rows=22240 loops=1) -> Hash (cost=2.20..2.20 rows=20 width=43) (actual time=0.032..0.032 rows=20 loops=1) -> Seq Scan on banner_types (cost=0.00..2.20 rows=20 width=43) (actual time=0.004..0.016 rows=20 loops=1) -> Index Scan using fetch_banners_banners_links_id_idx on fetch_banners (cost=0.00..0.32 rows=1 width=78) (actual time=0.002..0.002 rows=0 loops=436) Index Cond: (public.banners_links.id = public.fetch_banners.banners_links_id) -> Hash (cost=494.34..494.34 rows=11 width=24) (actual time=6.312..6.312 rows=336 loops=1) -> Subquery Scan users_banners_tot_sub (cost=494.09..494.34 rows=11 width=24) (actual time=5.519..6.056 rows=336 loops=1) -> HashAggregate (cost=494.09..494.23 rows=11 width=24) (actual time=5.518..5.747 rows=336 loops=1) -> Nested Loop (cost=360.46..494.01 rows=11 width=24) (actual time=2.814..5.166 rows=336 loops=1) -> Bitmap Heap Scan on users_banners (cost=360.46..402.65 rows=11 width=16) (actual time=2.801..3.079 rows=336 loops=1) Recheck Cond: ((affiliate_id = 217) AND ((status)::text = '3'::text)) -> BitmapAnd (cost=360.46..360.46 rows=11 width=0) (actual time=2.781..2.781 rows=0 loops=1) -> Bitmap Index Scan on users_banners_affiliate_id_idx (cost=0.00..5.31 rows=138 width=0) (actual time=0.088..0.088 rows=350 loops=1) Index Cond: (affiliate_id = 217) -> Bitmap Index Scan on users_banners_status_idx (cost=0.00..354.90 rows=19016 width=0) (actual time=2.673..2.673 rows=17406 loops=1) Index Cond: ((status)::text = '3'::text) -> Index Scan using users_banners_id_idx on users_banners_rotation (cost=0.00..8.29 rows=1 width=16) (actual time=0.004..0.004 rows=1 loops=336) Index Cond: (users_banners_rotation.users_banners_id = users_banners.id) -> Index Scan using banners_org_id_banner.idx on banners_org (cost=0.00..4.28 rows=1 width=387) (actual time=0.004..0.004 rows=1 loops=436) Index Cond: (public.banners_links.id = banners_org.id_banner) -> Index Scan using pre_calc_banners_id on precalculated_stats_banners_links (cost=0.00..9.87 rows=4 width=22) (actual time=0.004..0.008 rows=4 loops=436) Index Cond: (public.banners_links.id = precalculated_stats_banners_links.banners_links_id) -> Bitmap Heap Scan on reward_ratings (cost=4.27..15.33 rows=3 width=16) (actual time=0.004..0.004 rows=0 loops=1780) Recheck Cond: (public.banners_links.id = reward_ratings.banner_id) Filter: ((now() >= period_start) AND (now() <= period_end)) -> Bitmap Index Scan on reward_ratings_banner_id_idx (cost=0.00..4.27 rows=3 width=0) (actual time=0.003..0.003 rows=1 loops=1780) Index Cond: (public.banners_links.id = reward_ratings.banner_id) Total runtime: 111.046 ms I'm really having a hard time with this query. Every time when I change the smallest of things, the query time dramatically jumps up or down. I understand that some calculations just take time, but it seems to me that its not the particular things I actually do that makes the difference, but the fact that some actions or structure of the query 'just happen' to force a bad plan while others just happen to force a good plan. With my limited knowledge I absolutely see no connection between what actions influence what and why. Could someone shed some light on this issue? _________________________________________________________________ Play online games with your friends with Messenger http://www.join.msn.com/messenger/overview
pgsql-performance by date: