Re: Redundant sub query triggers slow nested loop left join - Mailing list pgsql-performance

From henk de wit
Subject Re: Redundant sub query triggers slow nested loop left join
Date
Msg-id BAY106-F2367BB5AD345502ACBA1BCF54E0@phx.gbl
Whole thread Raw
In response to Redundant sub query triggers slow nested loop left join  ("henk de wit" <henk53602@hotmail.com>)
List pgsql-performance
Perhaps one other interesting observation; when I earlier removed the status
check for which the rows got so wrongly estimated, the query got
dramatically faster. However, once I also remove all redundant checks the
query gets slower again.

This is the query with both status and redundant check removed:

SELECT
    id,
    status,
    merchant_id,
    description,
    org_text,
    users_banners_id,
    banner_url,
    cookie_redirect,
    type,

    CASE WHEN special_deal IS null THEN
        ''
    ELSE
        'special deal'
    END AS special_deal,

    CASE WHEN url_of_banner IS null    THEN
        ''
    ELSE
        url_of_banner
    END AS url_of_banner,

    CASE WHEN period_end IS NULL THEN
        'not_active'
    ELSE
        'active'
    END AS active_not_active,

    CASE WHEN ecpc IS NULL THEN
        0.00
    ELSE
        ROUND(ecpc::numeric,2)
    END AS ecpc,

    CASE WHEN ecpc_merchant IS NULL THEN
        0.00
    ELSE
        ROUND(ecpc_merchant::numeric,2)
    END AS ecpc_merchant

FROM
    /* SUBQUERY grand_total_fetch_banners */ (
        /* SUBQUERY grand_total  */(
            /* SUBQUERY banners_special_deals */    (

                /* SUBQUERY banners */ (
                    SELECT
                        *
                    FROM
                        /* SUBQUERY banners_links */ (
                            SELECT
                                banners_links.id,
                                merchant_id,
                                banners_org.banner_text AS org_text,
                                description,
                                status,
                                banner_url,
                                ecpc,
                                ecpc_merchant,
                                COALESCE(cookie_redirect,0) AS cookie_redirect
                            FROM
                                /* SUBQUERY banners_links  */ (

                                    /* subselect tot join ecpc_per_banner_links on banners_links*/
                                    /* SUBQUERY banners_links */ (
                                        SELECT
                                            *
                                        FROM
                                            banners_links
                                        WHERE
                                            merchant_id = 217
                                    ) AS  banners_links

                                        LEFT OUTER JOIN

                                    /* SUBQUERY ecpc_per_banner_link */    (
                                        SELECT
                                            CASE WHEN clicks_total > 0 THEN
                                                (revenue_total_affiliate/clicks_total)::float/1000.0
                                            ELSE
                                                0.0
                                            END AS ecpc,
                                            CASE WHEN clicks_total > 0 THEN
                                                (revenue_total/clicks_total)::float/1000.0
                                            ELSE
                                                0.0
                                            END AS ecpc_merchant,

                                            banners_links_id
                                        FROM
                                            precalculated_stats_banners_links
                                        WHERE
                                            status = 4
                                    ) AS ecpc_per_banner_link

                                        ON (banners_links.id = ecpc_per_banner_link.banners_links_id)
                                ) AS banners_links

                                    ,

                                banners_org

                            WHERE
                                banners_links.id = banners_org.id_banner            AND
                                (banners_links.id = -1 OR -1 = -1)
                        ) AS banners_links

                            LEFT OUTER JOIN

                        /* SUBQUERY users_banners_tot_sub */(
                            SELECT
                                MAX (users_banners_id) AS users_banners_id,
                                merchant_users_banners_id,
                                banner_id
                            FROM
                                /* SUBQUERY users_banners_rotations_sub */(
                                    SELECT
                                        affiliate_id         AS merchant_users_banners_id,
                                        users_banners.id     AS users_banners_id,
                                        users_banners_rotation.banner_id
                                    FROM
                                        users_banners, users_banners_rotation
                                    WHERE
                                        users_banners_rotation.users_banners_id = users_banners.id    AND
                                        users_banners.status = 3
                                ) AS users_banners_rotations_sub
                            GROUP BY
                                merchant_users_banners_id,banner_id
                        ) AS users_banners_tot_sub

                            ON (
                                banners_links.id = users_banners_tot_sub.banner_id     AND
                                banners_links.merchant_id =
users_banners_tot_sub.merchant_users_banners_id
                            )
                    ) AS banners

                        LEFT OUTER JOIN

                    /* SUBQUERY special_deals */(
                        SELECT
                            banner_deals.banner_id     AS id,
                            MAX(affiliate_id)         AS special_deal
                        FROM
                            banner_deals
                        GROUP BY
                            banner_deals.banner_id
                    ) AS special_deals

                        USING (id)

            ) AS banners_special_deals

                LEFT OUTER JOIN

            /* SUBQUERY types */ (
                SELECT
                    banner_types.id                 AS type_id,
                    banner_types.type                 AS type,
                    banners_banner_types.banner_id     AS id
                FROM
                    banner_types,banners_banner_types
                WHERE

                    banners_banner_types.type_id = banner_types.id
            ) AS types

                USING (id)

        )  as grand_total

            LEFT OUTER JOIN

        /* SUBQUERY fetch_banners */ (
            SELECT
                banners_links_id AS id,
                url_of_banner
            FROM
                fetch_banners
        ) AS fetch_banners

            USING (id)
    ) AS grand_total_fetch_banners

        LEFT OUTER JOIN

    /* SUBQUERY active_banners */ (
        SELECT
            banner_id AS id,
            period_end
        FROM
            reward_ratings
        WHERE
            now() BETWEEN period_start AND period_end

    ) AS active_banners

        USING (id)
WHERE
    (type_id =  -1 OR -1 = -1 )    AND
    (special_deal IS null)

ORDER BY
    id DESC

For this query, PG comes up with the following plan:

Sort  (cost=3772.80..3772.81 rows=2 width=597) (actual
time=3203.143..3203.315 rows=436 loops=1)
  Sort Key: public.banners_links.id
  ->  Nested Loop Left Join  (cost=2345.33..3772.79 rows=2 width=597)
(actual time=108.926..3201.931 rows=436 loops=1)
        ->  Nested Loop Left Join  (cost=2341.06..3742.03 rows=2 width=589)
(actual time=108.902..3197.302 rows=436 loops=1)
              Join Filter: (public.banners_links.id =
ecpc_per_banner_link.banners_links_id)
              ->  Nested Loop  (cost=1722.18..2763.47 rows=2 width=573)
(actual time=68.228..78.611 rows=436 loops=1)
                    ->  Hash Left Join  (cost=1722.18..2754.88 rows=2
width=194) (actual time=68.219..75.916 rows=436 loops=1)
                          Hash Cond: (public.banners_links.id =
users_banners_tot_sub.banner_id)
                          ->  Nested Loop Left Join  (cost=1227.70..2260.38
rows=2 width=186) (actual time=61.822..68.891 rows=436 loops=1)
                                ->  Hash Left Join  (cost=1227.70..2259.73
rows=2 width=116) (actual time=61.811..67.321 rows=436 loops=1)
                                      Hash Cond: (public.banners_links.id =
banners_banner_types.banner_id)
                                      ->  Hash Left Join
(cost=103.40..946.54 rows=2 width=81) (actual time=6.135..7.009 rows=331
loops=1)
                                            Hash Cond:
(public.banners_links.id = special_deals.id)
                                            Filter:
(special_deals.special_deal IS NULL)
                                            ->  Bitmap Heap Scan on
banners_links  (cost=6.86..816.67 rows=336 width=73) (actual
time=0.111..0.496 rows=336 loops=1)
                                                  Recheck Cond: (merchant_id
= 217)
                                                  ->  Bitmap Index Scan on
banners_links_merchant_id_idx  (cost=0.00..6.77 rows=336 width=0) (actual
time=0.079..0.079 rows=336 loops=1)
                                                        Index Cond:
(merchant_id = 217)
                                            ->  Hash  (cost=86.93..86.93
rows=769 width=16) (actual time=6.012..6.012 rows=780 loops=1)
                                                  ->  Subquery Scan
special_deals  (cost=69.62..86.93 rows=769 width=16) (actual
time=4.240..5.451 rows=780 loops=1)
                                                        ->  HashAggregate
(cost=69.62..79.24 rows=769 width=16) (actual time=4.239..4.748 rows=780
loops=1)
                                                              ->  Seq Scan
on banner_deals  (cost=0.00..53.75 rows=3175 width=16) (actual
time=0.006..1.485 rows=3175 loops=1)
                                      ->  Hash  (cost=673.83..673.83
rows=21158 width=43) (actual time=55.659..55.659 rows=22112 loops=1)
                                            ->  Hash Join
(cost=2.45..673.83 rows=21158 width=43) (actual time=0.047..36.885
rows=22112 loops=1)
                                                  Hash Cond:
(banners_banner_types.type_id = banner_types.id)
                                                  ->  Seq Scan on
banners_banner_types  (cost=0.00..376.40 rows=22240 width=16) (actual
time=0.005..10.653 rows=22240 loops=1)
                                                  ->  Hash  (cost=2.20..2.20
rows=20 width=43) (actual time=0.034..0.034 rows=20 loops=1)
                                                        ->  Seq Scan on
banner_types  (cost=0.00..2.20 rows=20 width=43) (actual time=0.003..0.016
rows=20 loops=1)
                                ->  Index Scan using
fetch_banners_banners_links_id_idx on fetch_banners  (cost=0.00..0.32 rows=1
width=78) (actual time=0.002..0.002 rows=0 loops=436)
                                      Index Cond: (public.banners_links.id =
public.fetch_banners.banners_links_id)
                          ->  Hash  (cost=494.34..494.34 rows=11 width=24)
(actual time=6.378..6.378 rows=336 loops=1)
                                ->  Subquery Scan users_banners_tot_sub
(cost=494.09..494.34 rows=11 width=24) (actual time=5.588..6.124 rows=336
loops=1)
                                      ->  HashAggregate
(cost=494.09..494.23 rows=11 width=24) (actual time=5.586..5.810 rows=336
loops=1)
                                            ->  Nested Loop
(cost=360.46..494.01 rows=11 width=24) (actual time=2.876..5.232 rows=336
loops=1)
                                                  ->  Bitmap Heap Scan on
users_banners  (cost=360.46..402.65 rows=11 width=16) (actual
time=2.863..3.133 rows=336 loops=1)
                                                        Recheck Cond:
((affiliate_id = 217) AND ((status)::text = '3'::text))
                                                        ->  BitmapAnd
(cost=360.46..360.46 rows=11 width=0) (actual time=2.842..2.842 rows=0
loops=1)
                                                              ->  Bitmap
Index Scan on users_banners_affiliate_id_idx  (cost=0.00..5.31 rows=138
width=0) (actual time=0.072..0.072 rows=350 loops=1)
                                                                    Index
Cond: (affiliate_id = 217)
                                                              ->  Bitmap
Index Scan on users_banners_status_idx  (cost=0.00..354.90 rows=19016
width=0) (actual time=2.741..2.741 rows=17406 loops=1)
                                                                    Index
Cond: ((status)::text = '3'::text)
                                                  ->  Index Scan using
users_banners_id_idx on users_banners_rotation  (cost=0.00..8.29 rows=1
width=16) (actual time=0.004..0.004 rows=1 loops=336)
                                                        Index Cond:
(users_banners_rotation.users_banners_id = users_banners.id)
                    ->  Index Scan using banners_org_id_banner.idx on
banners_org  (cost=0.00..4.28 rows=1 width=387) (actual time=0.003..0.004
rows=1 loops=436)
                          Index Cond: (public.banners_links.id =
banners_org.id_banner)
              ->  Materialize  (cost=618.88..698.81 rows=7993 width=20)
(actual time=0.000..3.299 rows=7923 loops=436)
                    ->  Index Scan using pre_calc_banners_status on
precalculated_stats_banners_links  (cost=0.00..530.96 rows=7993 width=30)
(actual time=0.025..26.349 rows=7923 loops=1)
                          Index Cond: (status = 4)
        ->  Bitmap Heap Scan on reward_ratings  (cost=4.27..15.33 rows=3
width=16) (actual time=0.005..0.005 rows=0 loops=436)
              Recheck Cond: (public.banners_links.id =
reward_ratings.banner_id)
              Filter: ((now() >= period_start) AND (now() <= period_end))
              ->  Bitmap Index Scan on reward_ratings_banner_id_idx
(cost=0.00..4.27 rows=3 width=0) (actual time=0.003..0.003 rows=0 loops=436)
                    Index Cond: (public.banners_links.id =
reward_ratings.banner_id)
Total runtime: 3204.016 ms


For the "banners_links.id = ecpc_per_banner_link.banners_links_id" join, it
chooses the dreaded Nested loop left join again, which takes up the bulk of
the query execution time.

After some fiddling and experimentation with the query, I found that if I
only removed both case statements in the "ecpc_per_banner_link" subquery, it
becomes fast again:

Sort  (cost=2875.63..2875.64 rows=6 width=599) (actual time=107.824..109.456
rows=1780 loops=1)
  Sort Key: public.banners_links.id
  ->  Nested Loop Left Join  (cost=1726.45..2875.55 rows=6 width=599)
(actual time=68.243..98.013 rows=1780 loops=1)
        ->  Nested Loop Left Join  (cost=1722.18..2783.30 rows=6 width=591)
(actual time=68.220..84.351 rows=1780 loops=1)
              ->  Nested Loop  (cost=1722.18..2763.47 rows=2 width=573)
(actual time=68.210..78.427 rows=436 loops=1)
                    ->  Hash Left Join  (cost=1722.18..2754.88 rows=2
width=194) (actual time=68.196..75.592 rows=436 loops=1)
                          Hash Cond: (public.banners_links.id =
users_banners_tot_sub.banner_id)
                          ->  Nested Loop Left Join  (cost=1227.70..2260.38
rows=2 width=186) (actual time=61.870..68.654 rows=436 loops=1)
                                ->  Hash Left Join  (cost=1227.70..2259.73
rows=2 width=116) (actual time=61.859..67.140 rows=436 loops=1)
                                      Hash Cond: (public.banners_links.id =
banners_banner_types.banner_id)
                                      ->  Hash Left Join
(cost=103.40..946.54 rows=2 width=81) (actual time=6.099..6.944 rows=331
loops=1)
                                            Hash Cond:
(public.banners_links.id = special_deals.id)
                                            Filter:
(special_deals.special_deal IS NULL)
                                            ->  Bitmap Heap Scan on
banners_links  (cost=6.86..816.67 rows=336 width=73) (actual
time=0.105..0.451 rows=336 loops=1)
                                                  Recheck Cond: (merchant_id
= 217)
                                                  ->  Bitmap Index Scan on
banners_links_merchant_id_idx  (cost=0.00..6.77 rows=336 width=0) (actual
time=0.073..0.073 rows=336 loops=1)
                                                        Index Cond:
(merchant_id = 217)
                                            ->  Hash  (cost=86.93..86.93
rows=769 width=16) (actual time=5.989..5.989 rows=780 loops=1)
                                                  ->  Subquery Scan
special_deals  (cost=69.62..86.93 rows=769 width=16) (actual
time=4.225..5.445 rows=780 loops=1)
                                                        ->  HashAggregate
(cost=69.62..79.24 rows=769 width=16) (actual time=4.223..4.742 rows=780
loops=1)
                                                              ->  Seq Scan
on banner_deals  (cost=0.00..53.75 rows=3175 width=16) (actual
time=0.006..1.484 rows=3175 loops=1)
                                      ->  Hash  (cost=673.83..673.83
rows=21158 width=43) (actual time=55.750..55.750 rows=22112 loops=1)
                                            ->  Hash Join
(cost=2.45..673.83 rows=21158 width=43) (actual time=0.042..36.943
rows=22112 loops=1)
                                                  Hash Cond:
(banners_banner_types.type_id = banner_types.id)
                                                  ->  Seq Scan on
banners_banner_types  (cost=0.00..376.40 rows=22240 width=16) (actual
time=0.005..10.791 rows=22240 loops=1)
                                                  ->  Hash  (cost=2.20..2.20
rows=20 width=43) (actual time=0.032..0.032 rows=20 loops=1)
                                                        ->  Seq Scan on
banner_types  (cost=0.00..2.20 rows=20 width=43) (actual time=0.004..0.016
rows=20 loops=1)
                                ->  Index Scan using
fetch_banners_banners_links_id_idx on fetch_banners  (cost=0.00..0.32 rows=1
width=78) (actual time=0.002..0.002 rows=0 loops=436)
                                      Index Cond: (public.banners_links.id =
public.fetch_banners.banners_links_id)
                          ->  Hash  (cost=494.34..494.34 rows=11 width=24)
(actual time=6.312..6.312 rows=336 loops=1)
                                ->  Subquery Scan users_banners_tot_sub
(cost=494.09..494.34 rows=11 width=24) (actual time=5.519..6.056 rows=336
loops=1)
                                      ->  HashAggregate
(cost=494.09..494.23 rows=11 width=24) (actual time=5.518..5.747 rows=336
loops=1)
                                            ->  Nested Loop
(cost=360.46..494.01 rows=11 width=24) (actual time=2.814..5.166 rows=336
loops=1)
                                                  ->  Bitmap Heap Scan on
users_banners  (cost=360.46..402.65 rows=11 width=16) (actual
time=2.801..3.079 rows=336 loops=1)
                                                        Recheck Cond:
((affiliate_id = 217) AND ((status)::text = '3'::text))
                                                        ->  BitmapAnd
(cost=360.46..360.46 rows=11 width=0) (actual time=2.781..2.781 rows=0
loops=1)
                                                              ->  Bitmap
Index Scan on users_banners_affiliate_id_idx  (cost=0.00..5.31 rows=138
width=0) (actual time=0.088..0.088 rows=350 loops=1)
                                                                    Index
Cond: (affiliate_id = 217)
                                                              ->  Bitmap
Index Scan on users_banners_status_idx  (cost=0.00..354.90 rows=19016
width=0) (actual time=2.673..2.673 rows=17406 loops=1)
                                                                    Index
Cond: ((status)::text = '3'::text)
                                                  ->  Index Scan using
users_banners_id_idx on users_banners_rotation  (cost=0.00..8.29 rows=1
width=16) (actual time=0.004..0.004 rows=1 loops=336)
                                                        Index Cond:
(users_banners_rotation.users_banners_id = users_banners.id)
                    ->  Index Scan using banners_org_id_banner.idx on
banners_org  (cost=0.00..4.28 rows=1 width=387) (actual time=0.004..0.004
rows=1 loops=436)
                          Index Cond: (public.banners_links.id =
banners_org.id_banner)
              ->  Index Scan using pre_calc_banners_id on
precalculated_stats_banners_links  (cost=0.00..9.87 rows=4 width=22) (actual
time=0.004..0.008 rows=4 loops=436)
                    Index Cond: (public.banners_links.id =
precalculated_stats_banners_links.banners_links_id)
        ->  Bitmap Heap Scan on reward_ratings  (cost=4.27..15.33 rows=3
width=16) (actual time=0.004..0.004 rows=0 loops=1780)
              Recheck Cond: (public.banners_links.id =
reward_ratings.banner_id)
              Filter: ((now() >= period_start) AND (now() <= period_end))
              ->  Bitmap Index Scan on reward_ratings_banner_id_idx
(cost=0.00..4.27 rows=3 width=0) (actual time=0.003..0.003 rows=1
loops=1780)
                    Index Cond: (public.banners_links.id =
reward_ratings.banner_id)
Total runtime: 111.046 ms

I'm really having a hard time with this query. Every time when I change the
smallest of things, the query time dramatically jumps up or down.

I understand that some calculations just take time, but it seems to me that
its not the particular things I actually do that makes the difference, but
the fact that some actions or structure of the query 'just happen' to force
a bad plan while others just happen to force a good plan. With my limited
knowledge I absolutely see no connection between what actions influence what
and why.

Could someone shed some light on this issue?

_________________________________________________________________
Play online games with your friends with Messenger
http://www.join.msn.com/messenger/overview


pgsql-performance by date:

Previous
From: "Mauro N. Infantino"
Date:
Subject: Re: Very specific server situation
Next
From: Greg Smith
Date:
Subject: Re: Feature Request --- was: PostgreSQL Performance Tuning