Optimizing a VIEW - Mailing list pgsql-performance

From Madison Kelly
Subject Optimizing a VIEW
Date
Msg-id 48A5CCA4.2040206@alteeve.com
Whole thread Raw
Responses Re: Optimizing a VIEW  (Decibel! <decibel@decibel.org>)
Re: Optimizing a VIEW  ("Rodrigo E. De León Plicet" <rdeleonp@gmail.com>)
Re: Optimizing a VIEW  (Matthew Wakeling <matthew@flymine.org>)
List pgsql-performance
Hi all,

   I've got a simple table with a lot of data in it:

CREATE TABLE customer_data (
    cd_id            int        primary key    default(nextval('cd_seq')),
    cd_cust_id        int        not null,
    cd_variable        text        not null,
    cd_value        text,
    cd_tag            text,
    added_user        int        not null,
    added_date        timestamp    not null    default now(),
    modified_user        int        not null,
    modified_date        timestamp    not null    default now(),

    FOREIGN KEY(cd_cust_id) REFERENCES customer(cust_id)
);

   The 'cust_id' references the customer that the given data belongs to.
The reason for this "data bucket" (does this structure have a proper
name?) is that the data I need to store on a give customer is quite
variable and outside of my control. As it is, there is about 400
different variable/value pairs I need to store per customer.

   This table has a copy in a second historical schema that matches this
one in public but with an additional 'history_id' sequence. I use a
simple function to copy an INSERT or UPDATE to any entry in the
historical schema.

   Now I want to graph a certain subset of these variable/value pairs,
so I created a simple (in concept) view to pull out the historical data
set for a given customer. I do this by pulling up a set of records based
on the name of the 'cd_variable' and 'cd_tag' and connect the records
together using a matching timestamp.

   The problem is that this view has very quickly become terribly slow.

   I've got indexes on the 'cd_variable', 'cd_tag' and the parent
'cust_id' columns, and the plan seems to show that the indexes are
indeed being used, but the query against this view can take up to 10
minutes to respond. I am hoping to avoid making a dedicated table as
what I use to build this dataset may change over time.

   Below I will post the VIEW and a sample of the query's EXPLAIN
ANALYZE. Thanks for any tips/help/clue-stick-beating you may be able to
share!

Madi

-=] VIEW

CREATE VIEW view_sync_rate_history AS
     SELECT
         a.cust_id       AS    vsrh_cust_id,
         a.cust_name     AS    vsrh_cust_name,
         a.cust_business AS    vsrh_cust_business,
         a.cust_nexxia_id||'-'||a.cust_nexxia_seq    AS    vsrh_cust_nexxia,
         a.cust_phone    AS    vsrh_cust_phone,
         b.cd_value      AS    vsrh_up_speed,
         b.history_id    AS    vsrh_up_speed_history_id,
         c.cd_value      AS    vsrh_up_rco,
         c.history_id    AS    vsrh_up_rco_history_id,
         d.cd_value      AS    vsrh_up_nm,
         d.history_id    AS    vsrh_up_nm_history_id,
         e.cd_value      AS    vsrh_up_sp,
         e.history_id    AS    vsrh_up_sp_history_id,
         f.cd_value      AS    vsrh_up_atten,
         f.history_id    AS    vsrh_up_atten_history_id,
         g.cd_value      AS    vsrh_down_speed,
         g.history_id    AS    vsrh_down_speed_history_id,
         h.cd_value      AS    vsrh_down_rco,
         h.history_id    AS    vsrh_down_rco_history_id,
         i.cd_value      AS    vsrh_down_nm,
         i.history_id    AS    vsrh_down_nm_history_id,
         j.cd_value      AS    vsrh_down_sp,
         j.history_id    AS    vsrh_down_sp_history_id,
         k.cd_value      AS    vsrh_down_atten,
         k.history_id    AS    vsrh_down_atten_history_id,
         l.cd_value      AS    vsrh_updated,
         l.history_id    AS    vsrh_updated_history_id
     FROM
         customer a,
         history.customer_data b,
         history.customer_data c,
         history.customer_data d,
         history.customer_data e,
         history.customer_data f,
         history.customer_data g,
         history.customer_data h,
         history.customer_data i,
         history.customer_data j,
         history.customer_data k,
         history.customer_data l
     WHERE
         a.cust_id=b.cd_cust_id AND
         a.cust_id=c.cd_cust_id AND
         a.cust_id=d.cd_cust_id AND
         a.cust_id=e.cd_cust_id AND
         a.cust_id=f.cd_cust_id AND
         a.cust_id=g.cd_cust_id AND
         a.cust_id=h.cd_cust_id AND
         a.cust_id=i.cd_cust_id AND
         a.cust_id=j.cd_cust_id AND
         a.cust_id=k.cd_cust_id AND
         a.cust_id=l.cd_cust_id AND
         b.cd_tag='sync_rate' AND
         c.cd_tag='sync_rate' AND
         d.cd_tag='sync_rate' AND
         e.cd_tag='sync_rate' AND
         f.cd_tag='sync_rate' AND
         g.cd_tag='sync_rate' AND
         h.cd_tag='sync_rate' AND
         i.cd_tag='sync_rate' AND
         j.cd_tag='sync_rate' AND
         k.cd_tag='sync_rate' AND
         l.cd_tag='sync_rate' AND
         b.cd_variable='upstream_speed' AND
         c.cd_variable='upstream_relative_capacity_occupation' AND
         d.cd_variable='upstream_noise_margin' AND
         e.cd_variable='upstream_signal_power' AND
         f.cd_variable='upstream_attenuation' AND
         g.cd_variable='downstream_speed' AND
         h.cd_variable='downstream_relative_capacity_occupation' AND
         i.cd_variable='downstream_noise_margin' AND
         j.cd_variable='downstream_signal_power' AND
         k.cd_variable='downstream_attenuation' AND
         l.cd_variable='sync_rate_updated' AND
         b.modified_date=c.modified_date AND
         b.modified_date=d.modified_date AND
         b.modified_date=e.modified_date AND
         b.modified_date=f.modified_date AND
         b.modified_date=g.modified_date AND
         b.modified_date=h.modified_date AND
         b.modified_date=i.modified_date AND
         b.modified_date=j.modified_date AND
         b.modified_date=k.modified_date AND
         b.modified_date=l.modified_date;

-=] EXPLAIN ANALYZE of a sample query
In case this is hard to read in the mail program, here is a link:
http://mizu-bu.org/misc/long_explain_analyze.txt


                QUERY PLAN


---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Nested Loop  (cost=1263.93..3417.98 rows=1 width=262) (actual
time=88.005..248996.948 rows=131 loops=1)
    Join Filter: ("inner".modified_date = "outer".modified_date)
    ->  Nested Loop  (cost=577.65..1482.46 rows=1 width=154) (actual
time=58.664..5253.873 rows=131 loops=1)
          Join Filter: ("outer".modified_date = "inner".modified_date)
          ->  Nested Loop  (cost=369.98..870.28 rows=1 width=128)
(actual time=51.858..4328.108 rows=131 loops=1)
                Join Filter: ("inner".modified_date = "outer".modified_date)
                ->  Nested Loop  (cost=343.35..823.93 rows=1 width=116)
(actual time=42.851..3185.995 rows=131 loops=1)
                      Join Filter: ("inner".modified_date =
"outer".modified_date)
                      ->  Nested Loop  (cost=126.20..185.37 rows=1
width=90) (actual time=36.181..2280.245 rows=131 loops=1)
                            Join Filter: ("inner".modified_date =
"outer".modified_date)
                            ->  Nested Loop  (cost=99.57..139.02 rows=1
width=64) (actual time=27.918..1168.061 rows=131 loops=1)
                                  Join Filter: ("outer".modified_date =
"inner".modified_date)
                                  ->  Hash Join  (cost=72.94..92.67
rows=1 width=38) (actual time=17.769..18.572 rows=131 loops=1)
                                        Hash Cond:
("outer".modified_date = "inner".modified_date)
                                        ->  Bitmap Heap Scan on
customer_data i  (cost=26.63..46.30 rows=4 width=26) (actual
time=8.226..8.563 rows=131 loops=1)
                                              Recheck Cond:
((cd_variable = 'downstream_noise_margin'::text) AND (103 = cd_cust_id))
                                              Filter: (cd_tag =
'sync_rate'::text)
                                              ->  BitmapAnd
(cost=26.63..26.63 rows=5 width=0) (actual time=8.172..8.172 rows=0 loops=1)
                                                    ->  Bitmap Index
Scan on cd_variable_index  (cost=0.00..10.21 rows=918 width=0) (actual
time=6.409..6.409 rows=20981 loops=1)
                                                          Index Cond:
(cd_variable = 'downstream_noise_margin'::text)
                                                    ->  Bitmap Index
Scan on cd_id_index  (cost=0.00..16.17 rows=2333 width=0) (actual
time=0.502..0.502 rows=2619 loops=1)
                                                          Index Cond:
(103 = cd_cust_id)
                                        ->  Hash  (cost=46.30..46.30
rows=4 width=12) (actual time=9.526..9.526 rows=131 loops=1)
                                              ->  Bitmap Heap Scan on
customer_data e  (cost=26.63..46.30 rows=4 width=12) (actual
time=9.140..9.381 rows=131 loops=1)
                                                    Recheck Cond:
((cd_variable = 'upstream_signal_power'::text) AND (103 = cd_cust_id))
                                                    Filter: (cd_tag =
'sync_rate'::text)
                                                    ->  BitmapAnd
(cost=26.63..26.63 rows=5 width=0) (actual time=9.082..9.082 rows=0 loops=1)
                                                          ->  Bitmap
Index Scan on cd_variable_index  (cost=0.00..10.21 rows=918 width=0)
(actual time=7.298..7.298 rows=20981 loops=1)
                                                                Index
Cond: (cd_variable = 'upstream_signal_power'::text)
                                                          ->  Bitmap
Index Scan on cd_id_index  (cost=0.00..16.17 rows=2333 width=0) (actual
time=0.502..0.502 rows=2619 loops=1)
                                                                Index
Cond: (103 = cd_cust_id)
                                  ->  Bitmap Heap Scan on customer_data
c  (cost=26.63..46.30 rows=4 width=26) (actual time=8.492..8.693
rows=131 loops=131)
                                        Recheck Cond: ((cd_variable =
'upstream_relative_capacity_occupation'::text) AND (103 = cd_cust_id))
                                        Filter: (cd_tag = 'sync_rate'::text)
                                        ->  BitmapAnd
(cost=26.63..26.63 rows=5 width=0) (actual time=8.446..8.446 rows=0
loops=131)
                                              ->  Bitmap Index Scan on
cd_variable_index  (cost=0.00..10.21 rows=918 width=0) (actual
time=6.693..6.693 rows=20986 loops=131)
                                                    Index Cond:
(cd_variable = 'upstream_relative_capacity_occupation'::text)
                                              ->  Bitmap Index Scan on
cd_id_index  (cost=0.00..16.17 rows=2333 width=0) (actual
time=0.494..0.494 rows=2619 loops=131)
                                                    Index Cond: (103 =
cd_cust_id)
                            ->  Bitmap Heap Scan on customer_data b
(cost=26.63..46.30 rows=4 width=26) (actual time=8.216..8.405 rows=131
loops=131)
                                  Recheck Cond: ((cd_variable =
'upstream_speed'::text) AND (103 = cd_cust_id))
                                  Filter: (cd_tag = 'sync_rate'::text)
                                  ->  BitmapAnd  (cost=26.63..26.63
rows=5 width=0) (actual time=8.172..8.172 rows=0 loops=131)
                                        ->  Bitmap Index Scan on
cd_variable_index  (cost=0.00..10.21 rows=918 width=0) (actual
time=6.417..6.417 rows=20986 loops=131)
                                              Index Cond: (cd_variable =
'upstream_speed'::text)
                                        ->  Bitmap Index Scan on
cd_id_index  (cost=0.00..16.17 rows=2333 width=0) (actual
time=0.495..0.495 rows=2619 loops=131)
                                              Index Cond: (103 = cd_cust_id)
                      ->  Bitmap Heap Scan on customer_data l
(cost=217.14..637.28 rows=102 width=26) (actual time=6.653..6.843
rows=131 loops=131)
                            Recheck Cond: ((103 = cd_cust_id) AND
(cd_variable = 'sync_rate_updated'::text))
                            Filter: (cd_tag = 'sync_rate'::text)
                            ->  BitmapAnd  (cost=217.14..217.14 rows=117
width=0) (actual time=6.618..6.618 rows=0 loops=131)
                                  ->  Bitmap Index Scan on cd_id_index
(cost=0.00..16.17 rows=2333 width=0) (actual time=0.485..0.485 rows=2619
loops=131)
                                        Index Cond: (103 = cd_cust_id)
                                  ->  Bitmap Index Scan on
cd_variable_index  (cost=0.00..200.72 rows=21350 width=0) (actual
time=6.079..6.079 rows=20986 loops=131)
                                        Index Cond: (cd_variable =
'sync_rate_updated'::text)
                ->  Bitmap Heap Scan on customer_data k
(cost=26.63..46.30 rows=4 width=12) (actual time=8.442..8.638 rows=131
loops=131)
                      Recheck Cond: ((cd_variable =
'downstream_attenuation'::text) AND (103 = cd_cust_id))
                      Filter: (cd_tag = 'sync_rate'::text)
                      ->  BitmapAnd  (cost=26.63..26.63 rows=5 width=0)
(actual time=8.397..8.397 rows=0 loops=131)
                            ->  Bitmap Index Scan on cd_variable_index
(cost=0.00..10.21 rows=918 width=0) (actual time=6.624..6.624 rows=20986
loops=131)
                                  Index Cond: (cd_variable =
'downstream_attenuation'::text)
                            ->  Bitmap Index Scan on cd_id_index
(cost=0.00..16.17 rows=2333 width=0) (actual time=0.487..0.487 rows=2619
loops=131)
                                  Index Cond: (103 = cd_cust_id)
          ->  Bitmap Heap Scan on customer_data d  (cost=207.68..610.95
rows=98 width=26) (actual time=6.805..6.994 rows=131 loops=131)
                Recheck Cond: ((103 = cd_cust_id) AND (cd_variable =
'upstream_noise_margin'::text))
                Filter: (cd_tag = 'sync_rate'::text)
                ->  BitmapAnd  (cost=207.68..207.68 rows=112 width=0)
(actual time=6.769..6.769 rows=0 loops=131)
                      ->  Bitmap Index Scan on cd_id_index
(cost=0.00..16.17 rows=2333 width=0) (actual time=0.487..0.487 rows=2619
loops=131)
                            Index Cond: (103 = cd_cust_id)
                      ->  Bitmap Index Scan on cd_variable_index
(cost=0.00..191.26 rows=20360 width=0) (actual time=6.230..6.230
rows=20986 loops=131)
                            Index Cond: (cd_variable =
'upstream_noise_margin'::text)
    ->  Nested Loop  (cost=686.28..1935.49 rows=1 width=224) (actual
time=21.077..1860.475 rows=131 loops=131)
          ->  Seq Scan on customer a  (cost=0.00..5.22 rows=1 width=164)
(actual time=0.053..0.090 rows=1 loops=131)
                Filter: (cust_id = 103)
          ->  Nested Loop  (cost=686.28..1930.26 rows=1 width=76)
(actual time=21.014..1860.177 rows=131 loops=131)
                Join Filter: ("inner".modified_date = "outer".modified_date)
                ->  Nested Loop  (cost=472.13..1298.07 rows=1 width=50)
(actual time=14.460..971.017 rows=131 loops=131)
                      Join Filter: ("inner".modified_date =
"outer".modified_date)
                      ->  Hash Join  (cost=259.97..674.63 rows=1
width=38) (actual time=7.459..8.272 rows=131 loops=131)
                            Hash Cond: ("outer".modified_date =
"inner".modified_date)
                            ->  Bitmap Heap Scan on customer_data h
(cost=213.66..627.06 rows=100 width=26) (actual time=7.391..7.707
rows=131 loops=131)
                                  Recheck Cond: ((103 = cd_cust_id) AND
(cd_variable = 'downstream_relative_capacity_occupation'::text))
                                  Filter: (cd_tag = 'sync_rate'::text)
                                  ->  BitmapAnd  (cost=213.66..213.66
rows=115 width=0) (actual time=7.355..7.355 rows=0 loops=131)
                                        ->  Bitmap Index Scan on
cd_id_index  (cost=0.00..16.17 rows=2333 width=0) (actual
time=0.493..0.493 rows=2619 loops=131)
                                              Index Cond: (103 = cd_cust_id)
                                        ->  Bitmap Index Scan on
cd_variable_index  (cost=0.00..197.24 rows=20926 width=0) (actual
time=6.809..6.809 rows=20986 loops=131)
                                              Index Cond: (cd_variable =
'downstream_relative_capacity_occupation'::text)
                            ->  Hash  (cost=46.30..46.30 rows=4
width=12) (actual time=8.253..8.253 rows=131 loops=1)
                                  ->  Bitmap Heap Scan on customer_data
f  (cost=26.63..46.30 rows=4 width=12) (actual time=7.882..8.113
rows=131 loops=1)
                                        Recheck Cond: ((cd_variable =
'upstream_attenuation'::text) AND (103 = cd_cust_id))
                                        Filter: (cd_tag = 'sync_rate'::text)
                                        ->  BitmapAnd
(cost=26.63..26.63 rows=5 width=0) (actual time=7.832..7.832 rows=0 loops=1)
                                              ->  Bitmap Index Scan on
cd_variable_index  (cost=0.00..10.21 rows=918 width=0) (actual
time=6.065..6.065 rows=20981 loops=1)
                                                    Index Cond:
(cd_variable = 'upstream_attenuation'::text)
                                              ->  Bitmap Index Scan on
cd_id_index  (cost=0.00..16.17 rows=2333 width=0) (actual
time=0.489..0.489 rows=2619 loops=1)
                                                    Index Cond: (103 =
cd_cust_id)
                      ->  Bitmap Heap Scan on customer_data j
(cost=212.16..622.19 rows=100 width=12) (actual time=7.092..7.280
rows=131 loops=17161)
                            Recheck Cond: ((103 = cd_cust_id) AND
(cd_variable = 'downstream_signal_power'::text))
                            Filter: (cd_tag = 'sync_rate'::text)
                            ->  BitmapAnd  (cost=212.16..212.16 rows=114
width=0) (actual time=7.057..7.057 rows=0 loops=17161)
                                  ->  Bitmap Index Scan on cd_id_index
(cost=0.00..16.17 rows=2333 width=0) (actual time=0.493..0.493 rows=2619
loops=17161)
                                        Index Cond: (103 = cd_cust_id)
                                  ->  Bitmap Index Scan on
cd_variable_index  (cost=0.00..195.74 rows=20784 width=0) (actual
time=6.512..6.512 rows=20986 loops=17161)
                                        Index Cond: (cd_variable =
'downstream_signal_power'::text)
                ->  Bitmap Heap Scan on customer_data g
(cost=214.15..630.92 rows=101 width=26) (actual time=6.526..6.718
rows=131 loops=17161)
                      Recheck Cond: ((103 = cd_cust_id) AND (cd_variable
= 'downstream_speed'::text))
                      Filter: (cd_tag = 'sync_rate'::text)
                      ->  BitmapAnd  (cost=214.15..214.15 rows=116
width=0) (actual time=6.492..6.492 rows=0 loops=17161)
                            ->  Bitmap Index Scan on cd_id_index
(cost=0.00..16.17 rows=2333 width=0) (actual time=0.486..0.486 rows=2619
loops=17161)
                                  Index Cond: (103 = cd_cust_id)
                            ->  Bitmap Index Scan on cd_variable_index
(cost=0.00..197.73 rows=21067 width=0) (actual time=5.956..5.956
rows=20986 loops=17161)
                                  Index Cond: (cd_variable =
'downstream_speed'::text)
  Total runtime: 248997.571 ms
(114 rows)


pgsql-performance by date:

Previous
From: Ron Mayer
Date:
Subject: Re: Filesystem benchmarking for pg 8.3.3 server
Next
From: Bruce Momjian
Date:
Subject: Re: file system and raid performance