Thread: Slow execution time when querying view with WHERE clause
I have the following view: create or replace view market.p_areas as select a.* from _areas a where a.area in ( select b.area from _bins b, _inventories i, _offers o, _pricemembers p where b.bin = i.bin and i.inventory = o.inventory and o.pricegroup = p.pricegroup and p.buyer in ( select s.store from _stores s, _webusers w where w.webuser = getWebuser() and w.company = s.company union select s.store from _stores s, _companies c where s.company = c.company and c.companyid = 'DEFAULT' ) ); When I query the view without a where clause I get: explain analyze select * from p_areas; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Hash Join (cost=1273.12..1276.31 rows=47 width=163) (actual time=438.739..439.574 rows=34 loops=1) Hash Cond: ("outer".area = "inner".area) -> Seq Scan on _areas a (cost=0.00..2.48 rows=48 width=163) (actual time=0.015..0.169 rows=48 loops=1) -> Hash (cost=1273.01..1273.01 rows=47 width=8) (actual time=438.532..438.532 rows=0 loops=1) -> HashAggregate (cost=1273.01..1273.01 rows=47 width=8) (actual time=438.286..438.395 rows=34 loops=1) -> Hash Join (cost=558.53..1266.68 rows=2532 width=8) (actual time=160.923..416.968 rows=5264 loops=1) Hash Cond: ("outer".bin = "inner".bin) -> Hash Join (cost=544.02..1207.86 rows=2531 width=8) (actual time=156.097..356.560 rows=5264 loops=1) Hash Cond: ("outer".inventory = "inner".inventory) -> Seq Scan on _inventories i (cost=0.00..265.96 rows=11396 width=16) (actual time=0.010..44.047rows=11433 loops=1) -> Hash (cost=537.14..537.14 rows=2751 width=8) (actual time=155.891..155.891 rows=0 loops=1) -> Hash Join (cost=13.96..537.14 rows=2751 width=8) (actual time=11.967..136.598 rows=5264loops=1) Hash Cond: ("outer".pricegroup = "inner".pricegroup) -> Seq Scan on _offers o (cost=0.00..379.24 rows=15524 width=16) (actual time=0.008..50.335rows=15599 loops=1) -> Hash (cost=13.94..13.94 rows=9 width=8) (actual time=11.861..11.861 rows=0 loops=1) -> Hash IN Join (cost=8.74..13.94 rows=9 width=8) (actual time=10.801..11.801rows=12 loops=1) Hash Cond: ("outer".buyer = "inner".store) -> Seq Scan on _pricemembers p (cost=0.00..4.07 rows=207 width=16)(actual time=0.011..0.548 rows=207 loops=1) -> Hash (cost=8.72..8.72 rows=8 width=8) (actual time=10.687..10.687rows=0 loops=1) -> Subquery Scan "IN_subquery" (cost=8.60..8.72 rows=8 width=8)(actual time=10.645..10.654 rows=1 loops=1) -> Unique (cost=8.60..8.64 rows=8 width=8) (actual time=10.631..10.636rows=1 loops=1) -> Sort (cost=8.60..8.62 rows=8 width=8) (actualtime=10.625..10.627 rows=1 loops=1) Sort Key: store -> Append (cost=2.86..8.48 rows=8 width=8)(actual time=10.529..10.583 rows=1 loops=1) -> Subquery Scan "*SELECT* 1" (cost=2.86..5.15rows=5 width=8) (actual time=10.222..10.222 rows=0 loops=1) -> Hash Join (cost=2.86..5.10 rows=5width=8) (actual time=10.214..10.214 rows=0 loops=1) Hash Cond: ("outer".company= "inner".company) -> Seq Scan on _stores s (cost=0.00..2.13 rows=13 width=16) (actual time=0.019..0.059 rows=13 loops=1) -> Hash (cost=2.85..2.85rows=1 width=8) (actual time=10.031..10.031 rows=0 loops=1) -> Seq Scan on _webusersw (cost=0.00..2.85 rows=1 width=8) (actual time=10.023..10.023 rows=0 loops=1) Filter: (webuser= getwebuser()) -> Subquery Scan "*SELECT* 2" (cost=1.08..3.33rows=3 width=8) (actual time=0.298..0.349 rows=1 loops=1) -> Hash Join (cost=1.08..3.30 rows=3width=8) (actual time=0.287..0.335 rows=1 loops=1) Hash Cond: ("outer".company= "inner".company) -> Seq Scan on _stores s (cost=0.00..2.13 rows=13 width=16) (actual time=0.008..0.048 rows=13 loops=1) -> Hash (cost=1.07..1.07rows=1 width=8) (actual time=0.111..0.111 rows=0 loops=1) -> Seq Scan on _companiesc (cost=0.00..1.07 rows=1 width=8) (actual time=0.059..0.080 rows=1 loops=1) Filter: ((companyid)::text= 'DEFAULT'::text) -> Hash (cost=13.01..13.01 rows=601 width=16) (actual time=4.735..4.735 rows=0 loops=1) -> Seq Scan on _bins b (cost=0.00..13.01 rows=601 width=16) (actual time=0.054..2.846 rows=601loops=1) Total runtime: 441.685 ms (41 rows) --- When I query the view with a simple filter, I get: explain analyze select * from p_areas where deactive is null; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ Nested Loop IN Join (cost=8.60..524.28 rows=1 width=163) (actual time=1023.291..20025.620 rows=34 loops=1) Join Filter: ("outer".area = "inner".area) -> Seq Scan on _areas a (cost=0.00..2.48 rows=1 width=163) (actual time=0.037..0.804 rows=48 loops=1) Filter: (deactive IS NULL) -> Nested Loop (cost=8.60..25530.60 rows=2532 width=8) (actual time=0.345..402.775 rows=3408 loops=48) -> Nested Loop (cost=8.60..16893.61 rows=2531 width=8) (actual time=0.304..264.929 rows=3408 loops=48) -> Merge Join (cost=8.60..2912.00 rows=2751 width=8) (actual time=0.258..120.841 rows=3408 loops=48) Merge Cond: ("outer".pricegroup = "inner".pricegroup) -> Nested Loop IN Join (cost=8.60..1837.73 rows=9 width=8) (actual time=0.216..4.612 rows=8 loops=48) Join Filter: ("outer".buyer = "inner".store) -> Index Scan using i_pricemembers3 on _pricemembers p (cost=0.00..10.96 rows=207 width=16)(actual time=0.022..1.045 rows=138 loops=48) -> Subquery Scan "IN_subquery" (cost=8.60..8.72 rows=8 width=8) (actual time=0.011..0.017 rows=1loops=6606) -> Unique (cost=8.60..8.64 rows=8 width=8) (actual time=0.006..0.010 rows=1 loops=6606) -> Sort (cost=8.60..8.62 rows=8 width=8) (actual time=0.003..0.004 rows=1 loops=6606) Sort Key: store -> Append (cost=2.86..8.48 rows=8 width=8) (actual time=7.667..7.757 rows=1loops=1) -> Subquery Scan "*SELECT* 1" (cost=2.86..5.15 rows=5 width=8) (actualtime=7.362..7.362 rows=0 loops=1) -> Hash Join (cost=2.86..5.10 rows=5 width=8) (actual time=7.355..7.355rows=0 loops=1) Hash Cond: ("outer".company = "inner".company) -> Seq Scan on _stores s (cost=0.00..2.13 rows=13 width=16)(actual time=0.013..0.054 rows=13 loops=1) -> Hash (cost=2.85..2.85 rows=1 width=8) (actual time=7.163..7.163rows=0 loops=1) -> Seq Scan on _webusers w (cost=0.00..2.85 rows=1width=8) (actual time=7.154..7.154 rows=0 loops=1) Filter: (webuser = getwebuser()) -> Subquery Scan "*SELECT* 2" (cost=1.08..3.33 rows=3 width=8) (actualtime=0.295..0.381 rows=1 loops=1) -> Hash Join (cost=1.08..3.30 rows=3 width=8) (actual time=0.286..0.368rows=1 loops=1) Hash Cond: ("outer".company = "inner".company) -> Seq Scan on _stores s (cost=0.00..2.13 rows=13 width=16)(actual time=0.008..0.080 rows=13 loops=1) -> Hash (cost=1.07..1.07 rows=1 width=8) (actual time=0.116..0.116rows=0 loops=1) -> Seq Scan on _companies c (cost=0.00..1.07 rows=1width=8) (actual time=0.062..0.083 rows=1 loops=1) Filter: ((companyid)::text = 'DEFAULT'::text) -> Index Scan using i_offers4 on _offers o (cost=0.00..1007.93 rows=15524 width=16) (actual time=0.023..67.183rows=10049 loops=48) -> Index Scan using i_inventories1 on _inventories i (cost=0.00..5.07 rows=1 width=16) (actual time=0.025..0.029rows=1 loops=163561) Index Cond: (i.inventory = "outer".inventory) -> Index Scan using i_bins1 on _bins b (cost=0.00..3.40 rows=1 width=16) (actual time=0.021..0.026 rows=1 loops=163561) Index Cond: (b.bin = "outer".bin) Total runtime: 20027.414 ms (36 rows) --- That's a slow-down on execution time by a factor of 50, even though the row count was the same: 34. In fact, it's MUCH faster to do: create temporary table foo as select * from p_areas; select * from foo where deactive is null; The database has been analyzed. Any tips would be greatly appreciated. Mike Mascari P.S.: I turned off word-wrap in my mail client for this post. Is that the right thing to do for analyze output?
Mike Mascari wrote: > I have the following view: > > create or replace view market.p_areas as > select a.* > from _areas a > where a.area in ( > select b.area > from _bins b, _inventories i, _offers o, _pricemembers p > where b.bin = i.bin and > i.inventory = o.inventory and > o.pricegroup = p.pricegroup and > p.buyer in ( > select s.store > from _stores s, _webusers w > where w.webuser = getWebuser() and > w.company = s.company > union > select s.store > from _stores s, _companies c > where s.company = c.company and > c.companyid = 'DEFAULT' > ) > ); ... I failed to report the version: select version(); PostgreSQL 7.4.5 on i686-pc-linux-gnu, compiled by GCC i686-pc-linux-gnu-gcc (GCC) 3.4.0 20040204 (prerelease) Sorry. Mike Mascari
Mike Mascari <mascarm@mascari.com> writes: > When I query the view with a simple filter, I get: > explain analyze select * from p_areas where deactive is null; The problem seems to be here: > -> Seq Scan on _areas a (cost=0.00..2.48 rows=1 width=163) (actual time=0.037..0.804 rows=48 loops=1) > Filter: (deactive IS NULL) Why is it so completely off about the selectivity of the IS NULL clause? Are you sure you ANALYZEd this table recently? regards, tom lane
Tom Lane wrote: > Mike Mascari <mascarm@mascari.com> writes: > >>When I query the view with a simple filter, I get: > > >>explain analyze select * from p_areas where deactive is null; > > > The problem seems to be here: > > >> -> Seq Scan on _areas a (cost=0.00..2.48 rows=1 width=163) (actual time=0.037..0.804 rows=48 loops=1) >> Filter: (deactive IS NULL) > > > Why is it so completely off about the selectivity of the IS NULL clause? > Are you sure you ANALYZEd this table recently? Yes. I just did: [estore@lexus] vacuum full analyze; VACUUM [estore@lexus] explain analyze select * from p_areas where deactive is null; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ Nested Loop IN Join (cost=8.62..512.47 rows=1 width=162) (actual time=1143.969..21811.417 rows=37 loops=1) Join Filter: ("outer".area = "inner".area) -> Seq Scan on _areas a (cost=0.00..2.49 rows=1 width=162) (actual time=0.037..1.673 rows=49 loops=1) Filter: (deactive IS NULL) -> Nested Loop (cost=8.62..25740.20 rows=2681 width=8) (actual time=1.172..429.501 rows=3566 loops=49) -> Nested Loop (cost=8.62..16674.93 rows=2680 width=8) (actual time=1.125..281.570 rows=3566 loops=49) -> Merge Join (cost=8.62..3012.72 rows=2778 width=8) (actual time=0.876..128.908 rows=3566 loops=49) Merge Cond: ("outer".pricegroup = "inner".pricegroup) -> Nested Loop IN Join (cost=8.62..1929.41 rows=9 width=8) (actual time=0.613..5.504 rows=9 loops=49) Join Filter: ("outer".buyer = "inner".store) -> Index Scan using i_pricemembers3 on _pricemembers p (cost=0.00..11.13 rows=217 width=16)(actual time=0.403..1.476 rows=142 loops=49) -> Subquery Scan "IN_subquery" (cost=8.62..8.74 rows=8 width=8) (actual time=0.013..0.019 rows=1loops=6950) -> Unique (cost=8.62..8.66 rows=8 width=8) (actual time=0.007..0.010 rows=1 loops=6950) -> Sort (cost=8.62..8.64 rows=8 width=8) (actual time=0.003..0.004 rows=1 loops=6950) Sort Key: store -> Append (cost=2.87..8.50 rows=8 width=8) (actual time=8.394..8.446 rows=1loops=1) -> Subquery Scan "*SELECT* 1" (cost=2.87..5.17 rows=5 width=8) (actualtime=8.112..8.112 rows=0 loops=1) -> Hash Join (cost=2.87..5.12 rows=5 width=8) (actual time=8.106..8.106rows=0 loops=1) Hash Cond: ("outer".company = "inner".company) -> Seq Scan on _stores s (cost=0.00..2.13 rows=13 width=16)(actual time=0.014..0.052 rows=13 loops=1) -> Hash (cost=2.87..2.87 rows=1 width=8) (actual time=7.878..7.878rows=0 loops=1) -> Seq Scan on _webusers w (cost=0.00..2.87 rows=1width=8) (actual time=7.868..7.868 rows=0 loops=1) Filter: (webuser = getwebuser()) -> Subquery Scan "*SELECT* 2" (cost=1.08..3.33 rows=3 width=8) (actualtime=0.273..0.322 rows=1 loops=1) -> Hash Join (cost=1.08..3.30 rows=3 width=8) (actual time=0.263..0.308rows=1 loops=1) Hash Cond: ("outer".company = "inner".company) -> Seq Scan on _stores s (cost=0.00..2.13 rows=13 width=16)(actual time=0.008..0.042 rows=13 loops=1) -> Hash (cost=1.07..1.07 rows=1 width=8) (actual time=0.093..0.093rows=0 loops=1) -> Seq Scan on _companies c (cost=0.00..1.07 rows=1width=8) (actual time=0.061..0.081 rows=1 loops=1) Filter: ((companyid)::text = 'DEFAULT'::text) -> Index Scan using i_offers4 on _offers o (cost=0.00..1014.76 rows=16298 width=16) (actual time=0.244..72.742rows=10433 loops=49) -> Index Scan using i_inventories1 on _inventories i (cost=0.00..4.91 rows=1 width=16) (actual time=0.025..0.029rows=1 loops=174715) Index Cond: (i.inventory = "outer".inventory) -> Index Scan using i_bins1 on _bins b (cost=0.00..3.37 rows=1 width=16) (actual time=0.022..0.027 rows=1 loops=174715) Index Cond: (b.bin = "outer".bin) Total runtime: 21813.209 ms _areas looks like: [estore@lexus] \d _areas Table "temporal._areas" Column | Type | Modifiers --------------------+--------------------------+------------------------ area | bigint | not null store | bigint | not null name | character varying(32) | not null description | character varying(64) | not null email | character varying(48) | not null phoneno | character varying(16) | not null requisition_device | bigint | not null inventory_device | bigint | not null receive_device | bigint | not null invoice_device | bigint | not null activation_device | bigint | not null active | timestamp with time zone | not null default now() deactive | timestamp with time zone | Indexes: "i_areas1" unique, btree (area) "i_areas2" unique, btree (store, name) WHERE (deactive IS NULL) "i_areas3" btree (store, name) Triggers: t_areas1 BEFORE INSERT OR DELETE OR UPDATE ON _areas FOR EACH ROW EXECUTE PROCEDURE ri_areas() Note that if I disable nestedloop plans, I get: [estore@lexus] explain analyze select * from p_areas where deactive is null; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ Hash Join (cost=1456.90..1457.15 rows=1 width=162) (actual time=423.273..424.156 rows=37 loops=1) Hash Cond: ("outer".area = "inner".area) -> HashAggregate (cost=1454.40..1454.40 rows=48 width=8) (actual time=422.192..422.334 rows=37 loops=1) -> Hash Join (cost=582.79..1447.70 rows=2681 width=8) (actual time=188.687..406.584 rows=5694 loops=1) Hash Cond: ("outer".bin = "inner".bin) -> Hash Join (cost=568.07..1386.07 rows=2680 width=8) (actual time=182.756..358.441 rows=5694 loops=1) Hash Cond: ("outer".inventory = "inner".inventory) -> Seq Scan on _inventories i (cost=0.00..280.04 rows=12004 width=16) (actual time=0.013..38.221rows=12004 loops=1) -> Hash (cost=561.12..561.12 rows=2778 width=8) (actual time=182.543..182.543 rows=0 loops=1) -> Hash Join (cost=14.13..561.12 rows=2778 width=8) (actual time=9.854..160.963 rows=5694 loops=1) Hash Cond: ("outer".pricegroup = "inner".pricegroup) -> Seq Scan on _offers o (cost=0.00..396.98 rows=16298 width=16) (actual time=0.011..58.422rows=16298 loops=1) -> Hash (cost=14.10..14.10 rows=9 width=8) (actual time=9.728..9.728 rows=0 loops=1) -> Hash IN Join (cost=8.76..14.10 rows=9 width=8) (actual time=8.616..9.657 rows=13loops=1) Hash Cond: ("outer".buyer = "inner".store) -> Seq Scan on _pricemembers p (cost=0.00..4.17 rows=217 width=16) (actualtime=0.011..0.565 rows=217 loops=1) -> Hash (cost=8.74..8.74 rows=8 width=8) (actual time=8.465..8.465 rows=0loops=1) -> Subquery Scan "IN_subquery" (cost=8.62..8.74 rows=8 width=8) (actualtime=8.446..8.455 rows=1 loops=1) -> Unique (cost=8.62..8.66 rows=8 width=8) (actual time=8.430..8.435rows=1 loops=1) -> Sort (cost=8.62..8.64 rows=8 width=8) (actual time=8.424..8.426rows=1 loops=1) Sort Key: store -> Append (cost=2.87..8.50 rows=8 width=8) (actualtime=8.004..8.058 rows=1 loops=1) -> Subquery Scan "*SELECT* 1" (cost=2.87..5.17rows=5 width=8) (actual time=7.710..7.710 rows=0 loops=1) -> Hash Join (cost=2.87..5.12 rows=5width=8) (actual time=7.701..7.701 rows=0 loops=1) Hash Cond: ("outer".company = "inner".company) -> Seq Scan on _stores s (cost=0.00..2.13rows=13 width=16) (actual time=0.013..0.052 rows=13 loops=1) -> Hash (cost=2.87..2.87 rows=1width=8) (actual time=7.486..7.486 rows=0 loops=1) -> Seq Scan on _webusers w (cost=0.00..2.87 rows=1 width=8) (actual time=7.478..7.478 rows=0 loops=1) Filter: (webuser = getwebuser()) -> Subquery Scan "*SELECT* 2" (cost=1.08..3.33rows=3 width=8) (actual time=0.284..0.336 rows=1 loops=1) -> Hash Join (cost=1.08..3.30 rows=3width=8) (actual time=0.274..0.321 rows=1 loops=1) Hash Cond: ("outer".company = "inner".company) -> Seq Scan on _stores s (cost=0.00..2.13rows=13 width=16) (actual time=0.008..0.046 rows=13 loops=1) -> Hash (cost=1.07..1.07 rows=1width=8) (actual time=0.096..0.096 rows=0 loops=1) -> Seq Scan on _companiesc (cost=0.00..1.07 rows=1 width=8) (actual time=0.064..0.083 rows=1 loops=1) Filter: ((companyid)::text= 'DEFAULT'::text) -> Hash (cost=13.18..13.18 rows=618 width=16) (actual time=5.849..5.849 rows=0 loops=1) -> Seq Scan on _bins b (cost=0.00..13.18 rows=618 width=16) (actual time=0.027..3.554 rows=618 loops=1) -> Hash (cost=2.49..2.49 rows=1 width=162) (actual time=0.960..0.960 rows=0 loops=1) -> Seq Scan on _areas a (cost=0.00..2.49 rows=1 width=162) (actual time=0.033..0.197 rows=49 loops=1) Filter: (deactive IS NULL) Total runtime: 427.390 ms Thanks! Mike Mascari
--- Mike Mascari <mascarm@mascari.com> escribió: > Tom Lane wrote: > > Mike Mascari <mascarm@mascari.com> writes: > > > >>When I query the view with a simple filter, I get: > > > > > >>explain analyze select * from p_areas where > deactive is null; > > > > > > The problem seems to be here: > > > > > >> -> Seq Scan on _areas a (cost=0.00..2.48 > rows=1 width=163) (actual time=0.037..0.804 rows=48 > loops=1) > >> Filter: (deactive IS NULL) > > > > > > Why is it so completely off about the selectivity > of the IS NULL clause? null values are not indexable, is that your question? If it is your question then create a partial index with where deactive is null. regards, Jaime Casanova _________________________________________________________ Do You Yahoo!? Información de Estados Unidos y América Latina, en Yahoo! Noticias. Visítanos en http://noticias.espanol.yahoo.com
Jaime Casanova <systemguards@yahoo.com> writes: > Tom Lane wrote: >> Why is it so completely off about the selectivity >> of the IS NULL clause? > null values are not indexable, is that your question? Uh, no. The problem is that the IS NULL condition matched all 48 rows of the table, but the planner thought it would only match one row. This is definitely covered by the pg_stats statistics, and with only 48 live rows there couldn't possibly have been any sampling error, so what the heck went wrong there? regards, tom lane
Mike Mascari <mascarm@mascari.com> writes: > Tom Lane wrote: >> Why is it so completely off about the selectivity of the IS NULL clause? > I think this is a bug in ANALYZE not constructing statistics for columns > whose data is entirely NULL: Um ... doh ... analyze.c about line 1550: /* We can only compute valid stats if we found some non-null values. */ if (nonnull_cnt > 0) ... There's a bit of an epistemological issue here: if we didn't actually find any nonnull values in our sample, is it legitimate to assume that the column is entirely null? On the other hand, if we find only "3" in our sample we will happily assume the column contains only "3", so I dunno why we are discriminating against null. This seems like a case that just hasn't come up before. regards, tom lane
Tom Lane wrote: > Um ... doh ... analyze.c about line 1550: > > /* We can only compute valid stats if we found some non-null values. */ > if (nonnull_cnt > 0) > ... > > There's a bit of an epistemological issue here: if we didn't actually > find any nonnull values in our sample, is it legitimate to assume that > the column is entirely null? On the other hand, if we find only "3" in > our sample we will happily assume the column contains only "3", so I > dunno why we are discriminating against null. This seems like a case > that just hasn't come up before. Will this discriminatory policy toward null end for 8.0? Mike Mascari