Thread: Views + UNION ALL = Slow?
Hello! I have two tables, each containing one month's data. One of them contains about 500 thousand records and the other about 40k, with 8 columns. When I run a simple query on either of them individually, it takes milliseconds to complete (see gory details below). For some queries I want to include data from multiple months, so I created a view using UNION ALL. But using the view, the "same" query takes 31 seconds to complete.

I am obviously doing something wrong or using something the wrong way.

Any ideas?

/Otto Blomqvist

test=# explain analyze select fid_2 from file_92_904 where fid_4=1024;
NOTICE:  QUERY PLAN:

Index Scan using file_92_904_ltn_idx on file_92_904
  (cost=0.00..219.90 rows=65 width=4) (actual time=0.49..0.49 rows=0 loops=1)
Total runtime: 0.57 msec

EXPLAIN

test=# explain analyze select fid_2 from file_92_1004 where fid_4=1024;
NOTICE:  QUERY PLAN:

Index Scan using file_92_1004_ltn_idx on file_92_1004
  (cost=0.00..4505.20 rows=1197 width=4) (actual time=32.36..32.36 rows=0 loops=1)
Total runtime: 32.46 msec

EXPLAIN

test=# create view twotables as select * from file_92_1004 UNION ALL select * from file_92_904;
CREATE

test=# explain analyze select fid_2 from twotables where fid_4=1024;
NOTICE:  QUERY PLAN:

Subquery Scan twotables  (cost=100000000.00..200023000.53 rows=569553 width=203) (actual time=31590.97..31590.97 rows=0 loops=1)
  ->  Append  (cost=100000000.00..200023000.53 rows=569553 width=203) (actual time=12.13..30683.67 rows=569553 loops=1)
        ->  Subquery Scan *SELECT* 1  (cost=100000000.00..100021799.06 rows=540306 width=199) (actual time=12.12..28417.81 rows=540306 loops=1)
              ->  Seq Scan on file_92_1004  (cost=100000000.00..100021799.06 rows=540306 width=199) (actual time=12.09..14946.47 rows=540306 loops=1)
        ->  Subquery Scan *SELECT* 2  (cost=100000000.00..100001201.47 rows=29247 width=203) (actual time=0.19..1525.18 rows=29247 loops=1)
              ->  Seq Scan on file_92_904  (cost=100000000.00..100001201.47 rows=29247 width=203) (actual time=0.14..793.34 rows=29247 loops=1)
Total runtime: 31591.34 msec

EXPLAIN
The difficulty is that your view-based statement does not make use of any index, so the query must look at every tuple. It seems that UNION ALL requires a full scan of the participating relations. I don't know if it is possible, but try to create an index on the view ;-)

Hagen

Otto Blomqvist wrote:
> I am obviously doing something wrong or using something the wrong way.
>
> Any ideas?
o.blomqvist@secomintl.com (Otto Blomqvist) writes:
> I am obviously doing something wrong or using something the wrong way.

What PG version are you using?  7.3 and later can push the WHERE condition down into the view, but older versions won't.

			regards, tom lane
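[Editor's note: on a server older than 7.3, where the WHERE clause is not pushed down automatically, one workaround sketch is to write the predicate into each arm of the UNION ALL by hand. This assumes, as the plans above suggest, that both tables have an index usable for `fid_4`; it simply reproduces the two fast per-table queries from the original post:]

```sql
-- Push the predicate into each branch manually, so each branch
-- can use its own per-table index instead of a sequential scan.
SELECT fid_2 FROM file_92_1004 WHERE fid_4 = 1024
UNION ALL
SELECT fid_2 FROM file_92_904  WHERE fid_4 = 1024;
```

Based on the individual EXPLAIN ANALYZE timings shown earlier (0.57 msec and 32.46 msec), this form should complete in tens of milliseconds rather than 31 seconds.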
You can't create an index on a view (as far as I could figure out). However, it can be argued that you don't need one, because views in PG appear to be implemented as an alias for the view's defining query. So when you join a table to a view, you are actually joining it to the underlying tables of that view, and the join can therefore take advantage of any existing indexes on those fields.

Sim Zacks
IT Manager
CompuLab
04-829-0145 - Office
04-832-5251 - Fax
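[Editor's note: Sim's point about views being expanded into their defining query can be illustrated with a small sketch. The table, view, and index names below are hypothetical, not from the thread; the claim is simply that for a plain single-table view, the planner rewrites the query against the base table, so the base table's index remains usable:]

```sql
-- Hypothetical example: a plain (non-UNION) view is expanded inline.
CREATE TABLE orders (id int, customer_id int, total numeric);
CREATE INDEX orders_customer_idx ON orders (customer_id);

CREATE VIEW all_orders AS
    SELECT id, customer_id, total FROM orders;

-- The planner rewrites this query against "orders" directly,
-- so orders_customer_idx can satisfy the WHERE clause.
EXPLAIN SELECT * FROM all_orders WHERE customer_id = 42;
```

The UNION ALL case in this thread is harder precisely because, before 7.3, the planner would not push the outer WHERE clause down into each arm of the expanded view.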