Thread: Can I force a query plan to materialise part?
Hi, Is there any way to force the query planner to do a materialisation stage? I have a query that joins two views, and takes 28 seconds to run. However if I create temporary tables that contain the contents of each view, and then join them, the total time is 1.3 seconds. Is there a way to hint to the query planner that it should be taking that approach? Thanks, Toby
On Oct 1, 2012, at 22:33, Toby Corkindale <toby.corkindale@strategicdata.com.au> wrote: > Hi, > Is there any way to force the query planner to do a materialisation stage? > > I have a query that joins two views, and takes 28 seconds to run. > However if I create temporary tables that contain the contents of each view, and then join them, the total time is 1.3seconds. > > Is there a way to hint to the query planner that it should be taking that approach? > > Thanks, > Toby > > Have you tried? With v1 as (), v2 as () select v1 join v2 David J.
Hi, On 2 October 2012 12:33, Toby Corkindale <toby.corkindale@strategicdata.com.au> wrote: > I have a query that joins two views, and takes 28 seconds to run. > However if I create temporary tables that contain the contents of each view, > and then join them, the total time is 1.3 seconds. try "offset 0" (or you can tweak statistics collector to get better estimates): select ... from (select * from view offset 0) as v .... http://blog.endpoint.com/2009/04/offset-0-ftw.html -- Ondrej Ivanic (ondrej.ivanic@gmail.com)
On 02/10/12 12:44, David Johnston wrote: > On Oct 1, 2012, at 22:33, Toby Corkindale <toby.corkindale@strategicdata.com.au> wrote: > >> Hi, >> Is there any way to force the query planner to do a materialisation stage? >> >> I have a query that joins two views, and takes 28 seconds to run. >> However if I create temporary tables that contain the contents of each view, and then join them, the total time is 1.3seconds. >> >> Is there a way to hint to the query planner that it should be taking that approach? >> >> Thanks, >> Toby > > Have you tried? > > With v1 as (), v2 as () select v1 join v2 Hi David, I just tried that now, and it ran in 350ms; much faster than even the temporary-table-creating method. thanks! Toby
On 02/10/12 13:02, Ondrej Ivanič wrote: > Hi, > > On 2 October 2012 12:33, Toby Corkindale > <toby.corkindale@strategicdata.com.au> wrote: >> I have a query that joins two views, and takes 28 seconds to run. >> However if I create temporary tables that contain the contents of each view, >> and then join them, the total time is 1.3 seconds. > > try "offset 0" (or you can tweak statistics collector to get better estimates): > select ... from (select * from view offset 0) as v .... > > http://blog.endpoint.com/2009/04/offset-0-ftw.html Thanks, that ran in 820ms. (Same query with subselects but without the OFFSET 0 ran in 28370ms) David's method is also running in 820ms. I erroneously stated it was 349ms before, but that was actually how long it took to display the results of EXPLAIN. I wish I could work out what's wrong with the statistics that cause the query plan to go awry.. the tables aren't actually very large and I've played with the statistics setup previously and it seemed right.. many thanks, Toby
Hi, On 2 October 2012 13:28, Toby Corkindale <toby.corkindale@strategicdata.com.au> wrote: >>> I have a query that joins two views, and takes 28 seconds to run. >>> However if I create temporary tables that contain the contents of each >>> view, >>> and then join them, the total time is 1.3 seconds. >> >> >> try "offset 0" (or you can tweak statistics collector to get better >> estimates): >> select ... from (select * from view offset 0) as v .... > > I wish I could work out what's wrong with the statistics that cause the > query plan to go awry.. the tables aren't actually very large and I've > played with the statistics setup previously and it seemed right.. Try this (in single session): explain analyze <your query> set default_statistics_target = 1000 (or 500 or 250; 1000 might take ages) analyze table_1; analyze table_2; ..., analyze table_N; (all involved tables in your query) explain analyze <your query> and compare explains outputs. If estimates are very different (magnitude or two) then you should tweak autovacuum frequency and set per column statistics (ie. keep default_statistics_target = 100 (default), and change it on per column basis) but this could be tedious: Although per-column tweaking of ANALYZE frequency might not be very productive, you might find it worthwhile to do per-column adjustment of the level of detail of the statistics collected by ANALYZE. Columns that are heavily used in WHERE clauses and have highly irregular data distributions might require a finer-grain data histogram than other columns. See ALTER TABLE SET STATISTICS, or change the database-wide default using the default_statistics_target configuration parameter. Also, by default there is limited information available about the selectivity of functions. However, if you create an expression index that uses a function call, useful statistics will be gathered about the function, which can greatly improve query plans that use the expression index. http://www.postgresql.org/docs/9.1/static/routine-vacuuming.html#VACUUM-FOR-STATISTICS -- Ondrej Ivanic (ondrej.ivanic@gmail.com)