Thread: Can I force a query plan to materialise part?

Can I force a query plan to materialise part?

From

Toby Corkindale

Date:

02 October 2012, 02:33:45

Hi,
Is there any way to force the query planner to do a materialisation stage?

I have a query that joins two views, and takes 28 seconds to run.
However if I create temporary tables that contain the contents of each
view, and then join them, the total time is 1.3 seconds.

Is there a way to hint to the query planner that it should be taking
that approach?

Thanks,
Toby

Re: Can I force a query plan to materialise part?

From

David Johnston

Date:

02 October 2012, 02:44:32

On Oct 1, 2012, at 22:33, Toby Corkindale <toby.corkindale@strategicdata.com.au> wrote:

> Hi,
> Is there any way to force the query planner to do a materialisation stage?
>
> I have a query that joins two views, and takes 28 seconds to run.
> However if I create temporary tables that contain the contents of each view, and then join them, the total time is
1.3seconds. 
>
> Is there a way to hint to the query planner that it should be taking that approach?
>
> Thanks,
> Toby
>
>

Have you tried?

With v1 as (), v2 as () select v1 join v2

David J.

Re: Can I force a query plan to materialise part?

From

Ondrej Ivanič

Date:

02 October 2012, 03:03:16

Hi,

On 2 October 2012 12:33, Toby Corkindale
<toby.corkindale@strategicdata.com.au> wrote:
> I have a query that joins two views, and takes 28 seconds to run.
> However if I create temporary tables that contain the contents of each view,
> and then join them, the total time is 1.3 seconds.

try "offset 0" (or you can tweak statistics collector to get better estimates):
select ... from (select * from view offset 0) as v ....

http://blog.endpoint.com/2009/04/offset-0-ftw.html

--
Ondrej Ivanic
(ondrej.ivanic@gmail.com)

Re: Can I force a query plan to materialise part?

From

Toby Corkindale

Date:

02 October 2012, 03:23:07

On 02/10/12 12:44, David Johnston wrote:
> On Oct 1, 2012, at 22:33, Toby Corkindale <toby.corkindale@strategicdata.com.au> wrote:
>
>> Hi,
>> Is there any way to force the query planner to do a materialisation stage?
>>
>> I have a query that joins two views, and takes 28 seconds to run.
>> However if I create temporary tables that contain the contents of each view, and then join them, the total time is
1.3seconds. 
>>
>> Is there a way to hint to the query planner that it should be taking that approach?
>>
>> Thanks,
>> Toby
>
> Have you tried?
>
> With v1 as (), v2 as () select v1 join v2

Hi David,
I just tried that now, and it ran in 350ms; much faster than even the
temporary-table-creating method.

thanks!
Toby

Re: Can I force a query plan to materialise part?

From

Toby Corkindale

Date:

02 October 2012, 03:28:42

On 02/10/12 13:02, Ondrej Ivanič wrote:
> Hi,
>
> On 2 October 2012 12:33, Toby Corkindale
> <toby.corkindale@strategicdata.com.au> wrote:
>> I have a query that joins two views, and takes 28 seconds to run.
>> However if I create temporary tables that contain the contents of each view,
>> and then join them, the total time is 1.3 seconds.
>
> try "offset 0" (or you can tweak statistics collector to get better estimates):
> select ... from (select * from view offset 0) as v ....
>
> http://blog.endpoint.com/2009/04/offset-0-ftw.html

Thanks, that ran in 820ms. (Same query with subselects but without the
OFFSET 0 ran in 28370ms)

David's method is also running in 820ms. I erroneously stated it was
349ms before, but that was actually how long it took to display the
results of EXPLAIN.

I wish I could work out what's wrong with the statistics that cause the
query plan to go awry.. the tables aren't actually very large and I've
played with the statistics setup previously and it seemed right..

many thanks,
Toby

Re: Can I force a query plan to materialise part?

From

Ondrej Ivanič

Date:

02 October 2012, 05:37:44

Hi,

On 2 October 2012 13:28, Toby Corkindale
<toby.corkindale@strategicdata.com.au> wrote:
>>> I have a query that joins two views, and takes 28 seconds to run.
>>> However if I create temporary tables that contain the contents of each
>>> view,
>>> and then join them, the total time is 1.3 seconds.
>>
>>
>> try "offset 0" (or you can tweak statistics collector to get better
>> estimates):
>> select ... from (select * from view offset 0) as v ....
>
> I wish I could work out what's wrong with the statistics that cause the
> query plan to go awry.. the tables aren't actually very large and I've
> played with the statistics setup previously and it seemed right..

Try this (in single session):
explain analyze <your query>
set default_statistics_target = 1000 (or 500 or 250; 1000 might take ages)
analyze table_1; analyze table_2; ..., analyze table_N; (all involved
tables in your query)
explain analyze <your query>

and compare explains outputs. If estimates are very different
(magnitude or two) then you should tweak autovacuum frequency and set
per column statistics (ie. keep  default_statistics_target = 100
(default), and change it on per column basis) but this could be
tedious:

Although per-column tweaking of ANALYZE frequency might not be very
productive, you might find it worthwhile to do per-column adjustment
of the level of detail of the statistics collected by ANALYZE. Columns
that are heavily used in WHERE clauses and have highly irregular data
distributions might require a finer-grain data histogram than other
columns. See ALTER TABLE SET STATISTICS, or change the database-wide
default using the default_statistics_target configuration parameter.
Also, by default there is limited information available about the
selectivity of functions. However, if you create an expression index
that uses a function call, useful statistics will be gathered about
the function, which can greatly improve query plans that use the
expression index.
http://www.postgresql.org/docs/9.1/static/routine-vacuuming.html#VACUUM-FOR-STATISTICS

--
Ondrej Ivanic
(ondrej.ivanic@gmail.com)