Thread: CTE optimization fence on the todo list?

CTE optimization fence on the todo list?

From
Daniel Browning
Date:
I would like to have the option of disabling the CTE optimization fence for 
certain CTEs and/or queries. Can that be added to the official todo list? If 
not, why not?

I would find the option beneficial because large, complicated queries are 
often a lot clearer, simpler, and easier to read with CTEs than the 
equivalent query without CTEs. In some cases, the query with CTEs is also 
faster because of the optimization fence. But in other cases, the fence 
makes it a lot slower. In the latter cases, you are left with a choice 
between ugly and slow.

If there was some method to disable the optimization fence for certain CTEs 
or entire queries, then it would be possible to have the best of both 
worlds.

I apologize if this has already been covered before. I could only find two 
earlier discussions on this topic:

http://archives.postgresql.org/pgsql-performance/2011-10/msg00208.php
http://archives.postgresql.org/pgsql-performance/2011-11/msg00015.php

In the latter, I counted four people would are in support of the general 
idea: Robert Haas, Andres Freund, Gavin Flower, Justin Pitts. However, I'm 
sure there are a lot of conflicting ideas on how exactly to go about it, 
such as whether to enable or disable it by default, the specific syntax to 
use, backwards compatibility, future-proofing, etc.

One good reason to reject it would be if it can't be done with SQL standard 
syntax and would require some sort of PG-specific hint or GUC variable for 
the query planner. If so, then I understand that it's opposed for all the 
same reasons that hints are opposed in general.

Another good reason to reject it might be because the only way to disable 
the CTE fence is to disable it by default. If that were the case, then I 
would imagine that it would break backwards compatibility, especially in the 
case of writable CTEs that currently depend on the fence for their current 
functionality. If there is no way to palatably enable it by default but 
allow certain CTEs or certain queries to disable it, then I don't see any 
way around that problem.

A third reason I can imagine is that the only desirable solution (e.g. the 
one without additional non-standard keywords or session GUC variables) is 
effectively impossible. For example, if it requires that the query planner 
determine definitively whether a CTE is read only or not, that may be a 
bridge too far.

A fourth possible reason is that the core team feels that CTEs do not 
improve readability, or that any such readability benefits are not worth the 
effort to support the option. Personally, I feel that the queries which 
could most benefit from the readability of CTEs are precisely the same ones 
that could most often benefit from the performance increase of disabling the 
fence (particularly if it could be done on a per-CTE basis rather than for 
the whole query at once).

Of course the real reason could be something else entirely, hence this post. 
Thanks in advance for your feedback.
--
DB



Re: CTE optimization fence on the todo list?

From
Merlin Moncure
Date:
On Wed, Sep 19, 2012 at 3:05 PM, Daniel Browning <db@kavod.com> wrote:
> Another good reason to reject it might be because the only way to disable
> the CTE fence is to disable it by default. If that were the case, then I
> would imagine that it would break backwards compatibility, especially in the
> case of writable CTEs that currently depend on the fence for their current
> functionality.

Yeah: I constantly rely on CTE fencing and it's a frequently suggested
performance trick on the lists.  LATERAL is coming out soon and this
will remove one of the largest reasons to fence but there are of
course others. Also, a GUC setting is almost certainly the wrong
approach.

I'm wondering if there are any technical/standards constraints that
are behind the fencing behavior.  If there aren't any, maybe an opt-in
keyword might do the trick -- WITH UNBOXED foo AS (..)?

merlin



Re: CTE optimization fence on the todo list?

From
Peter Geoghegan
Date:
On 1 October 2012 14:05, Merlin Moncure <mmoncure@gmail.com> wrote:
> On Wed, Sep 19, 2012 at 3:05 PM, Daniel Browning <db@kavod.com> wrote:
> I'm wondering if there are any technical/standards constraints that
> are behind the fencing behavior.  If there aren't any, maybe an opt-in
> keyword might do the trick -- WITH UNBOXED foo AS (..)?

I may be mistaken, but I think that the fence is described in the SQL standard.

-- 
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services



Re: CTE optimization fence on the todo list?

From
Dimitri Fontaine
Date:
Merlin Moncure <mmoncure@gmail.com> writes:
> I'm wondering if there are any technical/standards constraints that
> are behind the fencing behavior.  If there aren't any, maybe an opt-in

The fencing is per standard, and very useful when used in wCTEs.

> keyword might do the trick -- WITH UNBOXED foo AS (..)?

I would paint your proposal WITH VIEW foo AS (), which would make the
behaviour obvious I think.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support



Re: CTE optimization fence on the todo list?

From
Tom Lane
Date:
Merlin Moncure <mmoncure@gmail.com> writes:
> I'm wondering if there are any technical/standards constraints that
> are behind the fencing behavior.

I think the key reason is that we don't want partial execution of DML
operations (ie, INSERT/UPDATE/DELETE inside a WITH).  The fencing
behavior was put in originally because we foresaw adding DML later.

We could possibly relax the rule for WITH SELECT only, but it would
be rather inconsistent, not to mention unpleasant for all the people
who are relying on the current behavior for one reason or another.

Another issue is that if the CTE is scanned multiple times by the outer
query, you really can't optimize it on the basis of any one call site.
        regards, tom lane



Re: CTE optimization fence on the todo list?

From
Bruce Momjian
Date:
On Mon, Oct  1, 2012 at 10:07:01AM -0400, Tom Lane wrote:
> Merlin Moncure <mmoncure@gmail.com> writes:
> > I'm wondering if there are any technical/standards constraints that
> > are behind the fencing behavior.
> 
> I think the key reason is that we don't want partial execution of DML
> operations (ie, INSERT/UPDATE/DELETE inside a WITH).  The fencing
> behavior was put in originally because we foresaw adding DML later.
> 
> We could possibly relax the rule for WITH SELECT only, but it would
> be rather inconsistent, not to mention unpleasant for all the people
> who are relying on the current behavior for one reason or another.
> 
> Another issue is that if the CTE is scanned multiple times by the outer
> query, you really can't optimize it on the basis of any one call site.

If we wanted to relax the fencing, we might need to do it via an SQL
keyword on the SELECT, to avoid the confusion caused by GUCs.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +



Re: CTE optimization fence on the todo list?

From
Stephen Frost
Date:
* Bruce Momjian (bruce@momjian.us) wrote:
> If we wanted to relax the fencing, we might need to do it via an SQL
> keyword on the SELECT, to avoid the confusion caused by GUCs.

I like the idea of providing a way for users to request non-fencing,
perhaps only allowed for SELECT CTEs.  I don't like the GUC approach.  I
also wonder if it'd make sense and/or be possible to have the fence
applied on a per-CTE basis (inside of the same overall query).  If we
add a keyword for this and it's not hard to do, I think that'd be a
really neat capability.  (No, unlike the OP, I don't have specific use
cases for that offhand, but why limit it to all or nothing for an entire
query..)
Thanks,
    Stephen

Re: CTE optimization fence on the todo list?

From
Chris Rogers
Date:
Has there been any movement on this in the last couple years?

I could really use the ability to optimize across CTE boundaries, and it seems like a lot of other people could too.

Re: CTE optimization fence on the todo list?

From
Robert Haas
Date:
On Thu, Apr 30, 2015 at 12:44 AM, Chris Rogers <teukros@gmail.com> wrote:
> Has there been any movement on this in the last couple years?
>
> I could really use the ability to optimize across CTE boundaries, and it
> seems like a lot of other people could too.

I'm not aware that anyone is working on it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: CTE optimization fence on the todo list?

From
Jim Nasby
Date:
On 4/30/15 6:35 AM, Robert Haas wrote:
> On Thu, Apr 30, 2015 at 12:44 AM, Chris Rogers <teukros@gmail.com> wrote:
>> Has there been any movement on this in the last couple years?
>>
>> I could really use the ability to optimize across CTE boundaries, and it
>> seems like a lot of other people could too.
>
> I'm not aware that anyone is working on it.

ISTR a comment to the effect of the SQL standard effectively requires 
current behavior.

I'd still love to see a way around that though, even if it means some 
kind of additional syntax; WITH is a lot nicer way to factor a query 
than 10 nested subselects...
-- 
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com



Re: CTE optimization fence on the todo list?

From
Robert Haas
Date:
On Fri, May 1, 2015 at 4:53 PM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
> ISTR a comment to the effect of the SQL standard effectively requires
> current behavior.

I'd be astonished.  The SQL standard doesn't even know that there is
such a thing as an index, so I presume it doesn't dictate the behavior
of the query planner either.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: CTE optimization fence on the todo list?

From
Tom Lane
Date:
Jim Nasby <Jim.Nasby@BlueTreble.com> writes:
> On 4/30/15 6:35 AM, Robert Haas wrote:
>> On Thu, Apr 30, 2015 at 12:44 AM, Chris Rogers <teukros@gmail.com> wrote:
>>> I could really use the ability to optimize across CTE boundaries, and it
>>> seems like a lot of other people could too.

>> I'm not aware that anyone is working on it.

> ISTR a comment to the effect of the SQL standard effectively requires 
> current behavior.

I doubt that the spec says anything about it one way or another.
However, there are a lot of cases where we definitely can't push
constraints into a WITH:
* Data-modifying query in the WITH, eg UPDATE RETURNING --- pushing
outer constraints into it would change the set of rows updated.
* Multiply-referenced WITH item (unless the outer query applies
identical constraints to each reference, which seems silly and not
worth the cycles to check for).
* Recursive WITH item (well, maybe in some cases you could push down a
clause and not change the results, but it seems very hard to analyze).

So initially we just punted and didn't consider flattening WITHs at
all.  I'm not sure to what extent people are now expecting that behavior
and would be annoyed if we changed it.
        regards, tom lane



Re: CTE optimization fence on the todo list?

From
Peter Geoghegan
Date:
On Fri, May 1, 2015 at 2:36 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Fri, May 1, 2015 at 4:53 PM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
>> ISTR a comment to the effect of the SQL standard effectively requires
>> current behavior.
>
> I'd be astonished.  The SQL standard doesn't even know that there is
> such a thing as an index, so I presume it doesn't dictate the behavior
> of the query planner either.

I agree. Somehow, the idea that this is within the standard caught on,
but I'm almost certain it's false.


-- 
Peter Geoghegan



Re: CTE optimization fence on the todo list?

From
David Steele
Date:
On 5/1/15 5:39 PM, Tom Lane wrote:
> Jim Nasby <Jim.Nasby@BlueTreble.com> writes:
>> On 4/30/15 6:35 AM, Robert Haas wrote:
>>> On Thu, Apr 30, 2015 at 12:44 AM, Chris Rogers <teukros@gmail.com> wrote:
>>>> I could really use the ability to optimize across CTE boundaries, and it
>>>> seems like a lot of other people could too.
>
>>> I'm not aware that anyone is working on it.
>
>> ISTR a comment to the effect of the SQL standard effectively requires
>> current behavior.
>
> I doubt that the spec says anything about it one way or another.
> However, there are a lot of cases where we definitely can't push
> constraints into a WITH:
> * Data-modifying query in the WITH, eg UPDATE RETURNING --- pushing
> outer constraints into it would change the set of rows updated.
> * Multiply-referenced WITH item (unless the outer query applies
> identical constraints to each reference, which seems silly and not
> worth the cycles to check for).
> * Recursive WITH item (well, maybe in some cases you could push down a
> clause and not change the results, but it seems very hard to analyze).
>
> So initially we just punted and didn't consider flattening WITHs at
> all.  I'm not sure to what extent people are now expecting that behavior
> and would be annoyed if we changed it.

I use CTEs for both organizational purposes and as optimization barriers
(in preference to using temp tables, when possible).

I'd definitely prefer to keep the barriers in place by default, perhaps
with a keyword to allow optimization across boundaries when appropriate.However, when I really need optimization across
boundariesI just use a 
subquery.

It doesn't seem like there's much to be gained in terms of net
functionality.

--
- David Steele
david@pgmasters.net


Re: CTE optimization fence on the todo list?

From
Tom Lane
Date:
David Steele <david@pgmasters.net> writes:
> On 5/1/15 5:39 PM, Tom Lane wrote:
>> I doubt that the spec says anything about it one way or another.
>> However, there are a lot of cases where we definitely can't push
>> constraints into a WITH:
>> * Data-modifying query in the WITH, eg UPDATE RETURNING --- pushing
>> outer constraints into it would change the set of rows updated.
>> * Multiply-referenced WITH item (unless the outer query applies
>> identical constraints to each reference, which seems silly and not
>> worth the cycles to check for).
>> * Recursive WITH item (well, maybe in some cases you could push down a
>> clause and not change the results, but it seems very hard to analyze).
>> 
>> So initially we just punted and didn't consider flattening WITHs at
>> all.  I'm not sure to what extent people are now expecting that behavior
>> and would be annoyed if we changed it.

> I use CTEs for both organizational purposes and as optimization barriers
> (in preference to using temp tables, when possible).

> I'd definitely prefer to keep the barriers in place by default, perhaps
> with a keyword to allow optimization across boundaries when appropriate.
>  However, when I really need optimization across boundaries I just use a
> subquery.

FWIW, a bit of thought suggests that it would not take a lot of code to
handle this: you'd just have to check the conditions mentioned above
and then convert the RTE_CTE item into an RTE_SUBQUERY, much like
inline_set_returning_functions does with RTE_FUNCTION items.

Assuming that that sketch is accurate, it would take more code to provide
a new user-visible knob to enable/disable the behavior than it would to
implement the optimization, which makes me pretty much -1 on providing
such a knob.  We should either do it or not.  If we do, people who want
optimization fences should use the traditional "OFFSET 0" hack.

(A possible compromise position would be to offer a new GUC to
enable/disable the optimization globally; that would add only a reasonably
small amount of control code, and people who were afraid of the change
breaking their apps would probably want a global disable anyway.)
        regards, tom lane



Re: CTE optimization fence on the todo list?

From
Peter Geoghegan
Date:
On Fri, May 1, 2015 at 3:30 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Assuming that that sketch is accurate, it would take more code to provide
> a new user-visible knob to enable/disable the behavior than it would to
> implement the optimization, which makes me pretty much -1 on providing
> such a knob.  We should either do it or not.  If we do, people who want
> optimization fences should use the traditional "OFFSET 0" hack.

+1


-- 
Peter Geoghegan



Re: CTE optimization fence on the todo list?

From
David Steele
Date:
On 5/1/15 6:32 PM, Peter Geoghegan wrote:
> On Fri, May 1, 2015 at 3:30 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Assuming that that sketch is accurate, it would take more code to provide
>> a new user-visible knob to enable/disable the behavior than it would to
>> implement the optimization, which makes me pretty much -1 on providing
>> such a knob.  We should either do it or not.  If we do, people who want
>> optimization fences should use the traditional "OFFSET 0" hack.
>
> +1

Not sure if I'm thrilled with the "OFFSET 0" hack but I guess it's not
much different from the CTE hack I've been using.

An "enable_cte_optimization" GUC would serve to keep old code from
breaking while giving new users/queries the advantage of optimization.

I'm not sure it's worth adding the complexity, though.  In my experience
not that many developers use CTEs.

--
- David Steele
david@pgmasters.net


Re: CTE optimization fence on the todo list?

From
Josh Berkus
Date:
On 05/01/2015 03:30 PM, Tom Lane wrote:
> Assuming that that sketch is accurate, it would take more code to provide
> a new user-visible knob to enable/disable the behavior than it would to
> implement the optimization, which makes me pretty much -1 on providing
> such a knob.  We should either do it or not.  If we do, people who want
> optimization fences should use the traditional "OFFSET 0" hack.

Yes.

> 
> (A possible compromise position would be to offer a new GUC to
> enable/disable the optimization globally; that would add only a reasonably
> small amount of control code, and people who were afraid of the change
> breaking their apps would probably want a global disable anyway.)

We'd need the GUC.  I know of a lot of cases where people are using WITH
clauses specifically to override the query planner, and requiring them
to edit all of their queries in order to enable the old behavior would
become an upgrade barrier.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: CTE optimization fence on the todo list?

From
"David G. Johnston"
Date:
On Fri, May 1, 2015 at 3:30 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Assuming that that sketch is accurate, it would take more code to provide
a new user-visible knob to enable/disable the behavior than it would to
implement the optimization, which makes me pretty much -1 on providing
such a knob.  We should either do it or not.  If we do, people who want
optimization fences should use the traditional "OFFSET 0" hack.

(A possible compromise position would be to offer a new GUC to
enable/disable the optimization globally; that would add only a reasonably
small amount of control code, and people who were afraid of the change
breaking their apps would probably want a global disable anyway.)

​+1 to both.  The default should be to allow the user to choose between CTE and inline subqueries for style reasons alone - as much as possible since you cannot have a correlated CTE nor a recursive subquery.

Trust in the planner, the planner is good.  If it isn't then requiring OFFSET 0 as the only means to create an optimization fence seems reasonable.

I like the GUC as an cheap means to keep the status-quo for those who desire it.

While the idea of overriding the status-quo on a per-query basis has some appeal the apparent cost-benefit ratio doesn't seem convincing.

David J.​

Re: CTE optimization fence on the todo list?

From
Andrew Dunstan
Date:
On 05/01/2015 07:24 PM, Josh Berkus wrote:
> O
>> (A possible compromise position would be to offer a new GUC to
>> enable/disable the optimization globally; that would add only a reasonably
>> small amount of control code, and people who were afraid of the change
>> breaking their apps would probably want a global disable anyway.)
> We'd need the GUC.  I know of a lot of cases where people are using WITH
> clauses specifically to override the query planner, and requiring them
> to edit all of their queries in order to enable the old behavior would
> become an upgrade barrier.
>

+100

This could be a very bad, almost impossible to catch, behaviour break. 
Even if we add the GUC, we're probably going to be imposing very 
significant code audit costs on some users.

cheers

andrew





Re: CTE optimization fence on the todo list?

From
Tom Lane
Date:
Andrew Dunstan <andrew@dunslane.net> writes:
> On 05/01/2015 07:24 PM, Josh Berkus wrote:
>>> (A possible compromise position would be to offer a new GUC to
>>> enable/disable the optimization globally; that would add only a reasonably
>>> small amount of control code, and people who were afraid of the change
>>> breaking their apps would probably want a global disable anyway.)

> This could be a very bad, almost impossible to catch, behaviour break. 
> Even if we add the GUC, we're probably going to be imposing very 
> significant code audit costs on some users.

On what grounds do you claim it'd be a behavior break?  It's possible
that the subquery flattening would result in less-desirable plans not
more-desirable ones, but the results should still be correct.
        regards, tom lane



Re: CTE optimization fence on the todo list?

From
Andrew Dunstan
Date:
On 05/03/2015 11:49 AM, Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>> On 05/01/2015 07:24 PM, Josh Berkus wrote:
>>>> (A possible compromise position would be to offer a new GUC to
>>>> enable/disable the optimization globally; that would add only a reasonably
>>>> small amount of control code, and people who were afraid of the change
>>>> breaking their apps would probably want a global disable anyway.)
>> This could be a very bad, almost impossible to catch, behaviour break.
>> Even if we add the GUC, we're probably going to be imposing very
>> significant code audit costs on some users.
> On what grounds do you claim it'd be a behavior break?  It's possible
> that the subquery flattening would result in less-desirable plans not
> more-desirable ones, but the results should still be correct.
>
>             

I meant w.r.t. performance. Sorry if that wasn't clear.

cheers

andrew




Re: CTE optimization fence on the todo list?

From
Jim Nasby
Date:
On 5/3/15 11:59 AM, Andrew Dunstan wrote:
>
> On 05/03/2015 11:49 AM, Tom Lane wrote:
>> Andrew Dunstan <andrew@dunslane.net> writes:
>>> On 05/01/2015 07:24 PM, Josh Berkus wrote:
>>>>> (A possible compromise position would be to offer a new GUC to
>>>>> enable/disable the optimization globally; that would add only a
>>>>> reasonably
>>>>> small amount of control code, and people who were afraid of the change
>>>>> breaking their apps would probably want a global disable anyway.)
>>> This could be a very bad, almost impossible to catch, behaviour break.
>>> Even if we add the GUC, we're probably going to be imposing very
>>> significant code audit costs on some users.
>> On what grounds do you claim it'd be a behavior break?  It's possible
>> that the subquery flattening would result in less-desirable plans not
>> more-desirable ones, but the results should still be correct.
>
> I meant w.r.t. performance. Sorry if that wasn't clear.

To put this in perspective... I've seen things like this take query 
runtime from minutes to multiple hours or worse; bad enough that 
"behavior break" becomes a valid description.

We definitely need to highlight this in the release notes, and I think 
the GUC would be mandatory.
-- 
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com



Re: CTE optimization fence on the todo list?

From
Chris Rogers
Date:
<div dir="ltr">I need this feature a lot.  Can anyone point me to a place in the code where I can hack together a
quick-and-dirty,compatibility-breaking implementation?  Thanks!<br /></div><div class="gmail_extra"><br /><div
class="gmail_quote">OnSun, May 3, 2015 at 10:03 PM, Jim Nasby <span dir="ltr"><<a
href="mailto:Jim.Nasby@bluetreble.com"target="_blank">Jim.Nasby@bluetreble.com</a>></span> wrote:<br /><blockquote
class="gmail_quote"style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">On 5/3/15 11:59
AM,Andrew Dunstan wrote:<br /><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc
solid;padding-left:1ex"><br/> On 05/03/2015 11:49 AM, Tom Lane wrote:<br /><blockquote class="gmail_quote"
style="margin:00 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"> Andrew Dunstan <<a
href="mailto:andrew@dunslane.net"target="_blank">andrew@dunslane.net</a>> writes:<br /><blockquote
class="gmail_quote"style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"> On 05/01/2015 07:24 PM, Josh
Berkuswrote:<br /><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc
solid;padding-left:1ex"><blockquoteclass="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc
solid;padding-left:1ex">(A possible compromise position would be to offer a new GUC to<br /> enable/disable the
optimizationglobally; that would add only a<br /> reasonably<br /> small amount of control code, and people who were
afraidof the change<br /> breaking their apps would probably want a global disable anyway.)<br
/></blockquote></blockquote>This could be a very bad, almost impossible to catch, behaviour break.<br /> Even if we add
theGUC, we're probably going to be imposing very<br /> significant code audit costs on some users.<br /></blockquote>
Onwhat grounds do you claim it'd be a behavior break?  It's possible<br /> that the subquery flattening would result in
less-desirableplans not<br /> more-desirable ones, but the results should still be correct.<br /></blockquote><br /> I
meantw.r.t. performance. Sorry if that wasn't clear.<br /></blockquote><br /></span> To put this in perspective... I've
seenthings like this take query runtime from minutes to multiple hours or worse; bad enough that "behavior break"
becomesa valid description.<br /><br /> We definitely need to highlight this in the release notes, and I think the GUC
wouldbe mandatory.<span class="im HOEnZb"><br /> -- <br /> Jim Nasby, Data Architect, Blue Treble Consulting<br /> Data
inTrouble? Get it in Treble! <a href="http://BlueTreble.com" target="_blank">http://BlueTreble.com</a><br /><br /><br
/></span><divclass="HOEnZb"><div class="h5"> -- <br /> Sent via pgsql-hackers mailing list (<a
href="mailto:pgsql-hackers@postgresql.org"target="_blank">pgsql-hackers@postgresql.org</a>)<br /> To make changes to
yoursubscription:<br /><a href="http://www.postgresql.org/mailpref/pgsql-hackers"
target="_blank">http://www.postgresql.org/mailpref/pgsql-hackers</a><br/></div></div></blockquote></div><br /></div> 

Re: CTE optimization fence on the todo list?

From
Robert Haas
Date:
On Wed, May 20, 2015 at 12:58 AM, Chris Rogers <teukros@gmail.com> wrote:
> I need this feature a lot.  Can anyone point me to a place in the code where
> I can hack together a quick-and-dirty, compatibility-breaking
> implementation?  Thanks!

Does this help?

http://www.postgresql.org/message-id/38448.1430519406@sss.pgh.pa.us

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: CTE optimization fence on the todo list?

From
Qingqing Zhou
Date:
On Fri, May 1, 2015 at 2:39 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> * Multiply-referenced WITH item (unless the outer query applies
> identical constraints to each reference, which seems silly and not
> worth the cycles to check for).
>
Not sure if I understand this correctly. Look at this query, CTE q is
referenced twice and it is obviously expand it helps:

postgres=# explain with q as (select * from a) select * from q q1 join
q q2 on q1.i=q2.i where q1.i <= 10 and q2.i >=2;                             QUERY PLAN
----------------------------------------------------------------------Nested Loop  (cost=1443.59..1526.35 rows=9
width=16) CTE q    ->  Seq Scan on a a_2  (cost=0.00..1443.00 rows=100000 width=8)  ->  Index Scan using ai on a
(cost=0.29..8.45rows=9 width=8)        Index Cond: (i <= 10)  ->  Index Scan using ai on a a_1  (cost=0.29..8.31 rows=1
width=8)       Index Cond: ((i = a.i) AND (i >= 2))
 
(7 rows)

Another question is that CTEs might be used as an optimization fence.
Think about a query like this:
 WITH q1 as /* 5 table joins */, q2 as /* 5 tables join */, q3 ...,
q4 SELECT ...

If we expand all CTEs, we may end up optimize join with many tables
(could be bad). Or it is possible that users intentionally arrange
join in that way (more or less like hints) to override the optimizer.

We could look at geqo_threshold and decide how shall we expand, but
this may not be better than a GUC variable.

Regards,
Qingqing