Thread: wCTE behaviour

wCTE behaviour

From
Marko Tiikkaja
Date:
Hi all,

The discussion around wCTE during the last week or so has brought to my 
attention that we don't actually have a consensus on how exactly wCTEs 
should behave.  The question seems to be whether or not a statement 
should see the modifications of statements ran before it.  While I think 
making the modifications visible would be a lot more intuitive, it's not 
clear how we'd optimize the execution in the future without changing the 
behaviour (triggers are a big concern).

I've done some digging today and it seems that IBM's DB2 took the more 
intuitive approach: all statements are ran, in the order they're written 
in, to completion before the main statement, materializing the "deltas" 
into a temporary table and the modifications are made visible to the 
next statements.

I have no idea how many complaints they have received about this 
behaviour, but I'd be in favor of matching it.

Thoughts?


Regards,
Marko Tiikkaja


Re: wCTE behaviour

From
David Fetter
Date:
On Thu, Nov 11, 2010 at 04:15:34AM +0200, Marko Tiikkaja wrote:
> Hi all,
> 
> The discussion around wCTE during the last week or so has brought to
> my attention that we don't actually have a consensus on how exactly
> wCTEs should behave.  The question seems to be whether or not a
> statement should see the modifications of statements ran before it.
> While I think making the modifications visible would be a lot more
> intuitive, it's not clear how we'd optimize the execution in the
> future without changing the behaviour (triggers are a big concern).

+1 for letting writeable CTEs see the results of previous CTEs, just
as current non-writeable ones do.  A lot of the useful cases for this
feature depend on this visibility.

Cheers,
David.
-- 
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


Re: wCTE behaviour

From
Marko Tiikkaja
Date:
On 2010-11-11 6:41 PM +0200, David Fetter wrote:
> On Thu, Nov 11, 2010 at 04:15:34AM +0200, Marko Tiikkaja wrote:
>> The discussion around wCTE during the last week or so has brought to
>> my attention that we don't actually have a consensus on how exactly
>> wCTEs should behave.  The question seems to be whether or not a
>> statement should see the modifications of statements ran before it.
>> While I think making the modifications visible would be a lot more
>> intuitive, it's not clear how we'd optimize the execution in the
>> future without changing the behaviour (triggers are a big concern).
>
> +1 for letting writeable CTEs see the results of previous CTEs, just
> as current non-writeable ones do.  A lot of the useful cases for this
> feature depend on this visibility.

Just to be clear, the main point is whether they see the data 
modifications or not.  The simplest case to point out this behaviour is:

WITH t AS (DELETE FROM foo)
SELECT * FROM foo;

And the big question is: what state of "foo" should the SELECT statement 
see?


Regards,
Marko Tiikkaja


Re: wCTE behaviour

From
Thom Brown
Date:
<div class="gmail_quote">On 11 November 2010 16:50, Marko Tiikkaja <span dir="ltr"><<a
href="mailto:marko.tiikkaja@cs.helsinki.fi">marko.tiikkaja@cs.helsinki.fi</a>></span>wrote:<br /><blockquote
class="gmail_quote"style="margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left:
1ex;"><divclass="im">On 2010-11-11 6:41 PM +0200, David Fetter wrote:<br /></div><blockquote class="gmail_quote"
style="margin:0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;"><div class="im"> On
Thu,Nov 11, 2010 at 04:15:34AM +0200, Marko Tiikkaja wrote:<br /></div><div class="im"><blockquote class="gmail_quote"
style="margin:0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;"> The discussion around
wCTEduring the last week or so has brought to<br /> my attention that we don't actually have a consensus on how
exactly<br/> wCTEs should behave.  The question seems to be whether or not a<br /> statement should see the
modificationsof statements ran before it.<br /> While I think making the modifications visible would be a lot more<br
/>intuitive, it's not clear how we'd optimize the execution in the<br /> future without changing the behaviour
(triggersare a big concern).<br /></blockquote><br /> +1 for letting writeable CTEs see the results of previous CTEs,
just<br/> as current non-writeable ones do.  A lot of the useful cases for this<br /> feature depend on this
visibility.<br/></div></blockquote><br /> Just to be clear, the main point is whether they see the data modifications
ornot.  The simplest case to point out this behaviour is:<br /><br /> WITH t AS (DELETE FROM foo)<br /> SELECT * FROM
foo;<br/><br /> And the big question is: what state of "foo" should the SELECT statement see?<br /><br
/></blockquote></div><br/>I would expect that select to return nothing.  And if the user wished to reference what was
deleted,they could use RETURNING anyway. </probable ignorance><br /><br />WITH t AS (UPDATE foo SET col =
true)<br/> SELECT * FROM foo WHERE col = false;<br /><br />... Wouldn't this be more practical to have foo's UPDATEs
appliedprior to SELECT?  Otherwise what would the usecase be?<br clear="all" /><br />-- <br />Thom Brown<br />Twitter:
@darkixion<br/> IRC (freenode): dark_ixion<br />Registered Linux user: #516935<br /> 

Re: wCTE behaviour

From
Tom Lane
Date:
Marko Tiikkaja <marko.tiikkaja@cs.helsinki.fi> writes:
> On 2010-11-11 6:41 PM +0200, David Fetter wrote:
>> On Thu, Nov 11, 2010 at 04:15:34AM +0200, Marko Tiikkaja wrote:
>>> The discussion around wCTE during the last week or so has brought to
>>> my attention that we don't actually have a consensus on how exactly
>>> wCTEs should behave.  The question seems to be whether or not a
>>> statement should see the modifications of statements ran before it.

>> +1 for letting writeable CTEs see the results of previous CTEs, just
>> as current non-writeable ones do.  A lot of the useful cases for this
>> feature depend on this visibility.

> Just to be clear, the main point is whether they see the data 
> modifications or not.  The simplest case to point out this behaviour is:

> WITH t AS (DELETE FROM foo)
> SELECT * FROM foo;

> And the big question is: what state of "foo" should the SELECT statement 
> see?

You've already predetermined the outcome of the argument by phrasing it
that way: if you assume that the CTE runs "before" the main statement
then the conclusion is foregone.  To my mind, they should be thought of
as running in parallel, or at least in an indeterminate order, just
exactly the same way that different data modifications made in a single
INSERT/UPDATE/DELETE command are considered to be made simultaneously.

If someone came to us and complained because his ON UPDATE trigger
couldn't reliably see changes made to other rows by the same UPDATE
command, and could we please make UPDATE more deterministic, we'd
tell him to rethink what he was doing.  This is the same thing.

It is already the case that a user who pushes on things hard enough can
see that a WITH isn't really run "before" the main command.  For
example,

regression=# create sequence s1;
CREATE SEQUENCE
regression=# with tt(x,y) as (select x, nextval('s1') from generate_series(1,10) x)
regression-# select x,y, nextval('s1') as z from tt;x  | y  | z  
----+----+---- 1 |  1 |  2 2 |  3 |  4 3 |  5 |  6 4 |  7 |  8 5 |  9 | 10 6 | 11 | 12 7 | 13 | 14 8 | 15 | 16 9 | 17 |
1810| 19 | 20
 
(10 rows)

If we establish a precedent that WITHs can be thought of as executing
before the main command, we will eventually have to de-optimize existing
WITH behavior.  Or else make up reasons why the inconsistency is okay in
some cases and not others, but that will definitely be a case of
rationalizing after the fact.
        regards, tom lane


Re: wCTE behaviour

From
"David E. Wheeler"
Date:
On Nov 11, 2010, at 9:13 AM, Tom Lane wrote:

> If we establish a precedent that WITHs can be thought of as executing
> before the main command, we will eventually have to de-optimize existing
> WITH behavior.  Or else make up reasons why the inconsistency is okay in
> some cases and not others, but that will definitely be a case of
> rationalizing after the fact.

I can see that, but if one can't see the result of the write, or can't determine whether or not it will be visible in
advance,what's the point of writeable CTEs? 

Best,

David



Re: wCTE behaviour

From
Tom Lane
Date:
"David E. Wheeler" <david@kineticode.com> writes:
> I can see that, but if one can't see the result of the write, or can't determine whether or not it will be visible in
advance,what's the point of writeable CTEs?
 

The writeable CTE returns a RETURNING set, which you can and should use
in the outer query.  The thing that is being argued about here is what
you see if you look "directly" at the target table rather than making
use of RETURNING.  Essentially, I'm arguing that we shouldn't promise
any particular behavior at that level, just as we don't promise that
UPDATE updates different rows in any determinate order.
        regards, tom lane


Re: wCTE behaviour

From
Tom Lane
Date:
Thom Brown <thom@linux.com> writes:
> WITH t AS (UPDATE foo SET col = true)
> SELECT * FROM foo WHERE col = false;

> ... Wouldn't this be more practical to have foo's UPDATEs applied prior to
> SELECT?  Otherwise what would the usecase be?

If that's what you want, you might as well just issue two separate
statements.  There is no use-case for this at all unless the WITH
produces some RETURNING data that the SELECT makes use of.
        regards, tom lane


Re: wCTE behaviour

From
Marko Tiikkaja
Date:
On 11 Nov 2010, at 19:13, Tom Lane <tgl@sss.pgh.pa.us> wrote:

> Marko Tiikkaja <marko.tiikkaja@cs.helsinki.fi> writes:
>> On 2010-11-11 6:41 PM +0200, David Fetter wrote:
>>> On Thu, Nov 11, 2010 at 04:15:34AM +0200, Marko Tiikkaja wrote:
>>>> The discussion around wCTE during the last week or so has brought  
>>>> to
>>>> my attention that we don't actually have a consensus on how exactly
>>>> wCTEs should behave.  The question seems to be whether or not a
>>>> statement should see the modifications of statements ran before it.
>
>>> +1 for letting writeable CTEs see the results of previous CTEs, just
>>> as current non-writeable ones do.  A lot of the useful cases for  
>>> this
>>> feature depend on this visibility.
>
>> Just to be clear, the main point is whether they see the data
>> modifications or not.  The simplest case to point out this  
>> behaviour is:
>
>> WITH t AS (DELETE FROM foo)
>> SELECT * FROM foo;
>
>> And the big question is: what state of "foo" should the SELECT  
>> statement
>> see?
>
> You've already predetermined the outcome of the argument by phrasing  
> it
> that way: if you assume that the CTE runs "before" the main statement
> then the conclusion is foregone.  To my mind, they should be thought  
> of
> as running in parallel, or at least in an indeterminate order, just
> exactly the same way that different data modifications made in a  
> single
> INSERT/UPDATE/DELETE command are considered to be made simultaneously.

> ..

> If we establish a precedent that WITHs can be thought of as executing
> before the main command, we will eventually have to de-optimize  
> existing
> WITH behavior.  Or else make up reasons why the inconsistency is  
> okay in
> some cases and not others, but that will definitely be a case of
> rationalizing after the fact.

I apologize, I had misunderstood what you are suggesting.  But now  
that I do, it seems to be an even worse idea to go your way.  Based on  
my research, I'm almost certain that the SQL standard says that the  
execution order is deterministic if there is at least one DML  
statement in the WITH list.

Can anyone confirm this?


Regards,
Marko Tiikkaja


Re: wCTE behaviour

From
Merlin Moncure
Date:
On Thu, Nov 11, 2010 at 12:13 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> then the conclusion is foregone.  To my mind, they should be thought of
> as running in parallel, or at least in an indeterminate order, just
> exactly the same way that different data modifications made in a single
> INSERT/UPDATE/DELETE command are considered to be made simultaneously.

+1

merlin


Re: wCTE behaviour

From
"David E. Wheeler"
Date:
On Nov 11, 2010, at 9:29 AM, Tom Lane wrote:

>> I can see that, but if one can't see the result of the write, or can't determine whether or not it will be visible
inadvance, what's the point of writeable CTEs? 
>
> The writeable CTE returns a RETURNING set, which you can and should use
> in the outer query.  The thing that is being argued about here is what
> you see if you look "directly" at the target table rather than making
> use of RETURNING.  Essentially, I'm arguing that we shouldn't promise
> any particular behavior at that level, just as we don't promise that
> UPDATE updates different rows in any determinate order.

Yes, if RETURNING guarantees the execution order, then great. That was the first thing I tried to do before I realized
thatthe current CTE implementation doesn't support w. 

David



Re: wCTE behaviour

From
Tom Lane
Date:
"David E. Wheeler" <david@kineticode.com> writes:
> On Nov 11, 2010, at 9:29 AM, Tom Lane wrote:
>> The writeable CTE returns a RETURNING set, which you can and should use
>> in the outer query.  The thing that is being argued about here is what
>> you see if you look "directly" at the target table rather than making
>> use of RETURNING.  Essentially, I'm arguing that we shouldn't promise
>> any particular behavior at that level, just as we don't promise that
>> UPDATE updates different rows in any determinate order.

> Yes, if RETURNING guarantees the execution order, then great. That was the first thing I tried to do before I
realizedthat the current CTE implementation doesn't support w.
 

Well, it doesn't "guarantee the execution order", it's just that that's
the defined conduit for getting information out of the WITH and into the
parent query.  Looking directly at the table is not that conduit.

I misspoke by saying that the behavior would be nondeterministic.
What I think we should do is run all elements of the tree with the
same snapshot, which would provide perfectly deterministic behavior:
if you look at the target table, you see the prior state.  You don't
see the updated state, which is what allows us to possibly optimize
things so that the updates aren't completely made before execution
of the main query starts.
        regards, tom lane


Re: wCTE behaviour

From
David Fetter
Date:
On Thu, Nov 11, 2010 at 12:36:38PM -0500, Merlin Moncure wrote:
> On Thu, Nov 11, 2010 at 12:13 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > then the conclusion is foregone.  To my mind, they should be thought of
> > as running in parallel, or at least in an indeterminate order, just
> > exactly the same way that different data modifications made in a single
> > INSERT/UPDATE/DELETE command are considered to be made simultaneously.
> 
> +1

-1.

When people want to see what has gone before, they can use RETURNING
clauses.  With the "indeterminate order" proposal, they cannot.

Cheers,
David.
-- 
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


Re: wCTE behaviour

From
David Fetter
Date:
On Thu, Nov 11, 2010 at 12:34:55PM -0500, Tom Lane wrote:
> Thom Brown <thom@linux.com> writes:
> > WITH t AS (UPDATE foo SET col = true)
> > SELECT * FROM foo WHERE col = false;
> 
> > ... Wouldn't this be more practical to have foo's UPDATEs applied
> > prior to SELECT?  Otherwise what would the usecase be?
> 
> If that's what you want, you might as well just issue two separate
> statements.  There is no use-case for this at all unless the WITH
> produces some RETURNING data that the SELECT makes use of.

There are lots of use cases where it does exactly this.  One simple
example is maintaining a rollup table, so as less-rolled data get
deleted, they get aggregated into an INSERT into that table.  Think of
RRDtool, only with a real data store.

Cheers,
David.
-- 
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


Re: wCTE behaviour

From
Tom Lane
Date:
David Fetter <david@fetter.org> writes:
> On Thu, Nov 11, 2010 at 12:36:38PM -0500, Merlin Moncure wrote:
>> On Thu, Nov 11, 2010 at 12:13 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> then the conclusion is foregone. �To my mind, they should be thought of
> as running in parallel, or at least in an indeterminate order, just
> exactly the same way that different data modifications made in a single
> INSERT/UPDATE/DELETE command are considered to be made simultaneously.
>> 
>> +1

> -1.

> When people want to see what has gone before, they can use RETURNING
> clauses.  With the "indeterminate order" proposal, they cannot.

Say what?  The RETURNING data is well defined in any case.
        regards, tom lane


Re: wCTE behaviour

From
Merlin Moncure
Date:
On Thu, Nov 11, 2010 at 1:53 PM, David Fetter <david@fetter.org> wrote:
> On Thu, Nov 11, 2010 at 12:36:38PM -0500, Merlin Moncure wrote:
>> On Thu, Nov 11, 2010 at 12:13 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> > then the conclusion is foregone.  To my mind, they should be thought of
>> > as running in parallel, or at least in an indeterminate order, just
>> > exactly the same way that different data modifications made in a single
>> > INSERT/UPDATE/DELETE command are considered to be made simultaneously.
>>
>> +1
>
> -1.
>
> When people want to see what has gone before, they can use RETURNING
> clauses.  With the "indeterminate order" proposal, they cannot.

If you want to see what happened 'before' you *must* use a returning
clause.  It's the link that pipelines data from one query to another.
There is in fact no 'before', just a way to define hook output into
input.  ISTM you have a lot more available routes of CTE optimization
if you go this way.

but, can you present an example of a case that depends on execution
order w/o returning? maybe I'm not seeing something...

merlin


Re: wCTE behaviour

From
Tom Lane
Date:
David Fetter <david@fetter.org> writes:
> On Thu, Nov 11, 2010 at 12:34:55PM -0500, Tom Lane wrote:
>> If that's what you want, you might as well just issue two separate
>> statements.  There is no use-case for this at all unless the WITH
>> produces some RETURNING data that the SELECT makes use of.

> There are lots of use cases where it does exactly this.

Name *one*.  If there is no RETURNING data, there is absolutely no
reason to use WITH instead of issuing the query separately.  In fact,
I would assume that a DML query without RETURNING would not even be
syntactically legal in WITH.

> One simple
> example is maintaining a rollup table, so as less-rolled data get
> deleted, they get aggregated into an INSERT into that table.

Yes, exactly.  The way you would do that is something like
WITH del AS (DELETE FROM foo WHERE whatever RETURNING *)INSERT INTO rollup SELECT * FROM del;

I am very interested to see how you will do the same thing without
using RETURNING and with the behavior you claim to want that the
DELETE is visibly complete before the INSERT starts.  Where's the
INSERT gonna get the already-deleted data from?

With my proposal (ie, both queries using same snapshot) you could
actually do it without RETURNING, like this:
WITH useless_cte AS (DELETE FROM foo WHERE whatever)INSERT INTO rollup SELECT * FROM foo WHERE same-whatever;

But I don't see any reason to think that that's a superior way to write
the query, especially since it might be subject to weird race conditions
against other concurrent modifications of the table.  RETURNING is just
a lot saner way to be sure that you're looking at exactly what the
DELETE deleted.
        regards, tom lane


Re: wCTE behaviour

From
Yeb Havinga
Date:
On 2010-11-11 17:50, Marko Tiikkaja wrote:
> Just to be clear, the main point is whether they see the data 
> modifications or not.  The simplest case to point out this behaviour is:
>
> WITH t AS (DELETE FROM foo)
> SELECT * FROM foo;
>
> And the big question is: what state of "foo" should the SELECT 
> statement see?
Since t is not referenced in the query, foo should not be deleted at 
all, like
WITH t AS (SELECT nextval('seq'))
SELECT * FROM foo
does not update the sequence.

But if t is referenced..
WITH t AS (DELETE FROM foo RETURNING *)
SELECT * FROM foo NATURAL JOIN t;

Since the extension of t can only be known by deleting foo, it makes 
sense that this query cannot return rows. "Select the rows from foo that 
I just deleted."

regards,
Yeb Havinga



Re: wCTE behaviour

From
Tom Lane
Date:
Yeb Havinga <yebhavinga@gmail.com> writes:
> On 2010-11-11 17:50, Marko Tiikkaja wrote:
>> Just to be clear, the main point is whether they see the data 
>> modifications or not.  The simplest case to point out this behaviour is:
>> 
>> WITH t AS (DELETE FROM foo)
>> SELECT * FROM foo;
>> 
>> And the big question is: what state of "foo" should the SELECT 
>> statement see?

> Since t is not referenced in the query, foo should not be deleted at 
> all,

Yeah, that's another interesting question: should we somehow force
unreferenced CTEs to be evaluated anyhow?  Now that I think about it,
there was also some concern about the possibility of the outer query
not reading the CTE all the way to the end, ie
WITH t AS (DELETE FROM foo RETURNING *)SELECT * FROM t LIMIT 1;

How many rows does this delete?  I think we concluded that we should
force the DELETE to be run to conclusion even if the outer query didn't
read it all.  From an implementation standpoint that makes it more
attractive to do the DELETE first and stick its results in a tuplestore
--- but I still think we should view that as an implementation detail,
not as part of the specification.
        regards, tom lane


Re: wCTE behaviour

From
Robert Haas
Date:
On Fri, Nov 12, 2010 at 10:25 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Yeb Havinga <yebhavinga@gmail.com> writes:
>> On 2010-11-11 17:50, Marko Tiikkaja wrote:
>>> Just to be clear, the main point is whether they see the data
>>> modifications or not.  The simplest case to point out this behaviour is:
>>>
>>> WITH t AS (DELETE FROM foo)
>>> SELECT * FROM foo;
>>>
>>> And the big question is: what state of "foo" should the SELECT
>>> statement see?
>
>> Since t is not referenced in the query, foo should not be deleted at
>> all,
>
> Yeah, that's another interesting question: should we somehow force
> unreferenced CTEs to be evaluated anyhow?  Now that I think about it,
> there was also some concern about the possibility of the outer query
> not reading the CTE all the way to the end, ie
>
>        WITH t AS (DELETE FROM foo RETURNING *)
>        SELECT * FROM t LIMIT 1;
>
> How many rows does this delete?  I think we concluded that we should
> force the DELETE to be run to conclusion even if the outer query didn't
> read it all.  From an implementation standpoint that makes it more
> attractive to do the DELETE first and stick its results in a tuplestore
> --- but I still think we should view that as an implementation detail,
> not as part of the specification.

Yeah, I think we have to force any DML statements in CTEs to run to
completion, whether we need the results or not, and even if they are
unreferenced.  Otherwise it's going to be really confusing, I fear.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: wCTE behaviour

From
David Fetter
Date:
On Fri, Nov 12, 2010 at 10:25:51AM -0500, Tom Lane wrote:
> Yeb Havinga <yebhavinga@gmail.com> writes:
> > On 2010-11-11 17:50, Marko Tiikkaja wrote:
> >> Just to be clear, the main point is whether they see the data 
> >> modifications or not.  The simplest case to point out this behaviour is:
> >> 
> >> WITH t AS (DELETE FROM foo)
> >> SELECT * FROM foo;
> >> 
> >> And the big question is: what state of "foo" should the SELECT 
> >> statement see?
> 
> > Since t is not referenced in the query, foo should not be deleted at 
> > all,
> 
> Yeah, that's another interesting question: should we somehow force
> unreferenced CTEs to be evaluated anyhow?

Yes.

> Now that I think about it,
> there was also some concern about the possibility of the outer query
> not reading the CTE all the way to the end, ie
> 
>     WITH t AS (DELETE FROM foo RETURNING *)
>     SELECT * FROM t LIMIT 1;
> 
> How many rows does this delete?  I think we concluded that we should
> force the DELETE to be run to conclusion even if the outer query didn't
> read it all.

Yes.

> From an implementation standpoint that makes it more
> attractive to do the DELETE first and stick its results in a tuplestore
> --- but I still think we should view that as an implementation detail,
> not as part of the specification.

Right :)

Cheers,
David.
-- 
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


Re: wCTE behaviour

From
David Fetter
Date:
On Fri, Nov 12, 2010 at 10:50:52AM -0500, Robert Haas wrote:
> On Fri, Nov 12, 2010 at 10:25 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > Yeb Havinga <yebhavinga@gmail.com> writes:
> >> On 2010-11-11 17:50, Marko Tiikkaja wrote:
> >>> Just to be clear, the main point is whether they see the data
> >>> modifications or not.  The simplest case to point out this behaviour is:
> >>>
> >>> WITH t AS (DELETE FROM foo)
> >>> SELECT * FROM foo;
> >>>
> >>> And the big question is: what state of "foo" should the SELECT
> >>> statement see?
> >
> >> Since t is not referenced in the query, foo should not be deleted at
> >> all,
> >
> > Yeah, that's another interesting question: should we somehow force
> > unreferenced CTEs to be evaluated anyhow?  Now that I think about it,
> > there was also some concern about the possibility of the outer query
> > not reading the CTE all the way to the end, ie
> >
> >        WITH t AS (DELETE FROM foo RETURNING *)
> >        SELECT * FROM t LIMIT 1;
> >
> > How many rows does this delete?  I think we concluded that we should
> > force the DELETE to be run to conclusion even if the outer query didn't
> > read it all.  From an implementation standpoint that makes it more
> > attractive to do the DELETE first and stick its results in a tuplestore
> > --- but I still think we should view that as an implementation detail,
> > not as part of the specification.
> 
> Yeah, I think we have to force any DML statements in CTEs to run to
> completion, whether we need the results or not, and even if they are
> unreferenced.  Otherwise it's going to be really confusing, I fear.

Yes, and as we add more things--COPY is the first but probably not the
last--to CTEs, this "no action-at-a-distance" behavior will become
even more important.

Cheers,
David.
-- 
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


Re: wCTE behaviour

From
Hitoshi Harada
Date:
2010/11/13 Robert Haas <robertmhaas@gmail.com>:
> On Fri, Nov 12, 2010 at 10:25 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Yeb Havinga <yebhavinga@gmail.com> writes:
>>> On 2010-11-11 17:50, Marko Tiikkaja wrote:
>>>> Just to be clear, the main point is whether they see the data
>>>> modifications or not.  The simplest case to point out this behaviour is:
>>>>
>>>> WITH t AS (DELETE FROM foo)
>>>> SELECT * FROM foo;
>>>>
>>>> And the big question is: what state of "foo" should the SELECT
>>>> statement see?
>>
>>> Since t is not referenced in the query, foo should not be deleted at
>>> all,
>>
>> Yeah, that's another interesting question: should we somehow force
>> unreferenced CTEs to be evaluated anyhow?  Now that I think about it,
>> there was also some concern about the possibility of the outer query
>> not reading the CTE all the way to the end, ie
>>
>>        WITH t AS (DELETE FROM foo RETURNING *)
>>        SELECT * FROM t LIMIT 1;
>>
>> How many rows does this delete?  I think we concluded that we should
>> force the DELETE to be run to conclusion even if the outer query didn't
>> read it all.  From an implementation standpoint that makes it more
>> attractive to do the DELETE first and stick its results in a tuplestore
>> --- but I still think we should view that as an implementation detail,
>> not as part of the specification.
>
> Yeah, I think we have to force any DML statements in CTEs to run to
> completion, whether we need the results or not, and even if they are
> unreferenced.  Otherwise it's going to be really confusing, I fear.

One thing that has annoyed me while designing this feature is if as
Tom suggests the all queries are executed in the same snapshot and
optimized as the current read-only CTE does  we are tempted to support
recursive and forward-reference in even DML CTE. It explodes out my
head and I'd like not to think about it if we can.

On the other hand, different-snapshot, serialized execution model
occurs the problem I originally rose in the previous thread, in which
the space to store the data shared among different plans is missing.
It's of course doable, but the easier implementation the better.

I'm inclined to agree with the same snapshot model, that is not only
easier to implement but also fits the current SQL processing design
and the existing CTE specification. Not only from the developer's view
but consistency from user's view. Whatever the standard says on the
DML *subquery*, we're going to create our new *CTE* feature. Yes, this
is CTE. For recursive and forward-reference issue, we can just forbid
them in DML CTE at first.


Regards,

--
Hitoshi Harada


Re: wCTE behaviour

From
Marko Tiikkaja
Date:
Hi all,

It appears that we have a consensus on the behaviour.

I'm going to take some time off this weekend to get a patch with this 
behaviour to the next commitfest.


Regards,
Marko Tiikkaja


Re: wCTE behaviour

From
David Fetter
Date:
On Sat, Nov 13, 2010 at 01:50:46AM +0900, Hitoshi Harada wrote:
> 2010/11/13 Robert Haas <robertmhaas@gmail.com>:
> > On Fri, Nov 12, 2010 at 10:25 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> Yeb Havinga <yebhavinga@gmail.com> writes:
> >>> On 2010-11-11 17:50, Marko Tiikkaja wrote:
> >>>> Just to be clear, the main point is whether they see the data
> >>>> modifications or not.  The simplest case to point out this behaviour is:
> >>>>
> >>>> WITH t AS (DELETE FROM foo)
> >>>> SELECT * FROM foo;
> >>>>
> >>>> And the big question is: what state of "foo" should the SELECT
> >>>> statement see?
> >>
> >>> Since t is not referenced in the query, foo should not be deleted at
> >>> all,
> >>
> >> Yeah, that's another interesting question: should we somehow force
> >> unreferenced CTEs to be evaluated anyhow?  Now that I think about it,
> >> there was also some concern about the possibility of the outer query
> >> not reading the CTE all the way to the end, ie
> >>
> >>        WITH t AS (DELETE FROM foo RETURNING *)
> >>        SELECT * FROM t LIMIT 1;
> >>
> >> How many rows does this delete?  I think we concluded that we should
> >> force the DELETE to be run to conclusion even if the outer query didn't
> >> read it all.  From an implementation standpoint that makes it more
> >> attractive to do the DELETE first and stick its results in a tuplestore
> >> --- but I still think we should view that as an implementation detail,
> >> not as part of the specification.
> >
> > Yeah, I think we have to force any DML statements in CTEs to run to
> > completion, whether we need the results or not, and even if they are
> > unreferenced.  Otherwise it's going to be really confusing, I fear.
> 
> One thing that has annoyed me while designing this feature is if as
> Tom suggests the all queries are executed in the same snapshot and
> optimized as the current read-only CTE does  we are tempted to
> support recursive and forward-reference in even DML CTE.  It
> explodes out my head and I'd like not to think about it if we can.

Does this have about the same head-explodiness as the mutually
recursive CTEs described in the SQL standard?  More?  Less?

> On the other hand, different-snapshot, serialized execution model
> occurs the problem I originally rose in the previous thread, in which
> the space to store the data shared among different plans is missing.
> It's of course doable, but the easier implementation the better.
> 
> I'm inclined to agree with the same snapshot model, that is not only
> easier to implement but also fits the current SQL processing design
> and the existing CTE specification. Not only from the developer's view
> but consistency from user's view. Whatever the standard says on the
> DML *subquery*, we're going to create our new *CTE* feature. Yes, this
> is CTE. For recursive and forward-reference issue, we can just forbid
> them in DML CTE at first.

Sounds good :)

Cheers,
David.
-- 
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


Re: wCTE behaviour

From
Dimitri Fontaine
Date:
Tom Lane <tgl@sss.pgh.pa.us> writes:
>     WITH t AS (DELETE FROM foo RETURNING *)
>     SELECT * FROM t LIMIT 1;
>
> How many rows does this delete?  I think we concluded that we should
> force the DELETE to be run to conclusion even if the outer query didn't
> read it all

The counter-example that jumps to mind is unix pipes. It's read-only at
the consumer level but as soon as you stop reading, the producer stops.
I guess that's only talking about the surprise factor, though.

I'm not sure how far we go with the SIGPIPE analogy, but I wanted to say
that maybe that would not feel so strange to some people if the DELETE
were not run to completion but only until the reader is done.

What about this one:
 WITH d AS (DELETE FROM foo RETURNING id),      q AS (INSERT INTO queue SELECT 'D', id FROM d) SELECT * FROM q ORDER BY
idLIMIT 10;
 

For next example, replace INSERT with a MERGE to remove a previously
existing 'I' or 'U' event in the queue when we add a 'D'. Bonus points
if wCTE allows to implement the query without resorting to MERGE at all,
which would be nice in my mind.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support


Re: wCTE behaviour

From
Tom Lane
Date:
Dimitri Fontaine <dimitri@2ndQuadrant.fr> writes:
> Tom Lane <tgl@sss.pgh.pa.us> writes:
>> WITH t AS (DELETE FROM foo RETURNING *)
>> SELECT * FROM t LIMIT 1;
>> 
>> How many rows does this delete?  I think we concluded that we should
>> force the DELETE to be run to conclusion even if the outer query didn't
>> read it all

> The counter-example that jumps to mind is unix pipes. It's read-only at
> the consumer level but as soon as you stop reading, the producer stops.
> I guess that's only talking about the surprise factor, though.

> I'm not sure how far we go with the SIGPIPE analogy, but I wanted to say
> that maybe that would not feel so strange to some people if the DELETE
> were not run to completion but only until the reader is done.

I can see that there's a fair argument for that position in cases like
the above, but the trouble is that there are also cases where it's very
hard for the user to predict how many rows will be read.  As examples,
mergejoins may stop short of reading all of one input depending on what
the last key value is from the other, and semijoins or antijoins will
stop whenenever they hit a match in the inner input.  I think in the
join cases we had better establish a simple rule "it'll get executed
to completion".  We could maybe do things differently if the outer
query is non-join with a LIMIT, but that seems pretty inconsistent.
        regards, tom lane


Re: wCTE behaviour

From
Dimitri Fontaine
Date:
Tom Lane <tgl@sss.pgh.pa.us> writes:
> I can see that there's a fair argument for that position in cases like
> the above, but the trouble is that there are also cases where it's very
> hard for the user to predict how many rows will be read.  As examples,
> mergejoins may stop short of reading all of one input depending on what
> the last key value is from the other, and semijoins or antijoins will
> stop whenenever they hit a match in the inner input.

Oh. Indeed, I now understand what you mean by surprises. I keep
forgetting that DML and JOINs can live together…

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support


Re: wCTE behaviour

From
Alvaro Herrera
Date:
Excerpts from Dimitri Fontaine's message of vie nov 12 17:13:59 -0300 2010:
> Tom Lane <tgl@sss.pgh.pa.us> writes:
> >     WITH t AS (DELETE FROM foo RETURNING *)
> >     SELECT * FROM t LIMIT 1;
> >
> > How many rows does this delete?  I think we concluded that we should
> > force the DELETE to be run to conclusion even if the outer query didn't
> > read it all
> 
> The counter-example that jumps to mind is unix pipes. It's read-only at
> the consumer level but as soon as you stop reading, the producer stops.
> I guess that's only talking about the surprise factor, though.

It's not that straighforward though, in that the producer could stop a
bit ahead of what the consumer reads, due to there being a buffer in the
middle.  Witness this simple example

$ cat > producer
#!/bin/sh
for i in `seq 1 1000`; do   echo $i >> /tmp/mylog  echo $i
done
$ chmod a+x producer 
$ ./producer | head -5
1
2
3
4
5
$ cat /tmp/mylog 
1
2
3
4
5
6
7

I certainly wouldn't want our implementation to behave like this.

> I'm not sure how far we go with the SIGPIPE analogy, but I wanted to say
> that maybe that would not feel so strange to some people if the DELETE
> were not run to completion but only until the reader is done.
> 
> What about this one:
> 
>   WITH d AS (DELETE FROM foo RETURNING id),
>        q AS (INSERT INTO queue SELECT 'D', id FROM d)
>   SELECT * FROM q ORDER BY id LIMIT 10;

Personally I find this one less surprising:
WITH d AS (DELETE FROM foo LIMIT 10 RETURNING id),     q AS (INSERT INTO queue SELECT 'D', id FROM d)SELECT * FROM q
ORDERBY id;
 

-- 
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: wCTE behaviour

From
Yeb Havinga
Date:
On 2010-11-12 16:51, David Fetter wrote:
> On Fri, Nov 12, 2010 at 10:25:51AM -0500, Tom Lane wrote:
>>
>> Yeah, that's another interesting question: should we somehow force
>> unreferenced CTEs to be evaluated anyhow?
> Yes.
After a night's sleep I'm still thinking no. Arguments:
1) the name "Common Table Expression" suggests that t must be regarded 
as an expression, hence syntactically / proof theoretic and not as a 
table, set of rows / model theoretic. I.e. it is not a "Common Table".
2) The expressions can be referenced zero, one or more times. To me it 
therefore makes the most sense that a DML expressions that is defined 
but not references has no effect. Referenced once: run the plan once. 
Referenced again: run the plan again.

What should the result be of
WITH t AS (INSERT INTO foo SELECT nextval('seq') RETURNING *)
SELECT * FROM t
UNION
SELECT * FROM t;

1 or 1,2 ?

regards,
Yeb Havinga



Re: wCTE behaviour

From
David Fetter
Date:
On Sat, Nov 13, 2010 at 02:28:35PM +0100, Yeb Havinga wrote:
> On 2010-11-12 16:51, David Fetter wrote:
> >On Fri, Nov 12, 2010 at 10:25:51AM -0500, Tom Lane wrote:
> >>
> >>Yeah, that's another interesting question: should we somehow force
> >>unreferenced CTEs to be evaluated anyhow?
> >Yes.
> After a night's sleep I'm still thinking no. Arguments:
> 1) the name "Common Table Expression" suggests that t must be
> regarded as an expression, hence syntactically / proof theoretic and
> not as a table, set of rows / model theoretic. I.e. it is not a
> "Common Table".

Disagree.  A table never referred to in a query still exists.
Similarly, if a normal CTE called a data-changing function but was
nevertheless not referred to, it would still run.

> 2) The expressions can be referenced zero, one or more times. To me
> it therefore makes the most sense that a DML expressions that is
> defined but not references has no effect. Referenced once: run the
> plan once. Referenced again: run the plan again.

No.  When I designed this feature, it was precisely to take advantage
of the "run exactly once" behavior of CTEs.  Under no circumstances
should we break this.

> 
> What should the result be of
> WITH t AS (INSERT INTO foo SELECT nextval('seq') RETURNING *)
> SELECT * FROM t
> UNION
> SELECT * FROM t;
> 
> 1 or 1,2 ?

1.

Cheers,
David.
-- 
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


Re: wCTE behaviour

From
Yeb Havinga
Date:
On 2010-11-13 14:41, David Fetter wrote:
> On Sat, Nov 13, 2010 at 02:28:35PM +0100, Yeb Havinga wrote:
>> 1) the name "Common Table Expression" suggests that t must be
>> regarded as an expression, hence syntactically / proof theoretic and
>> not as a table, set of rows / model theoretic. I.e. it is not a
>> "Common Table".
> Disagree.  A table never referred to in a query still exists.
> Similarly, if a normal CTE called a data-changing function but was
> nevertheless not referred to, it would still run.
with t as (select nextval('seq'))
select 1;

does not update the sequence.

>> 2) The expressions can be referenced zero, one or more times. To me
>> it therefore makes the most sense that a DML expressions that is
>> defined but not references has no effect. Referenced once: run the
>> plan once. Referenced again: run the plan again.
> No.  When I designed this feature, it was precisely to take advantage
> of the "run exactly once" behavior of CTEs.  Under no circumstances
> should we break this.
I found the pgday2009 presentation 
http://wiki.postgresql.org/images/c/c0/PGDay2009-EN-Writeable_CTEs_The_Next_Big_Thing.pdf 
- the IO minimization example is cool, and I now understand that it 
would be artificial if the CTE had to be referenced, for it to be 
executed. Makes sense.

regards,
Yeb Havinga



Re: wCTE behaviour

From
Marko Tiikkaja
Date:
On 13 Nov 2010, at 15:41, David Fetter <david@fetter.org> wrote:

> On Sat, Nov 13, 2010 at 02:28:35PM +0100, Yeb Havinga wrote:
>> On 2010-11-12 16:51, David Fetter wrote:
>>> On Fri, Nov 12, 2010 at 10:25:51AM -0500, Tom Lane wrote:
>>>>
>>>> Yeah, that's another interesting question: should we somehow force
>>>> unreferenced CTEs to be evaluated anyhow?
>>> Yes.
>> After a night's sleep I'm still thinking no. Arguments:
>> 1) the name "Common Table Expression" suggests that t must be
>> regarded as an expression, hence syntactically / proof theoretic and
>> not as a table, set of rows / model theoretic. I.e. it is not a
>> "Common Table".
>
> Disagree.  A table never referred to in a query still exists.
> Similarly, if a normal CTE called a data-changing function but was
> nevertheless not referred to, it would still run.

Actually, it wouldn't.

But if we make the behaviour of wCTEs hard(er) to predict, we are  
going to have a pretty bad feature in our hands.  Let's not repeat our  
mistakes, please.


Regards,
Marko Tiikkaja


Re: wCTE behaviour

From
Tom Lane
Date:
Marko Tiikkaja <marko.tiikkaja@cs.helsinki.fi> writes:
> On 13 Nov 2010, at 15:41, David Fetter <david@fetter.org> wrote:
>> Similarly, if a normal CTE called a data-changing function but was
>> nevertheless not referred to, it would still run.

> Actually, it wouldn't.

Indeed, and that was considered a feature when we did it.  I think
that having wCTEs behave arbitrarily differently on this point
might be a bad idea.
        regards, tom lane


Re: wCTE behaviour

From
Marko Tiikkaja
Date:
On 2010-11-13 5:08 PM +0200, Tom Lane wrote:
> Marko Tiikkaja<marko.tiikkaja@cs.helsinki.fi>  writes:
>> On 13 Nov 2010, at 15:41, David Fetter<david@fetter.org>  wrote:
>>> Similarly, if a normal CTE called a data-changing function but was
>>> nevertheless not referred to, it would still run.
>
>> Actually, it wouldn't.
>
> Indeed, and that was considered a feature when we did it.  I think
> that having wCTEs behave arbitrarily differently on this point
> might be a bad idea.

So these queries would behave differently?

WITH t AS (DELETE FROM foo RETURNING *)
SELECT 1 WHERE false;

WITH t AS (DELETE FROM foo RETURNING *)
SELECT 1 FROM t LIMIT 0;


Regards,
Marko Tiikkaja


Re: wCTE behaviour

From
"Clark C. Evans"
Date:
On Sat, 13 Nov 2010 17:23 +0200, "Marko Tiikkaja" wrote:
> So these queries would behave differently?
> 
> WITH t AS (DELETE FROM foo RETURNING *)
> SELECT 1 WHERE false;
>
> WITH t AS (DELETE FROM foo RETURNING *)
> SELECT 1 FROM t LIMIT 0;

I'm still trying to wrap my head around this
new mechanism.  What would this return?

UPDATE foo SET access = 0;

WITH t AS (UPDATE foo SET access = access + 1 RETURNING *)
SELECT x.access, y.accessFROM t CROSS JOIN t;



Re: wCTE behaviour

From
Marko Tiikkaja
Date:
On 2010-11-13 5:36 PM +0200, Clark C. Evans wrote:
> On Sat, 13 Nov 2010 17:23 +0200, "Marko Tiikkaja" wrote:
>> So these queries would behave differently?
>>
>> WITH t AS (DELETE FROM foo RETURNING *)
>> SELECT 1 WHERE false;
>>
>> WITH t AS (DELETE FROM foo RETURNING *)
>> SELECT 1 FROM t LIMIT 0;
>
> I'm still trying to wrap my head around this
> new mechanism.  What would this return?
>
> UPDATE foo SET access = 0;
>
> WITH t AS (UPDATE foo SET access = access + 1 RETURNING *)
> SELECT x.access, y.access
>   FROM t CROSS JOIN t;

I'm assuming you forgot to give the tables aliases:

WITH t AS (UPDATE foo SET access = access + 1 RETURNING *)
SELECT x.access, y.access   FROM t x CROSS JOIN t y;

This would return n * n rows with values (1,1) where n is the number of 
rows in foo when the snapshot was taken.  I.e. every row in foo would 
now have access=1.  I'm also ignoring the possibility that someone 
modified the table between those two queries.


Regards,
Marko Tiikkaja


Re: wCTE behaviour

From
David Fetter
Date:
On Sat, Nov 13, 2010 at 05:23:34PM +0200, Marko Tiikkaja wrote:
> On 2010-11-13 5:08 PM +0200, Tom Lane wrote:
> >Marko Tiikkaja<marko.tiikkaja@cs.helsinki.fi>  writes:
> >>On 13 Nov 2010, at 15:41, David Fetter<david@fetter.org>  wrote:
> >>>Similarly, if a normal CTE called a data-changing function but
> >>>was nevertheless not referred to, it would still run.
> >
> >>Actually, it wouldn't.
> >
> >Indeed, and that was considered a feature when we did it.  I think
> >that having wCTEs behave arbitrarily differently on this point
> >might be a bad idea.
> 
> So these queries would behave differently?
> 
> WITH t AS (DELETE FROM foo RETURNING *) SELECT 1 WHERE false;
> 
> WITH t AS (DELETE FROM foo RETURNING *) SELECT 1 FROM t LIMIT 0;

No.

Cheers,
David.
-- 
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


Re: wCTE behaviour

From
David Fetter
Date:
On Sat, Nov 13, 2010 at 03:23:42PM +0100, Yeb Havinga wrote:
> On 2010-11-13 14:41, David Fetter wrote:
> >On Sat, Nov 13, 2010 at 02:28:35PM +0100, Yeb Havinga wrote:
> >>1) the name "Common Table Expression" suggests that t must be
> >>regarded as an expression, hence syntactically / proof theoretic and
> >>not as a table, set of rows / model theoretic. I.e. it is not a
> >>"Common Table".
> >Disagree.  A table never referred to in a query still exists.
> >Similarly, if a normal CTE called a data-changing function but was
> >nevertheless not referred to, it would still run.
> with t as (select nextval('seq'))
> select 1;
> 
> does not update the sequence.

I think you've found a bug in the form of an over-aggressive
optimization for the data-changing case.

Cheers,
David.
-- 
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


Re: wCTE behaviour

From
Dimitri Fontaine
Date:
Alvaro Herrera <alvherre@commandprompt.com> writes:
> It's not that straighforward though, in that the producer could stop a
> bit ahead of what the consumer reads, due to there being a buffer in the
> middle.  Witness this simple example

Yeah, another example where the analogy fails for us.
-- 
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support


Re: wCTE behaviour

From
Marko Tiikkaja
Date:
On 2010-11-12 8:25 PM +0200, I wrote:
> I'm going to take some time off this weekend to get a patch with this
> behaviour to the next commitfest.

.. and a wild patch appears.

This is almost exactly the patch from 2010-02 without
CommandCounterIncrement()s.  It's still a bit rough around the edges and
needs some more comments, but I'm posting it here anyway.

This patch passes all regression tests, but feel free to try to break
it, there are probably ways to do that.  This one also has the "always
run DMLs to completion, and exactly once" behaviour.


Regards,
Marko Tiikkaja

Attachment

Re: wCTE behaviour

From
Hitoshi Harada
Date:
2010/11/14 Marko Tiikkaja <marko.tiikkaja@cs.helsinki.fi>:
> On 2010-11-12 8:25 PM +0200, I wrote:
>>
>> I'm going to take some time off this weekend to get a patch with this
>> behaviour to the next commitfest.
>
> .. and a wild patch appears.
>
> This is almost exactly the patch from 2010-02 without
> CommandCounterIncrement()s.  It's still a bit rough around the edges and
> needs some more comments, but I'm posting it here anyway.
>
> This patch passes all regression tests, but feel free to try to break it,
> there are probably ways to do that.  This one also has the "always run DMLs
> to completion, and exactly once" behaviour.
>

Could you update wiki on this feature if you think we've reached the consensus?
http://wiki.postgresql.org/wiki/WriteableCTEs

Also, wrapping up the discussion like pros & cons on the different
execution models helps not only the advance discussions but also
reviews of this patch.

Regards,


--
Hitoshi Harada


Re: wCTE behaviour

From
Marko Tiikkaja
Date:
On 2010-11-14 5:28 PM +0200, Hitoshi Harada wrote:
> 2010/11/14 Marko Tiikkaja<marko.tiikkaja@cs.helsinki.fi>:
>> .. and a wild patch appears.
>
> Could you update wiki on this feature if you think we've reached the consensus?

You're probably referring to
http://archives.postgresql.org/pgsql-hackers/2010-11/msg00660.php
which was unfortunately just me talking too soon.  There still doesn't 
appear to be a consensus on the difference (if any) between these queries:

WITH t AS (DELETE FROM foo RETURNING *)
SELECT 1        LIMIT 0; -- unreferenced CTE

WITH t AS (DELETE FROM foo RETURNING *)
SELECT 1 FROM t LIMIT 0; -- referenced, but not read

WITH t AS (DELETE FROM foo RETURNING *)
SELECT 1 FROM t LIMIT 1; -- referenced, but only partly read

WITH t AS (DELETE FROM foo RETURNING *)
SELECT 1 FROM t t1, t t2; -- referenced, read multiple times

In my opinion, all of these should have the same effect: DELETE all rows 
from "foo".  Any other option means we're going to have trouble 
predicting how a query is going to behave.

As far as I know, we do have a consensus that the order of execution 
should be an implementation detail, and that the statements should 
always be executed in the exact same snapshot (i.e. no CID bump between).

> Also, wrapping up the discussion like pros&  cons on the different
> execution models helps not only the advance discussions but also
> reviews of this patch.

Do you mean between the "execute in order, bump CID" and "execute in 
whatever order but to completion" behaviours?


Regards,
Marko Tiikkaja


Re: wCTE behaviour

From
Marko Tiikkaja
Date:
On 2010-11-14 8:01 PM +0200, I wrote:
> In my opinion, all of these should have the same effect: DELETE all rows
> from "foo".

Since the example wasn't entirely clear on this one: in my opinion the 
DML should also only be executed once.  So:

WITH t AS (INSERT INTO foo VALUES (0) RETURNING *)
SELECT 1 FROM t t1, t t2;

would only insert one row in any case.


Regards,
Marko Tiikkaja


Re: wCTE behaviour

From
Robert Haas
Date:
On Sun, Nov 14, 2010 at 1:01 PM, Marko Tiikkaja
<marko.tiikkaja@cs.helsinki.fi> wrote:
> In my opinion, all of these should have the same effect: DELETE all rows
> from "foo".  Any other option means we're going to have trouble predicting
> how a query is going to behave.

I think it's clear that's the only sensible behavior.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: wCTE behaviour

From
Yeb Havinga
Date:
On 2010-11-14 19:35, Robert Haas wrote:
> On Sun, Nov 14, 2010 at 1:01 PM, Marko Tiikkaja
> <marko.tiikkaja@cs.helsinki.fi>  wrote:
>> In my opinion, all of these should have the same effect: DELETE all rows
>> from "foo".  Any other option means we're going to have trouble predicting
>> how a query is going to behave.
> I think it's clear that's the only sensible behavior.
What if CTE's ever get input parameters?

regards,
Yeb Havinga



Re: wCTE behaviour

From
Marko Tiikkaja
Date:
On 2010-11-14 8:51 PM +0200, Yeb Havinga wrote:
> On 2010-11-14 19:35, Robert Haas wrote:
>> On Sun, Nov 14, 2010 at 1:01 PM, Marko Tiikkaja
>> <marko.tiikkaja@cs.helsinki.fi>   wrote:
>>> In my opinion, all of these should have the same effect: DELETE all rows
>>> from "foo".  Any other option means we're going to have trouble predicting
>>> how a query is going to behave.
>> I think it's clear that's the only sensible behavior.
> What if CTE's ever get input parameters?

What about input parameters?


Regards,
Marko Tiikkaja


Re: wCTE behaviour

From
Robert Haas
Date:
On Sun, Nov 14, 2010 at 1:51 PM, Yeb Havinga <yebhavinga@gmail.com> wrote:
> On 2010-11-14 19:35, Robert Haas wrote:
>>
>> On Sun, Nov 14, 2010 at 1:01 PM, Marko Tiikkaja
>> <marko.tiikkaja@cs.helsinki.fi>  wrote:
>>>
>>> In my opinion, all of these should have the same effect: DELETE all rows
>>> from "foo".  Any other option means we're going to have trouble
>>> predicting
>>> how a query is going to behave.
>>
>> I think it's clear that's the only sensible behavior.
>
> What if CTE's ever get input parameters?

Then they'd be functions, which we already have.  As Tom recently
pointed out, you can even make them temporary with an explicit pg_temp
schema qualification.  Perhaps someday we'll have lambda-expressions,
but I have no reason to believe that they'll use any of the wCTE
syntax.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: wCTE behaviour

From
David Fetter
Date:
On Sun, Nov 14, 2010 at 08:07:22PM +0200, Marko Tiikkaja wrote:
> On 2010-11-14 8:01 PM +0200, I wrote:
> >In my opinion, all of these should have the same effect: DELETE all rows
> >from "foo".
> 
> Since the example wasn't entirely clear on this one: in my opinion
> the DML should also only be executed once.  So:
> 
> WITH t AS (INSERT INTO foo VALUES (0) RETURNING *)
> SELECT 1 FROM t t1, t t2;
> 
> would only insert one row in any case.

Right :)

Cheers,
David.
-- 
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


Re: wCTE behaviour

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> On Sun, Nov 14, 2010 at 1:51 PM, Yeb Havinga <yebhavinga@gmail.com> wrote:
>> What if CTE's ever get input parameters?

> Then they'd be functions, which we already have.

If you mean something like
prepare foo(int) as    with x as (delete from tab where id = $1 returning *)    insert into log_table select * from x;

I don't see that the parameter makes things any less well-defined.

If you mean a parameter in the sense of an executor parameter passed
in from a surrounding nestloop, that'd scare me too --- but I thought
we were going to disallow wCTEs except at the top level of a query,
so the case wouldn't arise.
        regards, tom lane


Re: wCTE behaviour

From
Yeb Havinga
Date:
On 2010-11-14 21:06, Marko Tiikkaja wrote:
> On 2010-11-14 8:51 PM +0200, Yeb Havinga wrote:
>> On 2010-11-14 19:35, Robert Haas wrote:
>>> On Sun, Nov 14, 2010 at 1:01 PM, Marko Tiikkaja
>>> <marko.tiikkaja@cs.helsinki.fi>   wrote:
>>>> In my opinion, all of these should have the same effect: DELETE all 
>>>> rows
>>>> from "foo".  Any other option means we're going to have trouble 
>>>> predicting
>>>> how a query is going to behave.
>>> I think it's clear that's the only sensible behavior.
>> What if CTE's ever get input parameters?
>
> What about input parameters?
With input parameters there is a clear link between a CTE and a caller. 
If a CTE is called more than once, it must be executed more than once, 
e.g. (notation t:x means cte has parameter x)

WITH t:x AS (INSERT INTO foo VALUES(x) RETURNING *)
SELECT (SELECT * FROM t(1)), (SELECT * FROM t(2));
runs the cte two times, hence two new rows in foo.

But what about
WITH t:x AS (INSERT INTO foo VALUES(x) RETURNING *)
SELECT (SELECT t(1)), (SELECT t(1));
it would be strange to expect a single row in foo here, since the only 
thing different from the previous query is a constant value.

Though I like the easyness of "run exactly once" for uncorrelated cte's, 
I still have the feeling that it somehow mixes the expression and 
operational realm. In logic there's a difference between a proposition 
and an assertion. With "run exactly once", stating a proposition is made 
synonymous to asserting it. That makes syntactic operations or rewriting 
of writable CTEs hard, if not impossible. For instance, variable 
substitution in the second example makes a CTE without parameters:
WITH t' AS (INSERT INTO foo VALUES(1) RETURNING *),
t'' AS AS (INSERT INTO foo VALUES(1) RETURNING *),
SELECT (SELECT t'), (SELECT t'');

since t' and t'' are equal,

WITH t' AS (INSERT INTO foo VALUES(1) RETURNING *)
SELECT (SELECT t'), (SELECT t');

A syntactic operation like this on the query should not result in a 
different operation when it's run. Hence two new rows in foo are still 
expected, but the "run exactly once" dictates one new row for that query.

regards,
Yeb Havinga



Re: wCTE behaviour

From
Peter Eisentraut
Date:
On tor, 2010-11-11 at 19:35 +0200, Marko Tiikkaja wrote:
> I apologize, I had misunderstood what you are suggesting.  But now  
> that I do, it seems to be an even worse idea to go your way.  Based on  
> my research, I'm almost certain that the SQL standard says that the  
> execution order is deterministic if there is at least one DML  
> statement in the WITH list.
> 
> Can anyone confirm this?

SQL:2008 doesn't allow any DML in the WITH list.

SQL:2011 has the "combined data store and retrieval" feature that was
discussed in another thread which basically implements the same thing.
They apparently avoid the whole issue by allowing only one data change
delta table per query.




Re: wCTE behaviour

From
David Fetter
Date:
On Sun, Nov 14, 2010 at 11:02:08PM +0100, Yeb Havinga wrote:
> On 2010-11-14 21:06, Marko Tiikkaja wrote:
> >On 2010-11-14 8:51 PM +0200, Yeb Havinga wrote:
> >>On 2010-11-14 19:35, Robert Haas wrote:
> >>>On Sun, Nov 14, 2010 at 1:01 PM, Marko Tiikkaja
> >>><marko.tiikkaja@cs.helsinki.fi>   wrote:
> >>>>In my opinion, all of these should have the same effect:
> >>>>DELETE all rows
> >>>>from "foo".  Any other option means we're going to have
> >>>>trouble predicting
> >>>>how a query is going to behave.
> >>>I think it's clear that's the only sensible behavior.
> >>What if CTE's ever get input parameters?
> >
> >What about input parameters?
> With input parameters there is a clear link between a CTE and a
> caller. If a CTE is called more than once, it must be executed more
> than once, e.g. (notation t:x means cte has parameter x)
> 
> WITH t:x AS (INSERT INTO foo VALUES(x) RETURNING *)
> SELECT (SELECT * FROM t(1)), (SELECT * FROM t(2));
> runs the cte two times, hence two new rows in foo.

I think we can worry about that if we ever have run-time functions
done as WITH, but I think they'd be a *much* better fit for DO.

Cheers,
David.
-- 
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


Re: wCTE behaviour

From
Greg Smith
Date:
Marko Tiikkaja wrote:
> This is almost exactly the patch from 2010-02 without 
> CommandCounterIncrement()s.  It's still a bit rough around the edges 
> and needs some more comments, but I'm posting it here anyway.
>
> This patch passes all regression tests, but feel free to try to break 
> it, there are probably ways to do that.  This one also has the "always 
> run DMLs to completion, and exactly once" behaviour.

So this patch was marked "Ready for Committer", but a) no committer has 
picked it up yet and b) Marko has made changes here that nobody else has 
tested out yet that I've seen on the last.  Accordingly, that 
classification may have been optimistic.  It seems to me that another 
testing run-through from someone like David might be appropriate to 
build some confidence this latest patch should be a commit candidate.  
If there is a committer intending to work on this as-is, they haven't 
identified themselves.

-- 
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services and Support        www.2ndQuadrant.us




Re: wCTE behaviour

From
David Fetter
Date:
On Sun, Dec 05, 2010 at 01:33:39PM -0500, Greg Smith wrote:
> Marko Tiikkaja wrote:
> >This is almost exactly the patch from 2010-02 without
> >CommandCounterIncrement()s.  It's still a bit rough around the
> >edges and needs some more comments, but I'm posting it here
> >anyway.
> >
> >This patch passes all regression tests, but feel free to try to
> >break it, there are probably ways to do that.  This one also has
> >the "always run DMLs to completion, and exactly once" behaviour.
> 
> So this patch was marked "Ready for Committer", but a) no committer
> has picked it up yet and b) Marko has made changes here that nobody
> else has tested out yet that I've seen on the last.  Accordingly,
> that classification may have been optimistic.  It seems to me that
> another testing run-through from someone like David might be
> appropriate to build some confidence this latest patch should be a
> commit candidate.  If there is a committer intending to work on this
> as-is, they haven't identified themselves.

I've tested this one and not managed to break it.  One thing it could
use is support for EXPLAIN ANALYZE.

Cheers,
David.
-- 
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


Re: wCTE behaviour

From
Marko Tiikkaja
Date:
On 2010-12-08 10:19 AM +0200, David Fetter wrote:
> On Sun, Dec 05, 2010 at 01:33:39PM -0500, Greg Smith wrote:
>> So this patch was marked "Ready for Committer", but a) no committer
>> has picked it up yet and b) Marko has made changes here that nobody
>> else has tested out yet that I've seen on the last.  Accordingly,
>> that classification may have been optimistic.  It seems to me that
>> another testing run-through from someone like David might be
>> appropriate to build some confidence this latest patch should be a
>> commit candidate.  If there is a committer intending to work on this
>> as-is, they haven't identified themselves.
>
> I've tested this one and not managed to break it.  One thing it could
> use is support for EXPLAIN ANALYZE.

What's wrong with EXPLAIN ANALYZE?  Here's what I see:

=# explain analyze with t as (insert into foo values(0) returning *) 
select * from t;                                            QUERY PLAN 

-------------------------------------------------------------------------------------------------- CTE Scan on t
(cost=0.01..0.03rows=1 width=4) (actual 
 
time=0.017..0.017 rows=1 loops=2)   CTE t     ->  Insert  (cost=0.00..0.01 rows=1 width=0) (actual 
time=0.029..0.030 rows=1 loops=1)           ->  Result  (cost=0.00..0.01 rows=1 width=0) (actual 
time=0.002..0.002 rows=1 loops=1) Total runtime: 0.104 ms
(5 rows)


Regards,
Marko Tiikkaja


Re: wCTE behaviour

From
David Fetter
Date:
On Wed, Dec 08, 2010 at 01:23:59PM +0200, Marko Tiikkaja wrote:
> On 2010-12-08 10:19 AM +0200, David Fetter wrote:
> >On Sun, Dec 05, 2010 at 01:33:39PM -0500, Greg Smith wrote:
> >>So this patch was marked "Ready for Committer", but a) no committer
> >>has picked it up yet and b) Marko has made changes here that nobody
> >>else has tested out yet that I've seen on the last.  Accordingly,
> >>that classification may have been optimistic.  It seems to me that
> >>another testing run-through from someone like David might be
> >>appropriate to build some confidence this latest patch should be a
> >>commit candidate.  If there is a committer intending to work on this
> >>as-is, they haven't identified themselves.
> >
> >I've tested this one and not managed to break it.  One thing it could
> >use is support for EXPLAIN ANALYZE.
> 
> What's wrong with EXPLAIN ANALYZE?  Here's what I see:

Oops!

I am terribly sorry.  It was an earlier patch I didn't manage to
break.  I've tried all the same things on this one, and no breakage so
far.

Cheers,
David.
-- 
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


Re: wCTE behaviour

From
Peter Eisentraut
Date:
On sön, 2010-11-14 at 04:45 +0200, Marko Tiikkaja wrote:
> On 2010-11-12 8:25 PM +0200, I wrote:
> > I'm going to take some time off this weekend to get a patch with this
> > behaviour to the next commitfest.
> 
> .. and a wild patch appears.
> 
> This is almost exactly the patch from 2010-02 without 
> CommandCounterIncrement()s.  It's still a bit rough around the edges and 
> needs some more comments, but I'm posting it here anyway.

To pick up an earlier thread again, has any serious thought been given
to adapting the SQL2001/DB2 syntax instead of our own?




Re: wCTE behaviour

From
David Fetter
Date:
On Tue, Dec 21, 2010 at 11:14:31PM +0200, Peter Eisentraut wrote:
> On sön, 2010-11-14 at 04:45 +0200, Marko Tiikkaja wrote:
> > On 2010-11-12 8:25 PM +0200, I wrote:
> > > I'm going to take some time off this weekend to get a patch with this
> > > behaviour to the next commitfest.
> > 
> > .. and a wild patch appears.
> > 
> > This is almost exactly the patch from 2010-02 without 
> > CommandCounterIncrement()s.  It's still a bit rough around the edges and 
> > needs some more comments, but I'm posting it here anyway.
> 
> To pick up an earlier thread again, has any serious thought been given
> to adapting the SQL2001/DB2 syntax instead of our own?

Yes, and it's a good deal more limited and less intuitive than ours.

This is one place where we got it right and the standard just got
pushed into doing whatever IBM did.

Cheers,
David.
-- 
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


Re: wCTE behaviour

From
Peter Eisentraut
Date:
On sön, 2010-11-14 at 04:45 +0200, Marko Tiikkaja wrote:
> .. and a wild patch appears.
> 
> This is almost exactly the patch from 2010-02 without 
> CommandCounterIncrement()s.  It's still a bit rough around the edges
> and 
> needs some more comments, but I'm posting it here anyway.

Is this the patch of record?  There are no changes to the documentation
included.



Re: wCTE behaviour

From
Peter Eisentraut
Date:
On tis, 2010-12-21 at 13:20 -0800, David Fetter wrote:
> On Tue, Dec 21, 2010 at 11:14:31PM +0200, Peter Eisentraut wrote:
> > On sön, 2010-11-14 at 04:45 +0200, Marko Tiikkaja wrote:
> > > On 2010-11-12 8:25 PM +0200, I wrote:
> > > > I'm going to take some time off this weekend to get a patch with this
> > > > behaviour to the next commitfest.
> > > 
> > > .. and a wild patch appears.
> > > 
> > > This is almost exactly the patch from 2010-02 without 
> > > CommandCounterIncrement()s.  It's still a bit rough around the edges and 
> > > needs some more comments, but I'm posting it here anyway.
> > 
> > To pick up an earlier thread again, has any serious thought been given
> > to adapting the SQL2001/DB2 syntax instead of our own?
> 
> Yes, and it's a good deal more limited and less intuitive than ours.

Less intuitive, possibly, but how is it more limited?



Re: wCTE behaviour

From
Marko Tiikkaja
Date:
On 2010-12-22 8:24 PM, Peter Eisentraut wrote:
> On sön, 2010-11-14 at 04:45 +0200, Marko Tiikkaja wrote:
>> .. and a wild patch appears.
>>
>> This is almost exactly the patch from 2010-02 without
>> CommandCounterIncrement()s.  It's still a bit rough around the edges
>> and
>> needs some more comments, but I'm posting it here anyway.
>
> Is this the patch of record?  There are no changes to the documentation
> included.

I've kept the documentation as a separate patch, but I haven't touched 
it in a very long time.  I will work on the documentation if there's a 
chance of the patch getting accepted for 9.1.  This arrangement makes 
more sense to me and I'm sure others will agree.


Regards,
Marko Tiikkaja


Re: wCTE behaviour

From
Tom Lane
Date:
Marko Tiikkaja <marko.tiikkaja@cs.helsinki.fi> writes:
> On 2010-12-22 8:24 PM, Peter Eisentraut wrote:
>> Is this the patch of record?  There are no changes to the documentation
>> included.

> I've kept the documentation as a separate patch, but I haven't touched 
> it in a very long time.  I will work on the documentation if there's a 
> chance of the patch getting accepted for 9.1.  This arrangement makes 
> more sense to me and I'm sure others will agree.

Well, it's difficult to review a documentation-free patch.
        regards, tom lane


Re: wCTE behaviour

From
David Fetter
Date:
On Wed, Dec 22, 2010 at 03:54:08PM -0500, Tom Lane wrote:
> Marko Tiikkaja <marko.tiikkaja@cs.helsinki.fi> writes:
> > On 2010-12-22 8:24 PM, Peter Eisentraut wrote:
> >> Is this the patch of record?  There are no changes to the documentation
> >> included.
>
> > I've kept the documentation as a separate patch, but I haven't touched
> > it in a very long time.  I will work on the documentation if there's a
> > chance of the patch getting accepted for 9.1.  This arrangement makes
> > more sense to me and I'm sure others will agree.
>
> Well, it's difficult to review a documentation-free patch.

Here's a document-included version :)

Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Attachment

Re: wCTE behaviour

From
Marko Tiikkaja
Date:
On 2010-12-23 6:55 PM +0200, David Fetter wrote:
> Here's a document-included version :)

And here's the latest version of the patch for the last commit fest for 9.1.

I fixed an issue with the portal logic, and now we use
PORTAL_ONE_RETURNING for wCTE queries, even if the main query is not a
DML or does not have RETURNING.  This also means that we materialize the
results of the main query sometimes unnecessarily, but that doesn't look
like an easy thing to fix.  PORTAL_ONE_RETURNING as a name is also a bit
misleading now, so maybe that needs changing..

Any feedback welcome.


Regards,
Marko Tiikkaja

Attachment

Re: wCTE behaviour

From
Tom Lane
Date:
Marko Tiikkaja <marko.tiikkaja@cs.helsinki.fi> writes:
> I fixed an issue with the portal logic, and now we use 
> PORTAL_ONE_RETURNING for wCTE queries, even if the main query is not a 
> DML or does not have RETURNING.  This also means that we materialize the 
> results of the main query sometimes unnecessarily, but that doesn't look 
> like an easy thing to fix.  PORTAL_ONE_RETURNING as a name is also a bit 
> misleading now, so maybe that needs changing..

Why is it necessary to hack the portal logic at all?  The patch seems to
work for me without that.  (I've fixed quite a few bugs though, so maybe
what this is really doing is masking a problem elsewhere.)

Also, why are we forbidding wCTEs in cursors?  Given the current
definitions, that case seems to work fine too: the wCTEs will be
executed as soon as you fetch something from the cursor.  Are you
just worried about not allowing a case that might be hard to support
later?
        regards, tom lane


Re: wCTE behaviour

From
Marko Tiikkaja
Date:
On 2011-02-25 1:36 AM, Tom Lane wrote:
> Marko Tiikkaja<marko.tiikkaja@cs.helsinki.fi>  writes:
>> I fixed an issue with the portal logic, and now we use
>> PORTAL_ONE_RETURNING for wCTE queries, even if the main query is not a
>> DML or does not have RETURNING.  This also means that we materialize the
>> results of the main query sometimes unnecessarily, but that doesn't look
>> like an easy thing to fix.  PORTAL_ONE_RETURNING as a name is also a bit
>> misleading now, so maybe that needs changing..
>
> Why is it necessary to hack the portal logic at all?  The patch seems to
> work for me without that.  (I've fixed quite a few bugs though, so maybe
> what this is really doing is masking a problem elsewhere.)

Without hacking it broke when PQdescribePrepared was called on a 
prepared query like:

WITH t AS (DELETE FROM foo)
SELECT 1;

Not sure if that's an actual problem, but it seemed like something worht 
fixing.

> Also, why are we forbidding wCTEs in cursors?  Given the current
> definitions, that case seems to work fine too: the wCTEs will be
> executed as soon as you fetch something from the cursor.  Are you
> just worried about not allowing a case that might be hard to support
> later?

Honestly, I have no idea.  It might be a leftover from the previous 
design.  If it looks like it's easy to support, then go for it.


Regards,
Marko Tiikkaja


Re: wCTE behaviour

From
Tom Lane
Date:
Marko Tiikkaja <marko.tiikkaja@cs.helsinki.fi> writes:
> On 2011-02-25 1:36 AM, Tom Lane wrote:
>> Why is it necessary to hack the portal logic at all?  The patch seems to
>> work for me without that.  (I've fixed quite a few bugs though, so maybe
>> what this is really doing is masking a problem elsewhere.)

> Without hacking it broke when PQdescribePrepared was called on a 
> prepared query like:

> WITH t AS (DELETE FROM foo)
> SELECT 1;

> Not sure if that's an actual problem, but it seemed like something worht 
> fixing.

I can't replicate such a problem here --- do you have a concrete test
case?  ISTM the issue would only have been a problem back when you
were trying to generate multiple PlannedStmts from a query like the
above.  The current implementation with everything in one plantree
really ought to look just like a SELECT so far as the portal code
is concerned.

>> Also, why are we forbidding wCTEs in cursors?  Given the current
>> definitions, that case seems to work fine too: the wCTEs will be
>> executed as soon as you fetch something from the cursor.  Are you
>> just worried about not allowing a case that might be hard to support
>> later?

> Honestly, I have no idea.  It might be a leftover from the previous 
> design.  If it looks like it's easy to support, then go for it.

Right now I'm thinking that it is best to continue to forbid it.
If we go over to the less-sequential implementation that I'm advocating
in another thread, the timing of the updates would become a lot less
predictable than I say above.  If we refuse it for now, we can always
remove the restriction later, but the other way is more painful.
        regards, tom lane


Re: wCTE behaviour

From
Marko Tiikkaja
Date:
On 2011-02-25 6:12 PM, Tom Lane wrote:
> Marko Tiikkaja<marko.tiikkaja@cs.helsinki.fi>  writes:
>> Without hacking it broke when PQdescribePrepared was called on a
>> prepared query like:
>
>> WITH t AS (DELETE FROM foo)
>> SELECT 1;
>
>> Not sure if that's an actual problem, but it seemed like something worht
>> fixing.
>
> I can't replicate such a problem here --- do you have a concrete test
> case?  ISTM the issue would only have been a problem back when you
> were trying to generate multiple PlannedStmts from a query like the
> above.

I don't have one right now (I lost the one I had because of a hardware 
failure in a virtual machine), but I can write you one if you want to. 
But see below.

> The current implementation with everything in one plantree
> really ought to look just like a SELECT so far as the portal code
> is concerned.

The problem was that the old code was using PORTAL_MULTI_QUERY whenever 
a wCTE was present.  Are you saying that you are using 
PORTAL_ONE_SELECT?  Doesn't that have problems with triggers, for example?

>>> Also, why are we forbidding wCTEs in cursors?  Given the current
>>> definitions, that case seems to work fine too: the wCTEs will be
>>> executed as soon as you fetch something from the cursor.  Are you
>>> just worried about not allowing a case that might be hard to support
>>> later?
>
>> Honestly, I have no idea.  It might be a leftover from the previous
>> design.  If it looks like it's easy to support, then go for it.
>
> Right now I'm thinking that it is best to continue to forbid it.
> If we go over to the less-sequential implementation that I'm advocating
> in another thread, the timing of the updates would become a lot less
> predictable than I say above.  If we refuse it for now, we can always
> remove the restriction later, but the other way is more painful.

Fair enough.


Regards,
Marko Tiikkaja


Re: wCTE behaviour

From
Tom Lane
Date:
Marko Tiikkaja <marko.tiikkaja@cs.helsinki.fi> writes:
> On 2011-02-25 6:12 PM, Tom Lane wrote:
>> The current implementation with everything in one plantree
>> really ought to look just like a SELECT so far as the portal code
>> is concerned.

> The problem was that the old code was using PORTAL_MULTI_QUERY whenever 
> a wCTE was present.  Are you saying that you are using 
> PORTAL_ONE_SELECT?  Doesn't that have problems with triggers, for example?

Hmmm ... good question.  I notice the lack of any regression test cases
involving triggers.  Will check this.
        regards, tom lane