Thread: Status report on writeable CTEs
Hi,

I've been working on writeable CTEs during the last couple of months, but right now it looks like I'm going to miss the first commit fest for 9.1. I was trying to make it work by expanding all wCTEs to their own Queries during the rewrite stage (a very crude patch trying to do that for regular CTEs is attached), but I don't think that's a good way of approaching the problem. Consider:

WITH t AS (SELECT 1),
     t2 AS (SELECT * FROM t2)
VALUES (true);

The first big problem I hit was that when the query for t2 is planned separately, it doesn't have the CTE information. But even if it had that information (we could easily copy it during the rewrite stage), all RTEs referencing CTEs that were expanded would have the wrong levelsup (this is where the patch fails the regression tests). One could probably come up with ways of fixing even that, but I don't think that's the right direction to be heading.

So what I'm now thinking of is making the planner plan that as a single Query, and making the planner expand it into multiple PlannedStmts if necessary. This would break the existing planner hooks, but I don't think that's a huge problem. Does anyone see any obvious flaws in this? Any feedback is welcome. I'd also be happy to get some help on this project.

Regards,
Marko Tiikkaja
On 7/12/10 9:07 PM +0300, I wrote:
> Consider:
>
> WITH t AS (SELECT 1),
>      t2 AS (SELECT * FROM t2)
> VALUES (true);

That should of course have been SELECT * FROM t, not t2.

Regards,
Marko Tiikkaja
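Spelling out the corrected example, with comments sketching why the naive expansion breaks (an illustration of the problem described above, not actual patch behavior):

```sql
-- The corrected example: t2 reads from t, and the outer query
-- references neither CTE.
WITH t AS (SELECT 1),
     t2 AS (SELECT * FROM t)
VALUES (true);

-- Expanding t2 into a standalone Query during rewrite would yield
--   SELECT * FROM t;
-- planned on its own.  The RTE for t refers to a CTE defined one
-- level up in the original query, so the standalone copy either
-- lacks the CTE information altogether or carries the wrong
-- ctelevelsup -- the failure mode described in the first message.
```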
Marko Tiikkaja <marko.tiikkaja@cs.helsinki.fi> writes:
> ... So what I'm now thinking of is making the planner plan that as a single
> Query, and make the planner expand it into multiple PlannedStmts if
> necessary. This would break the existing planner hooks, but I don't
> think that's a huge problem. Does anyone see any obvious flaws in this?

How will that interact with the existing rewriter? It sounds a lot like a recipe for duplicating functionality ...

regards, tom lane
On 7/12/10 9:34 PM +0300, Tom Lane wrote:
> Marko Tiikkaja<marko.tiikkaja@cs.helsinki.fi> writes:
>> ... So what I'm now thinking of is making the planner plan that as a single
>> Query, and make the planner expand it into multiple PlannedStmts if
>> necessary. This would break the existing planner hooks, but I don't
>> think that's a huge problem. Does anyone see any obvious flaws in this?
>
> How will that interact with the existing rewriter? It sounds a lot
> like a recipe for duplicating functionality ...

I was thinking that the rewriter would look at the top-level CTEs and rewrite all non-SELECT queries into a list of Queries and store that list into the CommonTableExprs. The planner would then use this information and also expand the rewrite products.

Regards,
Marko Tiikkaja
2010/7/13 Marko Tiikkaja <marko.tiikkaja@cs.helsinki.fi>:
> Hi,
>
> I've been working on writeable CTEs during the last couple of months, but
> right now it looks like I'm going to miss the first commit fest for 9.1. I
> was trying to make it work by expanding all wCTEs to their own Queries
> during the rewrite stage (a very crude patch trying to do that for regular
> CTEs is attached), but I don't think that's a good way of approaching the
> problem. Consider:

Sorry this isn't relevant to the topic you posted, but it seems you're going to add a new executor node called DtScanNode and let it hold the tuplestore passed by the Query indicated by an index number. However, I suppose there are other ways:

1. Use MaterialNode instead of adding DtScanNode. Since MaterialNode is an existing node that works with a single tuplestore, it might be sane to modify it so that it accepts a tuplestore from a Query instead of from its child node.

2. Use a temp table instead of a tuplestore list. Since we agreed we need to execute each plan one by one, starting and shutting down the executor, it now looks like a very simple strategy.

I'm not familiar with the long discussion on this feature so I'm not sure they are possible, but ISTM they are worth discussing (or have they been discussed already?).

Regards,
--
Hitoshi Harada
On 7/16/10 6:15 PM +0300, Hitoshi Harada wrote:
> Sorry this isn't relevant to the topic you posted, but ...

Relevant enough :-)

> ... it seems you're going to add a new executor node called DtScanNode
> and let it hold the tuplestore passed by the Query indicated by an index
> number. However, I suppose there are other ways:
>
> 1. Use MaterialNode instead of adding DtScanNode. Since MaterialNode
> is an existing node that works with a single tuplestore, it might be
> sane to modify it so that it accepts a tuplestore from a Query instead
> of from its child node.

I thought about this, but I don't necessarily like the idea of overloading executor nodes.

> 2. Use a temp table instead of a tuplestore list. Since we agreed we
> need to execute each plan one by one, starting and shutting down the
> executor, it now looks like a very simple strategy.

I didn't look at this because I thought using a "tuplestore receiver" in the portal logic was simple enough. Any thoughts on how this would work?

> I'm not familiar with the long discussion on this feature so I'm not
> sure they are possible, but ISTM they are worth discussing (or have
> they been discussed already?).

We haven't discussed this part of the design yet... Now is a good time to do it.

Regards,
Marko Tiikkaja
2010/7/17 Marko Tiikkaja <marko.tiikkaja@cs.helsinki.fi>:
> On 7/16/10 6:15 PM +0300, Hitoshi Harada wrote:
>>
>> 1. Use MaterialNode instead of adding DtScanNode. Since MaterialNode
>> is an existing node that works with a single tuplestore, it might be
>> sane to modify it so that it accepts a tuplestore from a Query instead
>> of from its child node.
>
> I thought about this, but I don't necessarily like the idea of overloading
> executor nodes.

I don't have a good shape for this solution either; maybe it's not a good idea. But my concern is adding DtScanNode, which looks similar to MaterialNode. Of course each one's purpose is different, but quite a big part will overlap, I think.

>> 2. Use a temp table instead of a tuplestore list. Since we agreed we
>> need to execute each plan one by one, starting and shutting down the
>> executor, it now looks like a very simple strategy.
>
> I didn't look at this because I thought using a "tuplestore receiver" in the
> portal logic was simple enough. Any thoughts on how this would work?

It's just deconstructing queries like:

WITH t AS (INSERT INTO x ... RETURNING *)
SELECT * FROM t;

to

CREATE TEMP TABLE t AS INSERT INTO x ... RETURNING *;
SELECT * FROM t;

While the second statement is not implemented yet, it would be much simpler.

Another concern is tuplestores exhausting memory. A tuplestore holds tuples in memory as long as the estimated memory usage is within work_mem (for *each* tuplestore, not the total of all of them!), but if you create dozens of tuplestores (and that's quite possible in wCTE use cases) we could run into memory overflow problems.

>> I'm not familiar with the long discussion on this feature so I'm not
>> sure they are possible, but ISTM they are worth discussing (or have
>> they been discussed already?).
>
> We haven't discussed this part of the design yet... Now is a good time to
> do it.

Yeah, we should. Does anyone have another idea? Or is adding DtScanNode for this feature fair enough?

Regards,
--
Hitoshi Harada
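The memory concern can be illustrated with a sketch (table names are hypothetical, and the syntax assumes the wCTE feature as proposed): each wCTE result would be buffered in its own tuplestore, and each tuplestore may hold up to work_mem in memory before spilling to disk.

```sql
SET work_mem = '4MB';

-- Three hypothetical wCTEs: each RETURNING set would be buffered in
-- its own tuplestore, so this query could hold up to 3 x work_mem in
-- memory before any of the stores spills to disk.
WITH a AS (INSERT INTO t1 SELECT g FROM generate_series(1, 100000) g RETURNING *),
     b AS (INSERT INTO t2 SELECT g FROM generate_series(1, 100000) g RETURNING *),
     c AS (INSERT INTO t3 SELECT g FROM generate_series(1, 100000) g RETURNING *)
SELECT (SELECT count(*) FROM a)
     + (SELECT count(*) FROM b)
     + (SELECT count(*) FROM c);
```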
On 7/16/10 7:15 PM +0300, Hitoshi Harada wrote:
> 2010/7/17 Marko Tiikkaja<marko.tiikkaja@cs.helsinki.fi>:
>> I thought about this, but I don't necessarily like the idea of overloading
>> executor nodes.
>
> I don't have a good shape for this solution either; maybe it's not a
> good idea. But my concern is adding DtScanNode, which looks similar to
> MaterialNode. Of course each one's purpose is different, but quite a
> big part will overlap, I think.

The way I see it is that reading from a tuplestore is so simple that we shouldn't be trying to merge nodes together just on that basis. It seems to me that we'd have to add the CteScan and WorkTableScan nodes there too, and at that point it would become complicated.

>> I didn't look at this because I thought using a "tuplestore receiver" in the
>> portal logic was simple enough. Any thoughts on how this would work?
>
> It's just deconstructing queries like:
>
> WITH t AS (INSERT INTO x ... RETURNING *)
> SELECT * FROM t;
>
> to
>
> CREATE TEMP TABLE t AS INSERT INTO x ... RETURNING *;
> SELECT * FROM t;

That's an idea. Can we somehow avoid name clashes with user-defined temporary tables?

> Another concern is tuplestores exhausting memory. A tuplestore holds
> tuples in memory as long as the estimated memory usage is within
> work_mem (for *each* tuplestore, not the total of all of them!), but if
> you create dozens of tuplestores (and that's quite possible in wCTE use
> cases) we could run into memory overflow problems.

That doesn't seem very different from a big SELECT query, except that with wCTEs you actually *know* how many times work_mem can be used before you run the query, and can adjust work_mem accordingly. That said, I personally could live with a separate GUC for adjusting the work_mem of just the "wCTE tuplestores". Another option would be to unconditionally force the tuplestores to disk, but that sounds painful.

Regards,
Marko Tiikkaja
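The name-clash worry can be sketched as follows (names are hypothetical): the deconstruction would have to create a temp table named after the CTE, and that name may already be taken in the user's session.

```sql
-- The user already has a temp table named t in this session.
CREATE TEMP TABLE t (x int);

-- Deconstructing
--   WITH t AS (INSERT INTO x ... RETURNING *) SELECT * FROM t;
-- would then try to create a second temp table named t:
--
--   CREATE TEMP TABLE t AS INSERT INTO x ... RETURNING *;
--   ERROR:  relation "t" already exists
--
-- so the rewrite would need generated, collision-free names instead
-- of the CTE names themselves.
```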
On Sat, Jul 17, 2010 at 01:15:22AM +0900, Hitoshi Harada wrote:
> 2010/7/17 Marko Tiikkaja <marko.tiikkaja@cs.helsinki.fi>:
> > On 7/16/10 6:15 PM +0300, Hitoshi Harada wrote:
> >>
> >> 1. Use MaterialNode instead of adding DtScanNode. Since MaterialNode
> >> is an existing node that works with a single tuplestore, it might be
> >> sane to modify it so that it accepts a tuplestore from a Query
> >> instead of from its child node.
> >
> > I thought about this, but I don't necessarily like the idea of
> > overloading executor nodes.
>
> I don't have a good shape for this solution either; maybe it's not a
> good idea. But my concern is adding DtScanNode, which looks similar to
> MaterialNode. Of course each one's purpose is different, but quite a
> big part will overlap, I think.

Any ideas as to how to factor this out? The handiest ideas would be in the form of a patch ;)

> >> 2. Use a temp table instead of a tuplestore list. Since we agreed we
> >> need to execute each plan one by one, starting and shutting down the
> >> executor, it now looks like a very simple strategy.
> >
> > I didn't look at this because I thought using a "tuplestore receiver"
> > in the portal logic was simple enough. Any thoughts on how this would
> > work?
>
> It's just deconstructing queries like:
>
> WITH t AS (INSERT INTO x ... RETURNING *)
> SELECT * FROM t;
>
> to
>
> CREATE TEMP TABLE t AS INSERT INTO x ... RETURNING *;
> SELECT * FROM t;
>
> While the second statement is not implemented yet, it would be much
> simpler.

So CTAS would get expanded into CTA[row-returning query]? Interesting. How much work would that part be?

Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
On Fri, Jul 16, 2010 at 12:15 PM, Hitoshi Harada <umi.tanuki@gmail.com> wrote:
> 2010/7/17 Marko Tiikkaja <marko.tiikkaja@cs.helsinki.fi>:
>> On 7/16/10 6:15 PM +0300, Hitoshi Harada wrote:
>>>
>>> 1. Use MaterialNode instead of adding DtScanNode. Since MaterialNode
>>> is an existing node that works with a single tuplestore, it might be
>>> sane to modify it so that it accepts a tuplestore from a Query
>>> instead of from its child node.
>>
>> I thought about this, but I don't necessarily like the idea of
>> overloading executor nodes.
>
> I don't have a good shape for this solution either; maybe it's not a
> good idea. But my concern is adding DtScanNode, which looks similar to
> MaterialNode. Of course each one's purpose is different, but quite a
> big part will overlap, I think.
>
>>> 2. Use a temp table instead of a tuplestore list. Since we agreed we
>>> need to execute each plan one by one, starting and shutting down the
>>> executor, it now looks like a very simple strategy.
>>
>> I didn't look at this because I thought using a "tuplestore receiver"
>> in the portal logic was simple enough. Any thoughts on how this would
>> work?
>
> It's just deconstructing queries like:
>
> WITH t AS (INSERT INTO x ... RETURNING *)
> SELECT * FROM t;
>
> to
>
> CREATE TEMP TABLE t AS INSERT INTO x ... RETURNING *;
> SELECT * FROM t;

Is it acceptable for a wCTE query to manipulate the system catalogs? Couldn't this cause performance issues in some cases?

merlin
On Tue, Jul 20, 2010 at 5:13 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
>>>> 2. Use a temp table instead of a tuplestore list. Since we agreed we
>>>> need to execute each plan one by one, starting and shutting down the
>>>> executor, it now looks like a very simple strategy.
>>>
>>> I didn't look at this because I thought using a "tuplestore receiver"
>>> in the portal logic was simple enough. Any thoughts on how this would
>>> work?
>>
>> It's just deconstructing queries like:
>>
>> WITH t AS (INSERT INTO x ... RETURNING *)
>> SELECT * FROM t;
>>
>> to
>>
>> CREATE TEMP TABLE t AS INSERT INTO x ... RETURNING *;
>> SELECT * FROM t;
>
> Is it acceptable for a wCTE query to manipulate the system catalogs?
> Couldn't this cause performance issues in some cases?

Yeah, I suspect the performance wouldn't be too hot. I think the idea of writing into a tuplestore and then reading back out of it is the way to go.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
2010/7/21 David Fetter <david@fetter.org>:
> On Sat, Jul 17, 2010 at 01:15:22AM +0900, Hitoshi Harada wrote:
>> 2010/7/17 Marko Tiikkaja <marko.tiikkaja@cs.helsinki.fi>:
>> > On 7/16/10 6:15 PM +0300, Hitoshi Harada wrote:
>> >> 1. Use MaterialNode instead of adding DtScanNode. Since MaterialNode
>> >> is an existing node that works with a single tuplestore, it might be
>> >> sane to modify it so that it accepts a tuplestore from a Query
>> >> instead of from its child node.
>> >
>> > I thought about this, but I don't necessarily like the idea of
>> > overloading executor nodes.
>>
>> I don't have a good shape for this solution either; maybe it's not a
>> good idea. But my concern is adding DtScanNode, which looks similar to
>> MaterialNode. Of course each one's purpose is different, but quite a
>> big part will overlap, I think.
>
> Any ideas as to how to factor this out? The handiest ideas would be in
> the form of a patch ;)

Yeah, that would be handiest, but I think I must wait for his first compilable patch before trying to modify it with that idea. The current version looks quite messy and doesn't build.

>> >> 2. Use a temp table instead of a tuplestore list. Since we agreed we
>> >> need to execute each plan one by one, starting and shutting down the
>> >> executor, it now looks like a very simple strategy.
>> >
>> > I didn't look at this because I thought using a "tuplestore receiver"
>> > in the portal logic was simple enough. Any thoughts on how this would
>> > work?
>>
>> It's just deconstructing queries like:
>>
>> WITH t AS (INSERT INTO x ... RETURNING *)
>> SELECT * FROM t;
>>
>> to
>>
>> CREATE TEMP TABLE t AS INSERT INTO x ... RETURNING *;
>> SELECT * FROM t;
>>
>> While the second statement is not implemented yet, it would be much
>> simpler.
>
> So CTAS would get expanded into CTA[row-returning query]? Interesting.
> How much work would that part be?
FWIW, this is getting interesting to me these days, and I think this part can be separated from wCTE.

As people on -hackers say, the first thing to try should be Marko's first design, which uses the list of tuplestores and DtScanNode. So if he has a solid image of how to get there, we can wait for him to complete his first compilable version. Then let's take it back and forth. Is that possible?

And I'm concerned we might not have a concrete consensus about the list of features in document form. Does it accept recursive queries? What if x refers to y, which refers back to x cyclically? An external design sometimes fixes the internal design, and it helps people review the implementation. If I missed something, please point me to the link.

Regards,
--
Hitoshi Harada
On 8/3/2010 7:30 PM, Hitoshi Harada wrote:
> As people on -hackers say, the first thing to try should be Marko's
> first design, which uses the list of tuplestores and DtScanNode. So if
> he has a solid image of how to get there, we can wait for him to
> complete his first compilable version. Then let's take it back and
> forth. Is that possible?

I am currently working on a version that treats all WITH queries like wCTEs. My progress can be seen in my git repo [1], branch "wcte". In its current form, the patch compiles and passes all applicable regression tests, but it's still very messy. I'm going to send a cleaner WIP patch to the list the minute I have one, but anyone's free to look at the git repo (and even work on it if they want!).

> And I'm concerned we might not have a concrete consensus about the list
> of features in document form. Does it accept recursive queries? What if
> x refers to y, which refers back to x cyclically? An external design
> sometimes fixes the internal design, and it helps people review the
> implementation. If I missed something, please point me to the link.

A recursive query should be fine as long as 1) it's SELECT-only and 2) it doesn't loop forever. A wCTE can of course refer to the result of the recursive SELECT query with INSERT .. SELECT, UPDATE .. FROM or DELETE .. USING. Cyclic dependencies are out of the scope of this patch; I'm not planning on adding any new features to regular CTEs.

[1] http://git.postgresql.org/gitweb?p=users/johto/postgres.git;a=summary

Regards,
Marko Tiikkaja
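The supported combination described here can be sketched as follows (table and column names are hypothetical; the syntax assumes the wCTE feature as proposed):

```sql
-- A recursive, SELECT-only CTE whose finite result is then consumed
-- by a data-modifying statement via INSERT .. SELECT.
WITH RECURSIVE subordinates AS (
    SELECT id, manager_id
    FROM employees
    WHERE id = 1
  UNION ALL
    SELECT e.id, e.manager_id
    FROM employees e
    JOIN subordinates s ON e.manager_id = s.id
)
INSERT INTO audit_log (employee_id)
SELECT id FROM subordinates;
```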
2010/7/13 Marko Tiikkaja <marko.tiikkaja@cs.helsinki.fi>:
> On 7/12/10 9:34 PM +0300, Tom Lane wrote:
>>
>> Marko Tiikkaja<marko.tiikkaja@cs.helsinki.fi> writes:
>>>
>>> ... So what I'm now thinking of is making the planner plan that as a
>>> single Query, and making the planner expand it into multiple
>>> PlannedStmts if necessary. This would break the existing planner
>>> hooks, but I don't think that's a huge problem. Does anyone see any
>>> obvious flaws in this?
>>
>> How will that interact with the existing rewriter? It sounds a lot
>> like a recipe for duplicating functionality ...
>
> I was thinking that the rewriter would look at the top-level CTEs and
> rewrite all non-SELECT queries into a list of Queries and store that
> list into the CommonTableExprs. The planner would then use this
> information and also expand the rewrite products.

Why didn't you choose this strategy? ISTM changing the signatures of planner() and standard_planner() unnecessarily makes this feature more difficult to commit. I thought at first that it would be possible and better to split the statement into multiple Querys in the rewriter and pass them to the planner, but as you mentioned they need to collaborate with each other (e.g. for ctelevelsup), so passing a single Query to the planner is probably the answer. However, you could store the sub (i.e., in the WITH clause) PlannedStmts in the top-level PlannedStmt, extract the sub-statements in pg_plan_queries() or anywhere in the PortalXXX() functions, and then add them to the execution list.

Regards,
--
Hitoshi Harada