Thread: top-level DML under CTEs
The patch attached is based on the one rejected at the last CF for 9.0 last year. http://archives.postgresql.org/message-id/16303.1266023203@sss.pgh.pa.us This patch implements the feature that allows top-level DMLs under CTE WITH clause. For example: WITH t AS (SELECT * FROM x) UPDATE y SET val = t.val FROM t WHERE y.key = t.key; This feature is part of writeable CTEs proposed by David Fetter originally. There were two issues at the CF. 1. WITH clause atop INSERT Although the previous discussion got the consensus that we forbid WITH atop INSERT, it seems to me that it can be allowed. I managed to do it by treating the top WITH clause (of INSERT) as if the one of SELECT (or VALUES). It is possible to disallow the CTE over INSERT statement, but the lack for INSERT, though there are for UPDATE and DELETE, sounds inconsistent enough. 2. OLD/NEW in rules Following the subsequent discussion after the post linked above, I add code to throw an appropriate error when OLD/NEW is used in WITH clauses. It is true that OLD/NEW references look sane to general users, but actually (at least in our implementation) they are located in the top-level query's Range Table List. Consequently, they are invisible inside the WITH clause. To allow them, we should rewrite the rule systems overall. Thus, we forbid them in WITH though we should throw an error indicating appropriate message. I'll add the entry to CF app later. Any feedback is welcome. Regards, -- Hitoshi Harada
Attachment
On Mon, Sep 13, 2010 at 9:15 AM, Hitoshi Harada <umi.tanuki@gmail.com> wrote: > The patch attached is based on the one rejected at the last CF for 9.0 > last year. > > http://archives.postgresql.org/message-id/16303.1266023203@sss.pgh.pa.us > > This patch implements the feature that allows top-level DMLs under CTE > WITH clause. For example: > > WITH t AS (SELECT * FROM x) > UPDATE y SET val = t.val FROM t > WHERE y.key = t.key; > > This feature is part of writeable CTEs proposed by David Fetter originally. Thanks for pursuing this. I think this will be a useful feature if we can get it committed, and plus David Fetter will be very, very happy. :-) -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
On Mon, Sep 13, 2010 at 9:20 AM, Robert Haas <robertmhaas@gmail.com> wrote: > On Mon, Sep 13, 2010 at 9:15 AM, Hitoshi Harada <umi.tanuki@gmail.com> wrote: >> The patch attached is based on the one rejected at the last CF for 9.0 >> last year. >> >> http://archives.postgresql.org/message-id/16303.1266023203@sss.pgh.pa.us >> >> This patch implements the feature that allows top-level DMLs under CTE >> WITH clause. For example: >> >> WITH t AS (SELECT * FROM x) >> UPDATE y SET val = t.val FROM t >> WHERE y.key = t.key; >> >> This feature is part of writeable CTEs proposed by David Fetter originally. > > Thanks for pursuing this. I think this will be a useful feature if we > can get it committed, and plus David Fetter will be very, very happy. > :-) Just to be clear, the attached patch is missing the part of the wCTE that allows queries of the form: WITH foo AS (DELETE * FROM bar RETURNING *) <any query using foo> IOW, your CTE query has to be a select. This is still highly useful however. The patch itself looks very clean after a quick glance (which is all I can offer ATM unfortunately). merlin
On Mon, Sep 13, 2010 at 2:43 PM, Merlin Moncure <mmoncure@gmail.com> wrote: > Just to be clear, the attached patch is missing the part of the wCTE > that allows queries of the form: > WITH foo AS (DELETE * FROM bar RETURNING *) <any query using foo> Understood. > IOW, your CTE query has to be a select. This is still highly useful > however. Agreed. > The patch itself looks very clean after a quick glance > (which is all I can offer ATM unfortunately). Yeah, I haven't had a chance to look at it either, just wanted to send props. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
2010/9/14 Merlin Moncure <mmoncure@gmail.com>: > On Mon, Sep 13, 2010 at 9:20 AM, Robert Haas <robertmhaas@gmail.com> wrote: >> On Mon, Sep 13, 2010 at 9:15 AM, Hitoshi Harada <umi.tanuki@gmail.com> wrote: >>> The patch attached is based on the one rejected at the last CF for 9.0 >>> last year. >>> >>> http://archives.postgresql.org/message-id/16303.1266023203@sss.pgh.pa.us >>> >>> This patch implements the feature that allows top-level DMLs under CTE >>> WITH clause. For example: >>> >>> WITH t AS (SELECT * FROM x) >>> UPDATE y SET val = t.val FROM t >>> WHERE y.key = t.key; >>> >>> This feature is part of writeable CTEs proposed by David Fetter originally. >> >> Thanks for pursuing this. I think this will be a useful feature if we >> can get it committed, and plus David Fetter will be very, very happy. >> :-) > > Just to be clear, the attached patch is missing the part of the wCTE Yes, the main part of wCTE that allows DMLs in WITH is still under Marko Tikkaja. To work parallel, we split the tasks into pieces. Regards, -- Hitoshi Harada
On 2010-09-13 4:15 PM +0300, Hitoshi Harada wrote: > 1. WITH clause atop INSERT > Although the previous discussion got the consensus that we forbid WITH > atop INSERT, it seems to me that it can be allowed. I managed to do it > by treating the top WITH clause (of INSERT) as if the one of SELECT > (or VALUES). In the email you referred to, Tom was concerned about the case where these WITH lists have different RECURSIVE declarations. This patch makes both RECURSIVE if either of them is. I can think of cases where that might lead to surprising behaviour, but the chances of any of those happening in real life seem pretty slim. > It is possible to disallow the CTE over INSERT statement, > but the lack for INSERT, though there are for UPDATE and DELETE, > sounds inconsistent enough. Also because wCTEs are not allowed below the top level, not being able to use INSERT as the top level statement would force people to wrap that INSERT in another CTE. Regards, Marko Tiikkaja
2010/9/15 Marko Tiikkaja <marko.tiikkaja@cs.helsinki.fi>: > On 2010-09-13 4:15 PM +0300, Hitoshi Harada wrote: >> >> 1. WITH clause atop INSERT >> Although the previous discussion got the consensus that we forbid WITH >> atop INSERT, it seems to me that it can be allowed. I managed to do it >> by treating the top WITH clause (of INSERT) as if the one of SELECT >> (or VALUES). > > In the email you referred to, Tom was concerned about the case where these > WITH lists have different RECURSIVE declarations. This patch makes both > RECURSIVE if either of them is. I can think of cases where that might lead > to surprising behaviour, but the chances of any of those happening in real > life seem pretty slim. I might not understand the RECURSIVE issue correctly. I put my effort to make such query WITH RECURSIVE r AS (SELECT 1 i UNION ALL SELECT i + 1 FROM r WHERE i < 10) INSERT INTO WITH t AS (SELECT 0) VALUES((SELECT * FROM r LIMIT 1)),((SELECT * FROM t)); look like INSERT INTO WITH RECURSIVE r AS (SELECT 1 i UNION ALL SELECT i + 1 FROM r WHERE i < 10), t AS (SELECT 0) VALUES((SELECT * FROM r LIMIT 1)),((SELECT * FROM t)); Does that cause surprising behavior? > but the chances of any of those happening in real > life seem pretty slim. The OLD/NEW issue is also near impossible to be problem in the real life, except for the misleading error message. But once users see the non-understandable behavior, they make lines to claim as it's a "bug". So we need to put effort to avoid it as possible, I believe. Regards, -- Hitoshi Harada
Hitoshi Harada <umi.tanuki@gmail.com> writes: > 2010/9/15 Marko Tiikkaja <marko.tiikkaja@cs.helsinki.fi>: >> In the email you referred to, Tom was concerned about the case where these >> WITH lists have different RECURSIVE declarations. �This patch makes both >> RECURSIVE if either of them is. �I can think of cases where that might lead >> to surprising behaviour, but the chances of any of those happening in real >> life seem pretty slim. > Does that cause surprising behavior? My recollection is that whether a CTE is marked RECURSIVE or not affects its scope of visibility, so that confusing the two cases can result in flat-out incorrect parser behavior. It would probably be all right to combine the cases internally, at the rewriter or planner stage. It's not okay to do it in the parser, not even after doing parse analysis of the individual CTEs, because then it would be impossible for ruleutils.c to reverse-list the query correctly. regards, tom lane
On 2010-09-14 10:51 PM, Tom Lane wrote: > Hitoshi Harada<umi.tanuki@gmail.com> writes: >> 2010/9/15 Marko Tiikkaja<marko.tiikkaja@cs.helsinki.fi>: >>> In the email you referred to, Tom was concerned about the case where these >>> WITH lists have different RECURSIVE declarations. This patch makes both >>> RECURSIVE if either of them is. I can think of cases where that might lead >>> to surprising behaviour, but the chances of any of those happening in real >>> life seem pretty slim. > >> Does that cause surprising behavior? > > My recollection is that whether a CTE is marked RECURSIVE or not affects > its scope of visibility, so that confusing the two cases can result in > flat-out incorrect parser behavior. The worst I can think of is: CREATE TABLE foo(a int); WITH t AS (SELECT * FROM foo) INSERT INTO bar WITH RECURSIVE foo (SELECT 1 AS a) SELECT * FROM t; t will actually be populated with the results of the CTE, not the table foo. I don't think this is a huge problem in real life, but if someone thinks otherwise, I think we could just error out if the lists have a different RECURSIVE definition. Anyone have a worse example? Thoughts? Regards, Marko Tiikkaja
On Tue, Sep 14, 2010 at 4:28 PM, Marko Tiikkaja <marko.tiikkaja@cs.helsinki.fi> wrote: > On 2010-09-14 10:51 PM, Tom Lane wrote: >> >> Hitoshi Harada<umi.tanuki@gmail.com> writes: >>> >>> 2010/9/15 Marko Tiikkaja<marko.tiikkaja@cs.helsinki.fi>: >>>> >>>> In the email you referred to, Tom was concerned about the case where >>>> these >>>> WITH lists have different RECURSIVE declarations. This patch makes both >>>> RECURSIVE if either of them is. I can think of cases where that might >>>> lead >>>> to surprising behaviour, but the chances of any of those happening in >>>> real >>>> life seem pretty slim. >> >>> Does that cause surprising behavior? >> >> My recollection is that whether a CTE is marked RECURSIVE or not affects >> its scope of visibility, so that confusing the two cases can result in >> flat-out incorrect parser behavior. > > The worst I can think of is: > > CREATE TABLE foo(a int); > > WITH t AS (SELECT * FROM foo) > INSERT INTO bar > WITH RECURSIVE foo (SELECT 1 AS a) > SELECT * FROM t; > > t will actually be populated with the results of the CTE, not the table foo. Unless I'm confused, that seems pretty clearly wrong. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
Marko Tiikkaja <marko.tiikkaja@cs.helsinki.fi> writes: > On 2010-09-14 10:51 PM, Tom Lane wrote: >> My recollection is that whether a CTE is marked RECURSIVE or not affects >> its scope of visibility, so that confusing the two cases can result in >> flat-out incorrect parser behavior. > The worst I can think of is: > CREATE TABLE foo(a int); > WITH t AS (SELECT * FROM foo) > INSERT INTO bar > WITH RECURSIVE foo (SELECT 1 AS a) > SELECT * FROM t; > t will actually be populated with the results of the CTE, not the table foo. > I don't think this is a huge problem in real life, but if someone thinks > otherwise, I think we could just error out if the lists have a different > RECURSIVE definition. Wrong is wrong. Doesn't matter whether it's "a huge problem in real life". Why is it so difficult to do this correctly? regards, tom lane
2010/9/15 Marko Tiikkaja <marko.tiikkaja@cs.helsinki.fi>: > On 2010-09-14 10:51 PM, Tom Lane wrote: >> My recollection is that whether a CTE is marked RECURSIVE or not affects >> its scope of visibility, so that confusing the two cases can result in >> flat-out incorrect parser behavior. > > The worst I can think of is: > > CREATE TABLE foo(a int); > > WITH t AS (SELECT * FROM foo) > INSERT INTO bar > WITH RECURSIVE foo (SELECT 1 AS a) > SELECT * FROM t; > > t will actually be populated with the results of the CTE, not the table foo. Hmmm, that's true. But it seems unrelated to RECURSIVE option, right? Regards, -- Hitoshi Harada
2010/9/15 Tom Lane <tgl@sss.pgh.pa.us>: > Marko Tiikkaja <marko.tiikkaja@cs.helsinki.fi> writes: >> On 2010-09-14 10:51 PM, Tom Lane wrote: >>> My recollection is that whether a CTE is marked RECURSIVE or not affects >>> its scope of visibility, so that confusing the two cases can result in >>> flat-out incorrect parser behavior. > >> The worst I can think of is: > >> CREATE TABLE foo(a int); > >> WITH t AS (SELECT * FROM foo) >> INSERT INTO bar >> WITH RECURSIVE foo (SELECT 1 AS a) >> SELECT * FROM t; > >> t will actually be populated with the results of the CTE, not the table foo. > >> I don't think this is a huge problem in real life, but if someone thinks >> otherwise, I think we could just error out if the lists have a different >> RECURSIVE definition. > > Wrong is wrong. Doesn't matter whether it's "a huge problem in real life". > > Why is it so difficult to do this correctly? Because INSERT INTO ... (SELECT|VALUES) is already a collection of kludge (as comments say). It was possible to parse the two WITHs separately, but it results in ambiguous naming issue; parseWithClause() asserts there's only one WITH clause in the Stmt and detects duplicated CTE name in it. It seems possible to call parseWithClause() twice by cheating ParseState and to try to find name duplication outside it, though it is another kludge :-( Now that we find the worst situation, I start to think I have to take the kludy way anyway. Regards, -- Hitoshi Harada
Hitoshi Harada <umi.tanuki@gmail.com> writes: > 2010/9/15 Tom Lane <tgl@sss.pgh.pa.us>: >> Why is it so difficult to do this correctly? > Because INSERT INTO ... (SELECT|VALUES) is already a collection of > kludge (as comments say). It was possible to parse the two WITHs > separately, but it results in ambiguous naming issue; > parseWithClause() asserts there's only one WITH clause in the Stmt and > detects duplicated CTE name in it. Well, I would think that the no-duplication rule applies to each WITH list separately, not both together. If you do something like with t1 as (select * from foo) select * from (with t2 as (select * from foo) select * from t1, t2) ss; there's no expectation that the WITH clauses can't both define the same name. regards, tom lane
2010/9/15 Tom Lane <tgl@sss.pgh.pa.us>: > Hitoshi Harada <umi.tanuki@gmail.com> writes: >> 2010/9/15 Tom Lane <tgl@sss.pgh.pa.us>: >>> Why is it so difficult to do this correctly? > >> Because INSERT INTO ... (SELECT|VALUES) is already a collection of >> kludge (as comments say). It was possible to parse the two WITHs >> separately, but it results in ambiguous naming issue; >> parseWithClause() asserts there's only one WITH clause in the Stmt and >> detects duplicated CTE name in it. > > Well, I would think that the no-duplication rule applies to each WITH > list separately, not both together. If you do something like > > with t1 as (select * from foo) > select * from > (with t2 as (select * from foo) > select * from t1, t2) ss; > Well, I didn't know it is allowed. That would look like the way to go. Regards, -- Hitoshi Harada
2010/9/15 Hitoshi Harada <umi.tanuki@gmail.com>: > 2010/9/15 Tom Lane <tgl@sss.pgh.pa.us>: >> >> Well, I would think that the no-duplication rule applies to each WITH >> list separately, not both together. If you do something like >> >> with t1 as (select * from foo) >> select * from >> (with t2 as (select * from foo) >> select * from t1, t2) ss; >> > > Well, I didn't know it is allowed. That would look like the way to go. I made changes to the previous version, so that it avoids to resolve CTE name duplication. regression=# with t as (select 1 as i) insert into z with t as(select 2 as i )values ((select * from t)); INSERT 0 1 Time: 1.656 ms regression=# table z; f3 ---- 2 (1 row) Also, the sample Marko gave is OK. > CREATE TABLE foo(a int); > > WITH t AS (SELECT * FROM foo) > INSERT INTO bar > WITH RECURSIVE foo (SELECT 1 AS a) > SELECT * FROM t; > Hope this covers all the cases. Regards, -- Hitoshi Harada
Attachment
On 2010-09-17 4:48 AM, Hitoshi Harada wrote: > 2010/9/15 Hitoshi Harada<umi.tanuki@gmail.com>: >> Well, I didn't know it is allowed. That would look like the way to go. > > I made changes to the previous version, so that it avoids to resolve > CTE name duplication. This patch still doesn't address the issue Tom raised here: http://archives.postgresql.org/pgsql-hackers/2010-09/msg00753.php For WITH .. INSERT .. WITH .. SELECT ..; this patch works OK, but not so much for VALUES: =# CREATE RULE barrule AS ON UPDATE TO bar DO INSTEAD -# WITH RECURSIVE t AS (SELECT -1) -# INSERT INTO bar -# WITH t AS (SELECT 1) -# VALUES((SELECT * FROM t)); CREATE RULE =# \d bar Table "public.bar" Column | Type | Modifiers --------+---------+----------- a | integer | Rules: barrule AS ON UPDATE TO bar DO INSTEAD WITH RECURSIVE t AS ( SELECT 1 ), t AS ( SELECT(-1) ) INSERT INTO bar (a) WITH RECURSIVE t AS ( SELECT 1 ), t AS ( SELECT (-1) ) VALUES (( SELECT t."?column?" FROM t)) Regards, Marko Tiikkaja
2010/9/23 Marko Tiikkaja <marko.tiikkaja@cs.helsinki.fi>: > On 2010-09-17 4:48 AM, Hitoshi Harada wrote: >> >> 2010/9/15 Hitoshi Harada<umi.tanuki@gmail.com>: >>> >>> Well, I didn't know it is allowed. That would look like the way to go. >> >> I made changes to the previous version, so that it avoids to resolve >> CTE name duplication. > > This patch still doesn't address the issue Tom raised here: > http://archives.postgresql.org/pgsql-hackers/2010-09/msg00753.php > > For WITH .. INSERT .. WITH .. SELECT ..; this patch works OK, but not so > much for VALUES: > > =# CREATE RULE barrule AS ON UPDATE TO bar DO INSTEAD > -# WITH RECURSIVE t AS (SELECT -1) > -# INSERT INTO bar > -# WITH t AS (SELECT 1) > -# VALUES((SELECT * FROM t)); > CREATE RULE > > =# \d bar > Table "public.bar" > Column | Type | Modifiers > --------+---------+----------- > a | integer | > Rules: > barrule AS > ON UPDATE TO bar DO INSTEAD WITH RECURSIVE t AS ( > SELECT 1 > ), t AS ( > SELECT (-1) > ) > INSERT INTO bar (a) WITH RECURSIVE t AS ( > SELECT 1 > ), t AS ( > SELECT (-1) > ) > > VALUES (( SELECT t."?column?" > FROM t)) I ran the sql and recognized what is wrong. In VALUES case, the WITH table "t" is copied in one list and shown up in the both of INSERT-level WITH and SELECT-level WITH. Since the transformation of WITH clause to cheat postgres is in the parser stage currently, I wonder if this should be done in the rewriter or the planner stage. Thanks for the report. Next time, please point the clear problem in English aside the sample. Regards, -- Hitoshi Harada
On 2010-09-23 9:12 AM +0300, Hitoshi Harada wrote: > Thanks for the report. Next time, please point the clear problem in > English aside the sample. I apologize. The problem was exactly the one pointed out in the email I referred to, so I assumed that further explanation was not necessary. I will try to be more clear in the future. Regards, Marko Tiikkaja
2010/10/1 Hitoshi Harada <umi.tanuki@gmail.com>: > 2010/9/30 Marko Tiikkaja <marko.tiikkaja@cs.helsinki.fi>: >> On 2010-09-23 9:12 AM +0300, Hitoshi Harada wrote: >>> >>> Since the transformation of >>> WITH clause to cheat postgres is in the parser stage currently, I >>> wonder if this should be done in the rewriter or the planner stage. >> >> It's been about a week now. Should we expect a new patch soon? >> >> > > Yep, I'm working it now. You'll see the conclusion in a day or so. ...and attached is the latest patch. It contains LIMIT etc. bug of INSERT fixes and I confirmed the barrule case correctly in this version. > =# CREATE RULE barrule AS ON UPDATE TO bar DO INSTEAD > -# WITH RECURSIVE t AS (SELECT -1) > -# INSERT INTO bar > -# WITH t AS (SELECT 1) > -# VALUES((SELECT * FROM t)); Regards, -- Hitoshi Harada
Attachment
(Oops, this didn't go to -HACKERS) On 2010-10-04 2:46 PM +0300, Erik Rijkers wrote: > (HEAD from git://git.postgresql.org/git/postgresql.git) > > The patch applies only with error. > If that error is ignored, the regression 'with' test failes. > If that is also ignored, it runs. This patch conflicted with Tom's WITH .. INSERT change. I tweaked the patch just a bit and it now passes all regression tests so I can review it. New version attached for documentation purposes. Regards, Marko Tiikkaja
Attachment
2010/10/5 Marko Tiikkaja <marko.tiikkaja@cs.helsinki.fi>: > (Oops, this didn't go to -HACKERS) > > On 2010-10-04 2:46 PM +0300, Erik Rijkers wrote: >> >> (HEAD from git://git.postgresql.org/git/postgresql.git) >> >> The patch applies only with error. >> If that error is ignored, the regression 'with' test failes. >> If that is also ignored, it runs. > > This patch conflicted with Tom's WITH .. INSERT change. I tweaked the > patch just a bit and it now passes all regression tests so I can review > it. New version attached for documentation purposes. Thank you, I didn't notice that commit. In your last patch, the snippet to add errhint() and ref/insert sgml is unnecessary since it was for INSERT ... VALUES fix. Regards, -- Hitoshi Harada
Attachment
On 2010-10-05 3:37 PM +0300, Hitoshi Harada wrote: > 2010/10/5 Marko Tiikkaja<marko.tiikkaja@cs.helsinki.fi>: >> This patch conflicted with Tom's WITH .. INSERT change. I tweaked the >> patch just a bit and it now passes all regression tests so I can review >> it. New version attached for documentation purposes. > > Thank you, I didn't notice that commit. In your last patch, the > snippet to add errhint() and ref/insert sgml is unnecessary since it > was for INSERT ... VALUES fix. The patch seems to work for all cases I can think of. I'm going to mark this one ready for committer. Regards, Marko Tiikkaja
Hitoshi Harada <umi.tanuki@gmail.com> writes: > 2010/10/5 Marko Tiikkaja <marko.tiikkaja@cs.helsinki.fi>: >> This patch conflicted with Tom's WITH .. INSERT change. �I tweaked the >> patch just a bit and it now passes all regression tests so I can review >> it. �New version attached for documentation purposes. > Thank you, I didn't notice that commit. In your last patch, the > snippet to add errhint() and ref/insert sgml is unnecessary since it > was for INSERT ... VALUES fix. Committed with minor fixes (mostly documentation improvements). regards, tom lane