Thread: COPY (INSERT/UPDATE/DELETE .. RETURNING ..)
Hi, Attached is a patch for being able to do $SUBJECT without a CTE. The reasons this is better than a CTE version are: 1) It's not obvious why a CTE version works but a plain one doesn't 2) This one has less overhead (I measured a ~12% improvement on a not-too-unreasonable test case) With regard to RULEs, similar restrictions apply as the ones on data-modifying statements in WITH. I can't add this to November's commit fest, but I'm posting this anyway in case someone is thinking about implementing this feature. .m
Attachment
On Sun, Nov 1, 2015 at 2:49 AM, Marko Tiikkaja wrote: > I can't add this to November's commit fest, but I'm posting this anyway in > case someone is thinking about implementing this feature. Note: this one has been added to CF: https://commitfest.postgresql.org/7/414/ -- Michael
On Sun, Nov 1, 2015 at 2:49 AM, Marko Tiikkaja <marko@joh.to> wrote:
> Attached is a patch for being able to do $SUBJECT without a CTE. The
> reasons this is better than a CTE version are:
>
> 1) It's not obvious why a CTE version works but a plain one doesn't
> 2) This one has less overhead (I measured a ~12% improvement on a
> not-too-unreasonable test case)
Would you mind sharing this test case as well as numbers?
> With regard to RULEs, similar restrictions apply as the ones on
> data-modifying statements in WITH.
OK, so with this patch BeginCopy holds a light copy of RewriteQuery@rewriteHandler.c that rewrites queries with rules, and this patch extends the error handling after calling pg_analyze_and_rewrite. This looks like the right approach.
INSERT/UPDATE/DELETE ... RETURNING is a Postgres extension not part of the SQL spec, so my first thought is that there is no reason to reject this feature based on the grammar allowed.
Here are some comments about the patch.
Nice patch to begin with.
Regression tests are broken when copyselect is run in parallel because test3 is a table used there as well. A simple idea to fix this is to rename the table test3 to something else or to use a schema dedicated to this test, I just renamed it to copydml_test in the patch attached.
- | COPY select_with_parens TO opt_program copy_file_name opt_with copy_options
+ | COPY '(' PreparableStmt ')' TO opt_program copy_file_name opt_with copy_options
--
Michael
Michael
Attachment
On 2015-11-16 08:24, Michael Paquier wrote: > On Sun, Nov 1, 2015 at 2:49 AM, Marko Tiikkaja <marko@joh.to> wrote: >> Attached is a patch for being able to do $SUBJECT without a CTE. The >> reasons this is better than a CTE version are: >> >> 1) It's not obvious why a CTE version works but a plain one doesn't >> 2) This one has less overhead (I measured a ~12% improvement on a >> not-too-unreasonable test case) > > Would you mind sharing this test case as well as numbers? I'll try and re-puzzle it together tomorrow. It looks like I wasn't smart enough to actually save the test case anywhere. > Regression tests are broken when copyselect is run in parallel because > test3 is a table used there as well. A simple idea to fix this is to rename > the table test3 to something else or to use a schema dedicated to this > test, I just renamed it to copydml_test in the patch attached. Seems reasonable. > - | COPY select_with_parens TO opt_program > copy_file_name opt_with copy_options > + | COPY '(' PreparableStmt ')' TO opt_program > copy_file_name opt_with copy_options > This does not look right, PreparableStmt is used for statements that are > part of PREPARE queries, any modifications happening there would impact > COPY with this implementation. I think that we had better have a new option > query_with_parens. Please note that the patch attached has not changed that. This was discussed in 2010 when CTEs got the same treatment, and I agree with what was decided back then. If someone needs to make PreparableStmt different from what COPY and CTEs support, we can split them up. But they're completely identical after this patch, so splitting them up right now is somewhat pointless. Further, if someone's going to add new stuff to PreparableStmt, she should probably think about whether it would make sense to add it to COPY and CTEs from day one, too, and in that case not splitting them up is actually a win. .m
On Thu, Nov 19, 2015 at 11:04 AM, Marko Tiikkaja wrote: > This was discussed in 2010 when CTEs got the same treatment, and I agree > with what was decided back then. If someone needs to make PreparableStmt > different from what COPY and CTEs support, we can split them up. But > they're completely identical after this patch, so splitting them up right > now is somewhat pointless. > Further, if someone's going to add new stuff to PreparableStmt, she should > probably think about whether it would make sense to add it to COPY and CTEs > from day one, too, and in that case not splitting them up is actually a win. Personally, I would take it on the safe side and actually update it. If someone were to add a new node type in PreparableStmt I am pretty sure that we are going to forget to update the COPY part, leading us to unwelcome bugs. And that would not be cool. -- Michael
On 11/19/15 1:17 PM, Michael Paquier wrote: > On Thu, Nov 19, 2015 at 11:04 AM, Marko Tiikkaja wrote: >> Further, if someone's going to add new stuff to PreparableStmt, she should >> probably think about whether it would make sense to add it to COPY and CTEs >> from day one, too, and in that case not splitting them up is actually a win. > > Personally, I would take it on the safe side and actually update it. > If someone were to add a new node type in PreparableStmt I am pretty > sure that we are going to forget to update the COPY part, leading us > to unwelcome bugs. And that would not be cool. They'd have to get past this: + if (query->commandType != CMD_SELECT && + query->returningList == NIL) + { + ereport(ERROR, + (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), + errmsg("COPY query must have a RETURNING clause"))); + } Of course, something might break if we added a new statement type which supported RETURNING, but I'm really not worried about that. I'm not dead set against adding some Assert(IsA()) calls here, but I don't see the point. .m
On Thu, Nov 19, 2015 at 9:22 PM, Marko Tiikkaja <marko@joh.to> wrote: > Of course, something might break if we added a new statement type which > supported RETURNING, but I'm really not worried about that. I'm not dead > set against adding some Assert(IsA()) calls here, but I don't see the point. gram.y has a long comment before select_no_parens regarding why we shouldn't do it this way, did you notice it? I haven't crafted yet a backward incompatible query with your patch yet, but something smells fishy here. -- Michael
On 11/19/15 7:39 PM, Michael Paquier wrote: > On Thu, Nov 19, 2015 at 9:22 PM, Marko Tiikkaja <marko@joh.to> wrote: >> Of course, something might break if we added a new statement type which >> supported RETURNING, but I'm really not worried about that. I'm not dead >> set against adding some Assert(IsA()) calls here, but I don't see the point. > > gram.y has a long comment before select_no_parens regarding why we > shouldn't do it this way, did you notice it? It talks about not doing '(' SelectStmt ')' "in the expression grammar". I don't see what that has to do with this patch. .m
On 2015-11-16 08:24, Michael Paquier wrote: > On Sun, Nov 1, 2015 at 2:49 AM, Marko Tiikkaja <marko@joh.to> wrote: >> Attached is a patch for being able to do $SUBJECT without a CTE. The >> reasons this is better than a CTE version are: >> >> 1) It's not obvious why a CTE version works but a plain one doesn't >> 2) This one has less overhead (I measured a ~12% improvement on a >> not-too-unreasonable test case) > > Would you mind sharing this test case as well as numbers? Attached the test case I used; removing a batch of old rows from a table through an index. Best out of three runs, I get 13.1 seconds versus 15.0 seconds, which should amount to an improvement of around 12.7%. Also attached a v3 of the patch which adds an Assert on top of the test case changes suggested by you. .m
Attachment
On Sun, Nov 22, 2015 at 9:54 AM, Marko Tiikkaja wrote: > Attached the test case I used; removing a batch of old rows from a table > through an index. Best out of three runs, I get 13.1 seconds versus 15.0 > seconds, which should amount to an improvement of around 12.7%. Thanks for the test case. I haven't seen that much, up to 5% on my laptop still that's not bad thinking that it saves one execution step. > Also attached a v3 of the patch which adds an Assert on top of the test case > changes suggested by you. Sorry for the late reply. I have been playing a bit more with this patch and I am withdrawing my comments regarding the use of SelectStmt here. I have looked at the patch again, noticing typos as well as comments not updated, particularly for psql's \copy, resulting in the attached. Patch is switched to "ready for committer". -- Michael
Attachment
> Patch is switched to "ready for committer". Committed, thank you -- Teodor Sigaev E-mail: teodor@sigaev.ru WWW: http://www.sigaev.ru/
On Sat, Nov 28, 2015 at 1:15 AM, Teodor Sigaev <teodor@sigaev.ru> wrote: >> Patch is switched to "ready for committer". > > Committed, thank you Thanks. -- Michael