Thread: pgsql: SQL-standard function body
SQL-standard function body This adds support for writing CREATE FUNCTION and CREATE PROCEDURE statements for language SQL with a function body that conforms to the SQL standard and is portable to other implementations. Instead of the PostgreSQL-specific AS $$ string literal $$ syntax, this allows writing out the SQL statements making up the body unquoted, either as a single statement: CREATE FUNCTION add(a integer, b integer) RETURNS integer LANGUAGE SQL RETURN a + b; or as a block CREATE PROCEDURE insert_data(a integer, b integer) LANGUAGE SQL BEGIN ATOMIC INSERT INTO tbl VALUES (a); INSERT INTO tbl VALUES (b); END; The function body is parsed at function definition time and stored as expression nodes in a new pg_proc column prosqlbody. So at run time, no further parsing is required. However, this form does not support polymorphic arguments, because there is no more parse analysis done at call time. Dependencies between the function and the objects it uses are fully tracked. A new RETURN statement is introduced. This can only be used inside function bodies. Internally, it is treated much like a SELECT statement. psql needs some new intelligence to keep track of function body boundaries so that it doesn't send off statements when it sees semicolons that are inside a function body. Tested-by: Jaime Casanova <jcasanov@systemguards.com.ec> Reviewed-by: Julien Rouhaud <rjuju123@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/1c11f1eb-f00c-43b7-799d-2d44132c02d7@2ndquadrant.com Branch ------ master Details ------- https://git.postgresql.org/pg/commitdiff/e717a9a18b2e34c9c40e5259ad4d31cd7e420750 Modified Files -------------- doc/src/sgml/catalogs.sgml | 10 + doc/src/sgml/ref/create_function.sgml | 125 ++++++++++-- doc/src/sgml/ref/create_procedure.sgml | 61 +++++- src/backend/catalog/pg_aggregate.c | 1 + src/backend/catalog/pg_proc.c | 116 +++++++---- src/backend/commands/aggregatecmds.c | 2 + src/backend/commands/functioncmds.c | 145 ++++++++++--- src/backend/commands/typecmds.c | 4 + src/backend/executor/functions.c | 79 +++++--- src/backend/nodes/copyfuncs.c | 15 ++ src/backend/nodes/equalfuncs.c | 13 ++ src/backend/nodes/outfuncs.c | 12 ++ src/backend/nodes/readfuncs.c | 1 + src/backend/optimizer/util/clauses.c | 126 ++++++++---- src/backend/parser/analyze.c | 35 ++++ src/backend/parser/gram.y | 129 +++++++++--- src/backend/tcop/postgres.c | 3 +- src/backend/utils/adt/ruleutils.c | 139 ++++++++++++- src/bin/pg_dump/pg_dump.c | 45 ++++- src/bin/psql/describe.c | 15 +- src/fe_utils/psqlscan.l | 24 ++- src/include/catalog/catversion.h | 2 +- src/include/catalog/pg_proc.dat | 4 + src/include/catalog/pg_proc.h | 6 +- src/include/commands/defrem.h | 2 + src/include/executor/functions.h | 15 ++ src/include/fe_utils/psqlscan_int.h | 2 + src/include/nodes/nodes.h | 1 + src/include/nodes/parsenodes.h | 13 ++ src/include/parser/kwlist.h | 2 + src/include/tcop/tcopprot.h | 1 + src/interfaces/ecpg/preproc/ecpg.addons | 6 + src/interfaces/ecpg/preproc/ecpg.trailer | 4 +- src/test/regress/expected/create_function_3.out | 257 +++++++++++++++++++++++- src/test/regress/expected/create_procedure.out | 65 ++++++ src/test/regress/sql/create_function_3.sql | 110 +++++++++- src/test/regress/sql/create_procedure.sql | 33 +++ 37 files changed, 1411 insertions(+), 212 deletions(-)
Hi, On 2021-04-07 19:53:35 +0000, Peter Eisentraut wrote: > SQL-standard function body > > This adds support for writing CREATE FUNCTION and CREATE PROCEDURE > statements for language SQL with a function body that conforms to the > SQL standard and is portable to other implementations. > > Instead of the PostgreSQL-specific AS $$ string literal $$ syntax, > this allows writing out the SQL statements making up the body > unquoted, either as a single statement: > > CREATE FUNCTION add(a integer, b integer) RETURNS integer > LANGUAGE SQL > RETURN a + b; > > or as a block > > CREATE PROCEDURE insert_data(a integer, b integer) > LANGUAGE SQL > BEGIN ATOMIC > INSERT INTO tbl VALUES (a); > INSERT INTO tbl VALUES (b); > END; > > The function body is parsed at function definition time and stored as > expression nodes in a new pg_proc column prosqlbody. So at run time, > no further parsing is required. > > However, this form does not support polymorphic arguments, because > there is no more parse analysis done at call time. > > Dependencies between the function and the objects it uses are fully > tracked. > > A new RETURN statement is introduced. This can only be used inside > function bodies. Internally, it is treated much like a SELECT > statement. > > psql needs some new intelligence to keep track of function body > boundaries so that it doesn't send off statements when it sees > semicolons that are inside a function body. > > Tested-by: Jaime Casanova <jcasanov@systemguards.com.ec> > Reviewed-by: Julien Rouhaud <rjuju123@gmail.com> > Discussion: https://www.postgresql.org/message-id/flat/1c11f1eb-f00c-43b7-799d-2d44132c02d7@2ndquadrant.com > > Branch > ------ > master > > Details > ------- > https://git.postgresql.org/pg/commitdiff/e717a9a18b2e34c9c40e5259ad4d31cd7e420750 This is turning the BF red: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=rhinoceros&dt=2021-04-07%2022%3A52%3A19 Might be force_parallel_mode=regress related. Greetings, Andres Freund
Andres Freund <andres@anarazel.de> writes: > On 2021-04-07 19:53:35 +0000, Peter Eisentraut wrote: >> SQL-standard function body > This is turning the BF red: > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=rhinoceros&dt=2021-04-07%2022%3A52%3A19 > Might be force_parallel_mode=regress related. Yeah. On my machine, it's fine without force_parallel_mode and crashes with that. Looks like query text is not getting passed to the parallel worker in some cases. regards, tom lane
On 08.04.21 01:06, Tom Lane wrote: > Andres Freund <andres@anarazel.de> writes: >> On 2021-04-07 19:53:35 +0000, Peter Eisentraut wrote: >>> SQL-standard function body > >> This is turning the BF red: >> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=rhinoceros&dt=2021-04-07%2022%3A52%3A19 >> Might be force_parallel_mode=regress related. > > Yeah. On my machine, it's fine without force_parallel_mode and > crashes with that. Looks like query text is not getting passed > to the parallel worker in some cases. There does not appear to be any indication in CreateQueryDesc() or for es_sourceText that the query source text needs to be not NULL (unlike PortalDefineQuery() for example). If we want that to be the case, an assert in CreateQueryDesc() would be a straightforward way to catch this. But for example code in CreateExecutorState() and executor_errposition() is prepared to have es_sourceText be NULL, so relative to that, the fix that Andres put into ExecInitParallelPlan() to handle that as well seems appropriate.
On Thu, Apr 8, 2021 at 2:25 AM Peter Eisentraut <peter@eisentraut.org> wrote: > On 08.04.21 01:06, Tom Lane wrote: > > Andres Freund <andres@anarazel.de> writes: > >> On 2021-04-07 19:53:35 +0000, Peter Eisentraut wrote: > >>> SQL-standard function body > > > >> This is turning the BF red: > >> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=rhinoceros&dt=2021-04-07%2022%3A52%3A19 > >> Might be force_parallel_mode=regress related. > > > > Yeah. On my machine, it's fine without force_parallel_mode and > > crashes with that. Looks like query text is not getting passed > > to the parallel worker in some cases. > > There does not appear to be any indication in CreateQueryDesc() or for > es_sourceText that the query source text needs to be not NULL (unlike > PortalDefineQuery() for example). If we want that to be the case, an > assert in CreateQueryDesc() would be a straightforward way to catch > this. But for example code in CreateExecutorState() and > executor_errposition() is prepared to have es_sourceText be NULL, so > relative to that, the fix that Andres put into ExecInitParallelPlan() to > handle that as well seems appropriate. I think we should insist on having a query string all the time, if at all possible, because having the query string is awfully nice for debugging. For example, if a parallel worker dumps core, you'd like to be able to find out what query it was trying to run when things went bad. But if we're not going to do that, then I think it would be better to do something more like what Noah did in f90e80b9138355a51d2d5b5b63e1f89c4ba53325 rather than what Andres did here. -- Robert Haas EDB: http://www.enterprisedb.com
On 4/7/21 9:50 PM, Andres Freund wrote: > Hi, > > On 2021-04-08 13:33:18 +1200, David Rowley wrote: >> On Thu, 8 Apr 2021 at 11:06, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> Andres Freund <andres@anarazel.de> writes: >>> >>>> Might be force_parallel_mode=regress related. >>> Yeah. On my machine, it's fine without force_parallel_mode and >>> crashes with that. Looks like query text is not getting passed >>> to the parallel worker in some cases. >> I wonder if it would be worth adding a regression test run of the core >> tests during make check-world with force_parallel_mode set to regress. >> >> But maybe people running the tests on slow machines might not be a fan >> of that idea. > I've wondered about that too. Perhaps we could reuse the pg_upgrade run? > Honestly I'd prefer it if we could get rid of the rerun of 'make check' by pg_upgrade's test.sh and instead upgrade the data directory made by the earlier 'make check' run if it's still there (which would mean we'd need to stop it being deleted). cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
Andrew Dunstan <andrew@dunslane.net> writes: > On 4/7/21 9:50 PM, Andres Freund wrote: >> I've wondered about that too. Perhaps we could reuse the pg_upgrade run? > Honestly I'd prefer it if we could get rid of the rerun of 'make check' > by pg_upgrade's test.sh and instead upgrade the data directory made by > the earlier 'make check' run if it's still there (which would mean we'd > need to stop it being deleted). Good idea as far as speeding check-world and buildfarm runs, but I wonder if we wouldn't be losing test coverage. Seeing the number of times that buildfarm runs have gotten through "make check" only to fail at the re-run in pg_upgrade, it seems clear to me that there is something different about the execution environment in the latter case. I've never been able to pin down quite what :-( regards, tom lane
Hi, On Thu, Apr 8, 2021, at 07:52, Tom Lane wrote: > Andrew Dunstan <andrew@dunslane.net> writes: > > On 4/7/21 9:50 PM, Andres Freund wrote: > >> I've wondered about that too. Perhaps we could reuse the pg_upgrade run? > > > Honestly I'd prefer it if we could get rid of the rerun of 'make check' > > by pg_upgrade's test.sh and instead upgrade the data directory made by > > the earlier 'make check' run if it's still there (which would mean we'd > > need to stop it being deleted). > > Good idea as far as speeding check-world and buildfarm runs, but I wonder > if we wouldn't be losing test coverage. Seeing the number of times that > buildfarm runs have gotten through "make check" only to fail at the > re-run in pg_upgrade, it seems clear to me that there is something > different about the execution environment in the latter case. I've > never been able to pin down quite what :-( IIRC we made it run with fsync explicitly enabled. Which obviously will change timing somewhat substantially... Andres
On Thu, Apr 08, 2021 at 07:56:10AM -0700, Andres Freund wrote: > IIRC we made it run with fsync explicitly enabled. Which obviously > will change timing somewhat substantially... Yeah. I think that this kind of things pretty much boils down into making the tests of pg_upgrade into TAP. We have already a set of rather-sane defaults in PostgresNode. This does not prevent just sticking one or two extra configuration lines into test.sh, of course. -- Michael
Attachment
On 10.04.21 03:38, Laurenz Albe wrote: > On Fri, 2021-04-09 at 19:44 +0200, I wrote: >>> SQL-standard function body >>> >>> psql needs some new intelligence to keep track of function body >>> boundaries so that it doesn't send off statements when it sees >>> semicolons that are inside a function body. >> >> This causes psql to fail to recognize the semicolon in the following >> statement as the end of a statement: >> >> IMPORT FOREIGN SCHEMA x FROM SERVER y INTO z OPTIONS (case 'lower'); >> >> The cause is the "case", which is recognized as the start of a function >> body. >> >> Would it be an option to recognize BEGIN and CASE as starting a >> function body only if they are *not* inside parentheses, like in >> the attached? > > Here is an improved patch, which treats END in the same fashion > (not properly indented for readability). Thanks, I took another look at this and augmented your change with a change that tracks whether the statement starts with CREATE [OR REPLACE] {FUNCTION|PROCEDURE}. That should make it pretty safe. What do you think?
Attachment
On 12.04.21 21:00, Laurenz Albe wrote: >>> Here is an improved patch, which treats END in the same fashion >>> (not properly indented for readability). >> >> Thanks, I took another look at this and augmented your change with a >> change that tracks whether the statement starts with CREATE [OR REPLACE] >> {FUNCTION|PROCEDURE}. That should make it pretty safe. What do you think? > > Thanks, that is fine with me and more than I asked for. > That should definitely exclude all false positive matches. committed
On Fri, 2021-04-16 at 12:26 +0200, Peter Eisentraut wrote: > committed Thank you! Yours, Laurenz Albe