Thread: SQL-standard function body
This adds support for writing CREATE FUNCTION and CREATE PROCEDURE statements for language SQL with a function body that conforms to the SQL standard and is portable to other implementations. Instead of the PostgreSQL-specific AS $$ string literal $$ syntax, this allows writing out the SQL statements making up the body unquoted, either as a single statement: CREATE FUNCTION add(a integer, b integer) RETURNS integer LANGUAGE SQL RETURN a + b; or as a block CREATE PROCEDURE insert_data(a integer, b integer) LANGUAGE SQL BEGIN ATOMIC INSERT INTO tbl VALUES (a); INSERT INTO tbl VALUES (b); END; The function body is parsed at function definition time and stored as expression nodes in probin. So at run time, no further parsing is required. However, this form does not support polymorphic arguments, because there is no more parse analysis done at call time. Dependencies between the function and the objects it uses are fully tracked. A new RETURN statement is introduced. This can only be used inside function bodies. Internally, it is treated much like a SELECT statement. psql needs some new intelligence to keep track of function body boundaries so that it doesn't send off statements when it sees semicolons that are inside a function body. Also, per SQL standard, LANGUAGE SQL is the default, so it does not need to be specified anymore. Note: Some parts of the patch look better under git diff -w (ignoring whitespace changes) because if/else blocks were introduced around existing code. TODOs and discussion points: - pg_dump is not yet supported. As a consequence, the pg_upgrade tests don't pass yet. I'm thinking about changing pg_dump to use pg_get_functiondef here instead of coding everything by hand. Some initial experimenting showed that this would be possible with minimal tweaking and it would surely be beneficial in the long run. - The compiled function body is stored in the probin field of pg_proc. This matches the historical split similar to adsrc/adbin, consrc/conbin, but this has now been abandoned. Also, this field should ideally be of type pg_node_tree, so reusing probin for that is probably not good. Seems like a new field might be best. - More test coverage is needed. Surprisingly, there wasn't actually any test AFAICT that just creates and SQL function and runs it. Most of that code is tested incidentally, but there is very little or no targeted testing of this functionality. - Some of the changes in pg_proc.c, functioncmds.c, and functions.c in particular were jammed in and could use some reorganization after the basic ideas are solidified. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
On Tue, Jun 30, 2020 at 1:49 PM Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote: > This adds support for writing CREATE FUNCTION and CREATE PROCEDURE > statements for language SQL with a function body that conforms to the > SQL standard and is portable to other implementations. With what other implementations is it compatible? > The function body is parsed at function definition time and stored as > expression nodes in probin. So at run time, no further parsing is > required. > > However, this form does not support polymorphic arguments, because > there is no more parse analysis done at call time. > > Dependencies between the function and the objects it uses are fully > tracked. > > A new RETURN statement is introduced. This can only be used inside > function bodies. Internally, it is treated much like a SELECT > statement. > > psql needs some new intelligence to keep track of function body > boundaries so that it doesn't send off statements when it sees > semicolons that are inside a function body. > > Also, per SQL standard, LANGUAGE SQL is the default, so it does not > need to be specified anymore. Hmm, this all seems like a pretty big semantic change. IIUC, right now, a SQL function can only contain one statement, but it seems like with this patch you can have a block in there with a bunch of statements, sorta like plpgsql. But probably you don't have all of the functionality of plpgsql available. Also, the fact that you're doing parsing earlier means that e.g. creating a table and inserting into it won't work. Maybe that's fine. But it almost seems like you are inventing a whole new PL.... -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Greetings, * Robert Haas (robertmhaas@gmail.com) wrote: > Hmm, this all seems like a pretty big semantic change. IIUC, right > now, a SQL function can only contain one statement, but it seems like > with this patch you can have a block in there with a bunch of > statements, sorta like plpgsql. From our docs: CREATE FUNCTION tf1 (accountno integer, debit numeric) RETURNS numeric AS $$ UPDATE bank SET balance = balance - debit WHERE accountno = tf1.accountno; SELECT 1; $$ LANGUAGE SQL; https://www.postgresql.org/docs/current/xfunc-sql.html Haven't looked at the patch, tho if it adds support for something the SQL standard defines, that generally seems like a positive to me. Thanks, Stephen
Attachment
út 30. 6. 2020 v 19:58 odesílatel Robert Haas <robertmhaas@gmail.com> napsal:
On Tue, Jun 30, 2020 at 1:49 PM Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:
> This adds support for writing CREATE FUNCTION and CREATE PROCEDURE
> statements for language SQL with a function body that conforms to the
> SQL standard and is portable to other implementations.
With what other implementations is it compatible?
> The function body is parsed at function definition time and stored as
> expression nodes in probin. So at run time, no further parsing is
> required.
>
> However, this form does not support polymorphic arguments, because
> there is no more parse analysis done at call time.
>
> Dependencies between the function and the objects it uses are fully
> tracked.
>
> A new RETURN statement is introduced. This can only be used inside
> function bodies. Internally, it is treated much like a SELECT
> statement.
>
> psql needs some new intelligence to keep track of function body
> boundaries so that it doesn't send off statements when it sees
> semicolons that are inside a function body.
>
> Also, per SQL standard, LANGUAGE SQL is the default, so it does not
> need to be specified anymore.
Hmm, this all seems like a pretty big semantic change. IIUC, right
now, a SQL function can only contain one statement, but it seems like
with this patch you can have a block in there with a bunch of
statements, sorta like plpgsql. But probably you don't have all of the
functionality of plpgsql available. Also, the fact that you're doing
parsing earlier means that e.g. creating a table and inserting into it
won't work. Maybe that's fine. But it almost seems like you are
inventing a whole new PL....
It is SQL/PSM and can be nice to have it.
I am a little bit afraid about performance - SQL functions doesn't use plan cache and simple expressions. Without inlining it can be too slow.
Regards
Pavel
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > On Tue, Jun 30, 2020 at 1:49 PM Peter Eisentraut > <peter.eisentraut@2ndquadrant.com> wrote: >> This adds support for writing CREATE FUNCTION and CREATE PROCEDURE >> statements for language SQL with a function body that conforms to the >> SQL standard and is portable to other implementations. > With what other implementations is it compatible? Yeah ... I'm sort of wondering exactly what this really accomplishes. I think "portability" is a red herring unfortunately. Tracking the dependencies of the function body sounds nice at first glance, so it might be a feature. But given our experiences with having to use check_function_bodies = off to not have impossible dependency loops in dump/restore, I rather wonder whether it'll be a net loss in practice. IIUC, this implementation is flat out incapable of doing the equivalent of check_function_bodies = off, and that sounds like trouble. > Hmm, this all seems like a pretty big semantic change. IIUC, right > now, a SQL function can only contain one statement, Not true, you can have more. However, it's nonetheless an enormous semantic change, if only because the CREATE FUNCTION-time search_path is now relevant instead of the execution-time path. That *will* break use-cases I've heard of, where the same function is applied to different tables by adjusting the path. It'd certainly be useful from some perspectives (eg better security), but it's ... different. Replicating the creation-time search path will be a big headache for pg_dump, I bet. > But it almost seems like you are > inventing a whole new PL.... Yes. Having this replace the existing SQL PL would be a disaster, because there are use-cases this simply can't meet (even assuming that we can fix the polymorphism problem, which seems a bit unlikely). We'd need to treat it as a new PL type. Perhaps this is useful enough to justify all the work involved, but I'm not sure. regards, tom lane
I wrote: > Replicating the creation-time search path will be a big headache for > pg_dump, I bet. On further thought, we probably don't have to. Re-parsing the function body the same way is exactly the same problem as re-parsing a view or matview body the same way. I don't want to claim that that's a 100% solved problem, but I've heard few complaints in that area lately. The point remains that exposing the function body's dependencies will constrain restore order far more than we are accustomed to see. It might be possible to build examples that flat out can't be restored, even granting that we teach pg_dump how to break dependency loops by first creating the function with empty body and later redefining it with the real body. (Admittedly, if that's possible then you likely could make it happen with views too. But somehow it seems more likely that people would create spaghetti dependencies for functions than views.) regards, tom lane
Hi, On 2020-06-30 19:49:04 +0200, Peter Eisentraut wrote: > The function body is parsed at function definition time and stored as > expression nodes in probin. So at run time, no further parsing is > required. As raw parse tree or as a parse-analysed tree? I assume the latter? Isn't a consequence of that that we'd get a lot more errors if any DDL is done to tables involved in the query? In contrast to other languages we'd not be able to handle column type changes etc, right? Greetings, Andres Freund
Andres Freund <andres@anarazel.de> writes: > On 2020-06-30 19:49:04 +0200, Peter Eisentraut wrote: >> The function body is parsed at function definition time and stored as >> expression nodes in probin. So at run time, no further parsing is >> required. > Isn't a consequence of that that we'd get a lot more errors if any DDL > is done to tables involved in the query? In contrast to other languages > we'd not be able to handle column type changes etc, right? I suppose it'd act like column references in a view, ie the dependency mechanisms would forbid you from changing/dropping any column mentioned in one of these functions. regards, tom lane
On Tue, Jun 30, 2020 at 2:51 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > On further thought, we probably don't have to. Re-parsing the function > body the same way is exactly the same problem as re-parsing a view or > matview body the same way. I don't want to claim that that's a 100% > solved problem, but I've heard few complaints in that area lately. > > The point remains that exposing the function body's dependencies will > constrain restore order far more than we are accustomed to see. It > might be possible to build examples that flat out can't be restored, > even granting that we teach pg_dump how to break dependency loops > by first creating the function with empty body and later redefining > it with the real body. (Admittedly, if that's possible then you > likely could make it happen with views too. But somehow it seems > more likely that people would create spaghetti dependencies for > functions than views.) In my experience, there's certainly demand for some kind of mode where plpgsql functions get checked at function definition time, rather than at execution time. The model we have is advantageous not only because it simplifies dump and reload, but also because it handles cases where the table is created on the fly properly. However, it also means that you can have silly mistakes in your function definitions that you don't find out about until runtime, and in my experience, people don't like that behavior much at all. So I don't think that it's a bad idea on principle, or anything like that, but the details seem like they need a lot of thought. The dump and restore issues need to be considered, but also, what about things like IF and WHILE? People are going to want those constructs with these new semantics, too. I actually don't have a very clear idea of what the standard has to say about SQL-language functions. Does it just say it's a list of statements, or does it involve variables and control-flow constructs and stuff like that, too? If we go that direction with this, then we're actually going to end up with two different implementations of what's now plpgsql, or something. But if we don't, then I'm not sure how far this takes us. I'm not saying it's bad, but the comment "I love the early binding but where's my IF statement" seems like an inevitable one. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
st 1. 7. 2020 v 15:37 odesílatel Robert Haas <robertmhaas@gmail.com> napsal:
On Tue, Jun 30, 2020 at 2:51 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> On further thought, we probably don't have to. Re-parsing the function
> body the same way is exactly the same problem as re-parsing a view or
> matview body the same way. I don't want to claim that that's a 100%
> solved problem, but I've heard few complaints in that area lately.
>
> The point remains that exposing the function body's dependencies will
> constrain restore order far more than we are accustomed to see. It
> might be possible to build examples that flat out can't be restored,
> even granting that we teach pg_dump how to break dependency loops
> by first creating the function with empty body and later redefining
> it with the real body. (Admittedly, if that's possible then you
> likely could make it happen with views too. But somehow it seems
> more likely that people would create spaghetti dependencies for
> functions than views.)
In my experience, there's certainly demand for some kind of mode where
plpgsql functions get checked at function definition time, rather than
at execution time. The model we have is advantageous not only because
it simplifies dump and reload, but also because it handles cases where
the table is created on the fly properly. However, it also means that
you can have silly mistakes in your function definitions that you
don't find out about until runtime, and in my experience, people don't
like that behavior much at all. So I don't think that it's a bad idea
on principle, or anything like that, but the details seem like they
need a lot of thought. The dump and restore issues need to be
considered, but also, what about things like IF and WHILE? People are
going to want those constructs with these new semantics, too.
plpgsql_check can be integrated to upstream.
I actually don't have a very clear idea of what the standard has to
say about SQL-language functions. Does it just say it's a list of
statements, or does it involve variables and control-flow constructs
and stuff like that, too? If we go that direction with this, then
we're actually going to end up with two different implementations of
what's now plpgsql, or something. But if we don't, then I'm not sure
how far this takes us. I'm not saying it's bad, but the comment "I
love the early binding but where's my IF statement" seems like an
inevitable one.
The standard SQL/PSM is a full functionality language with variables, conditional statements, exception handlings, ..
Unfortunately a basic implementation integrated into the main SQL parser can be pretty hard work. First issue can be SET statement implementation.
Regards
Pavel
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > In my experience, there's certainly demand for some kind of mode where > plpgsql functions get checked at function definition time, rather than > at execution time. Yeah, absolutely agreed. But I'm afraid this proposal takes us too far in the other direction: with this, you *must* have a 100% parseable and semantically valid function body, every time all the time. So far as plpgsql is concerned, I could see extending the validator to run parse analysis (not just raw parsing) on all SQL statements in the body. This wouldn't happen of course with check_function_bodies off, so it wouldn't affect dump/reload. But likely there would still be demand for more fine-grained control over it ... or maybe it could stop doing analysis as soon as it finds a DDL command? regards, tom lane
st 1. 7. 2020 v 16:14 odesílatel Tom Lane <tgl@sss.pgh.pa.us> napsal:
Robert Haas <robertmhaas@gmail.com> writes:
> In my experience, there's certainly demand for some kind of mode where
> plpgsql functions get checked at function definition time, rather than
> at execution time.
Yeah, absolutely agreed. But I'm afraid this proposal takes us too
far in the other direction: with this, you *must* have a 100% parseable
and semantically valid function body, every time all the time.
So far as plpgsql is concerned, I could see extending the validator
to run parse analysis (not just raw parsing) on all SQL statements in
the body. This wouldn't happen of course with check_function_bodies off,
so it wouldn't affect dump/reload. But likely there would still be
demand for more fine-grained control over it ... or maybe it could
stop doing analysis as soon as it finds a DDL command?
This simple analysis stops on first record type usage. PLpgSQL allows some dynamic work that increases the complexity of static analysis.
Regards
Pavel
regards, tom lane
On Wed, Jul 1, 2020 at 10:14:10AM -0400, Tom Lane wrote: > Robert Haas <robertmhaas@gmail.com> writes: > > In my experience, there's certainly demand for some kind of mode where > > plpgsql functions get checked at function definition time, rather than > > at execution time. > > Yeah, absolutely agreed. But I'm afraid this proposal takes us too > far in the other direction: with this, you *must* have a 100% parseable > and semantically valid function body, every time all the time. > > So far as plpgsql is concerned, I could see extending the validator > to run parse analysis (not just raw parsing) on all SQL statements in > the body. This wouldn't happen of course with check_function_bodies off, > so it wouldn't affect dump/reload. But likely there would still be > demand for more fine-grained control over it ... or maybe it could > stop doing analysis as soon as it finds a DDL command? Is the SQL-standard function body verified as preventing function inlining? That seems to be a major downside. -- Bruce Momjian <bruce@momjian.us> https://momjian.us EnterpriseDB https://enterprisedb.com The usefulness of a cup is in its emptiness, Bruce Lee
Bruce Momjian <bruce@momjian.us> writes: > Is the SQL-standard function body verified as preventing function > inlining? That seems to be a major downside. I see no reason why that would make any difference. There might be more code to be written than is in the patch, but in principle inlining should not care whether the function is pre-parsed or not. regards, tom lane
On 7/1/20 3:36 PM, Robert Haas wrote: > I actually don't have a very clear idea of what the standard has to > say about SQL-language functions. Does it just say it's a list of > statements, or does it involve variables and control-flow constructs > and stuff like that, too? It's either a single sql statement, or a collection of them between "begin atomic" and "end". There are no variables or flow control constructs or anything like that, just as there are no such things outside of a function. (There are a few statements that are not allowed, such as COMMIT.) -- Vik Fearing
st 1. 7. 2020 v 20:19 odesílatel Vik Fearing <vik@postgresfriends.org> napsal:
On 7/1/20 3:36 PM, Robert Haas wrote:
> I actually don't have a very clear idea of what the standard has to
> say about SQL-language functions. Does it just say it's a list of
> statements, or does it involve variables and control-flow constructs
> and stuff like that, too?
It's either a single sql statement, or a collection of them between
"begin atomic" and "end". There are no variables or flow control
constructs or anything like that, just as there are no such things
outside of a function.
What is the source of this comment? Maybe we are speaking (and thinking) about different languages.
I thought the language of SQL functions (ANSI/SQL) is SQL/PSM.
Regards
Pavel
(There are a few statements that are not allowed, such as COMMIT.)
--
Vik Fearing
On 7/1/20 9:32 PM, Pavel Stehule wrote: > st 1. 7. 2020 v 20:19 odesílatel Vik Fearing <vik@postgresfriends.org> > napsal: > >> On 7/1/20 3:36 PM, Robert Haas wrote: >>> I actually don't have a very clear idea of what the standard has to >>> say about SQL-language functions. Does it just say it's a list of >>> statements, or does it involve variables and control-flow constructs >>> and stuff like that, too? >> >> >> It's either a single sql statement, or a collection of them between >> "begin atomic" and "end". There are no variables or flow control >> constructs or anything like that, just as there are no such things >> outside of a function. >> > > What is the source of this comment? The SQL Standard. > Maybe we are speaking (and thinking) > about different languages. I think so, yes. > I thought the language of SQL functions (ANSI/SQL) is SQL/PSM. That is something else entirely, and not at all what Peter's patch is about. -- Vik Fearing
st 1. 7. 2020 v 22:31 odesílatel Vik Fearing <vik@postgresfriends.org> napsal:
On 7/1/20 9:32 PM, Pavel Stehule wrote:
> st 1. 7. 2020 v 20:19 odesílatel Vik Fearing <vik@postgresfriends.org>
> napsal:
>
>> On 7/1/20 3:36 PM, Robert Haas wrote:
>>> I actually don't have a very clear idea of what the standard has to
>>> say about SQL-language functions. Does it just say it's a list of
>>> statements, or does it involve variables and control-flow constructs
>>> and stuff like that, too?
>>
>>
>> It's either a single sql statement, or a collection of them between
>> "begin atomic" and "end". There are no variables or flow control
>> constructs or anything like that, just as there are no such things
>> outside of a function.
>>
>
> What is the source of this comment?
The SQL Standard.
The SQL Standard is really big, and is very possible so I miss this part. Can you send me a link?
Regards
Pavel
> Maybe we are speaking (and thinking)
> about different languages.
I think so, yes.
> I thought the language of SQL functions (ANSI/SQL) is SQL/PSM.
That is something else entirely, and not at all what Peter's patch is about.
--
Vik Fearing
On 7/1/20 10:34 PM, Pavel Stehule wrote: > st 1. 7. 2020 v 22:31 odesílatel Vik Fearing <vik@postgresfriends.org> > napsal: > >> On 7/1/20 9:32 PM, Pavel Stehule wrote: >>> st 1. 7. 2020 v 20:19 odesílatel Vik Fearing <vik@postgresfriends.org> >>> napsal: >>> >>>> On 7/1/20 3:36 PM, Robert Haas wrote: >>>>> I actually don't have a very clear idea of what the standard has to >>>>> say about SQL-language functions. Does it just say it's a list of >>>>> statements, or does it involve variables and control-flow constructs >>>>> and stuff like that, too? >>>> >>>> >>>> It's either a single sql statement, or a collection of them between >>>> "begin atomic" and "end". There are no variables or flow control >>>> constructs or anything like that, just as there are no such things >>>> outside of a function. >>>> >>> >>> What is the source of this comment? >> >> >> The SQL Standard. >> > > The SQL Standard is really big, and is very possible so I miss this part. > Can you send me a link? ISO/IEC 9075-2:2016 Section 11.60 <SQL-invoked routine> -- Vik Fearing
st 1. 7. 2020 v 22:54 odesílatel Vik Fearing <vik@postgresfriends.org> napsal:
On 7/1/20 10:34 PM, Pavel Stehule wrote:
> st 1. 7. 2020 v 22:31 odesílatel Vik Fearing <vik@postgresfriends.org>
> napsal:
>
>> On 7/1/20 9:32 PM, Pavel Stehule wrote:
>>> st 1. 7. 2020 v 20:19 odesílatel Vik Fearing <vik@postgresfriends.org>
>>> napsal:
>>>
>>>> On 7/1/20 3:36 PM, Robert Haas wrote:
>>>>> I actually don't have a very clear idea of what the standard has to
>>>>> say about SQL-language functions. Does it just say it's a list of
>>>>> statements, or does it involve variables and control-flow constructs
>>>>> and stuff like that, too?
>>>>
>>>>
>>>> It's either a single sql statement, or a collection of them between
>>>> "begin atomic" and "end". There are no variables or flow control
>>>> constructs or anything like that, just as there are no such things
>>>> outside of a function.
>>>>
>>>
>>> What is the source of this comment?
>>
>>
>> The SQL Standard.
>>
>
> The SQL Standard is really big, and is very possible so I miss this part.
> Can you send me a link?
ISO/IEC 9075-2:2016 Section 11.60 <SQL-invoked routine>
Thank you
Pavel
--
Vik Fearing
On Wed, Jul 1, 2020 at 5:58 AM Robert Haas <robertmhaas@gmail.com> wrote: > On Tue, Jun 30, 2020 at 1:49 PM Peter Eisentraut > <peter.eisentraut@2ndquadrant.com> wrote: > > This adds support for writing CREATE FUNCTION and CREATE PROCEDURE > > statements for language SQL with a function body that conforms to the > > SQL standard and is portable to other implementations. > > With what other implementations is it compatible? Judging by the Wikipedia article[1], it sounds like at least DB2 and MySQL/MariaDB are purposely striving for conformance. When I worked with DB2 a few years back I preferred to use their standard-conforming PL stuff (as opposed to their be-more-like-Oracle PL/SQL mode), and I always hoped that PostgreSQL would eventually understand the same syntax; admittedly, anyone who has ever worked on large applications that support multiple RDBMSs knows that there's more than just surface syntax to worry about, but it still seems like a pretty solid plan to conform to the standard that's in our name, so +1 from me on the general direction (I didn't look at the patch). [1] https://en.wikipedia.org/wiki/SQL/PSM
Thomas Munro <thomas.munro@gmail.com> writes: > On Wed, Jul 1, 2020 at 5:58 AM Robert Haas <robertmhaas@gmail.com> wrote: >> With what other implementations is it compatible? > Judging by the Wikipedia article[1], it sounds like at least DB2 and > MySQL/MariaDB are purposely striving for conformance. > [1] https://en.wikipedia.org/wiki/SQL/PSM but ... but ... but ... that's about SQL/PSM, which is not this. Having said that, I wonder whether this work could be repurposed to be the start of a real SQL/PSM implementation. There's other stuff in SQL/PSM, but a big part of it is routines that are written with syntax like this. regards, tom lane
st 1. 7. 2020 v 22:54 odesílatel Vik Fearing <vik@postgresfriends.org> napsal:
On 7/1/20 10:34 PM, Pavel Stehule wrote:
> st 1. 7. 2020 v 22:31 odesílatel Vik Fearing <vik@postgresfriends.org>
> napsal:
>
>> On 7/1/20 9:32 PM, Pavel Stehule wrote:
>>> st 1. 7. 2020 v 20:19 odesílatel Vik Fearing <vik@postgresfriends.org>
>>> napsal:
>>>
>>>> On 7/1/20 3:36 PM, Robert Haas wrote:
>>>>> I actually don't have a very clear idea of what the standard has to
>>>>> say about SQL-language functions. Does it just say it's a list of
>>>>> statements, or does it involve variables and control-flow constructs
>>>>> and stuff like that, too?
>>>>
>>>>
>>>> It's either a single sql statement, or a collection of them between
>>>> "begin atomic" and "end". There are no variables or flow control
>>>> constructs or anything like that, just as there are no such things
>>>> outside of a function.
>>>>
>>>
>>> What is the source of this comment?
>>
>>
>> The SQL Standard.
>>
>
> The SQL Standard is really big, and is very possible so I miss this part.
> Can you send me a link?
ISO/IEC 9075-2:2016 Section 11.60 <SQL-invoked routine>
I am looking there, and it looks like a subset of SQL/PSM or better - SQL/PSM is extending this. But this part is a little bit strange, because it doesn't introduce its own variables, but it is working with the concept of host variables and is a little bit messy (for me). Looks like it is introduced for usage in triggers. If we support triggers without trigger functions, then it has sense. Without it - It is hard for me to imagine a use case for this reduced language.
Regards
Pavel
--
Vik Fearing
On Wed, Jul 1, 2020 at 5:49 AM Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote: > - More test coverage is needed. Surprisingly, there wasn't actually any > test AFAICT that just creates and SQL function and runs it. Most of > that code is tested incidentally, but there is very little or no > targeted testing of this functionality. FYI cfbot showed a sign of some kind of error_context_stack corruption while running "DROP TABLE functest3 CASCADE;".
Thomas Munro <thomas.munro@gmail.com> writes: > On Wed, Jul 1, 2020 at 5:49 AM Peter Eisentraut > <peter.eisentraut@2ndquadrant.com> wrote: >> - More test coverage is needed. Surprisingly, there wasn't actually any >> test AFAICT that just creates and SQL function and runs it. Most of >> that code is tested incidentally, but there is very little or no >> targeted testing of this functionality. > FYI cfbot showed a sign of some kind of error_context_stack corruption > while running "DROP TABLE functest3 CASCADE;". BTW, it occurs to me after answering bug #16534 that contrib/earthdistance's SQL functions would be great candidates for this new syntax. Binding their references at creation time is really exactly what we want. I still feel that we can't just replace the existing implementation, though, as that would kill too many use-cases where late binding is helpful. regards, tom lane
On 2020-06-30 19:49, Peter Eisentraut wrote: > This adds support for writing CREATE FUNCTION and CREATE PROCEDURE > statements for language SQL with a function body that conforms to the > SQL standard and is portable to other implementations. Here is a new patch. The only significant change is that pg_dump support is now fixed. Per the discussion in [0], I have introduced a new function pg_get_function_sqlbody() that just produces the formatted SQL body, not the whole function definition. All the tests now pass. As mentioned before, more tests are probably needed, so if reviewers just want to play with this and find things that don't work, those could be put into test cases, for example. As a thought, a couple of things could probably be separated from this patch and considered separately: 1. making LANGUAGE SQL the default 2. the RETURN statement If reviewers think that would be sensible, I can prepare separate patches for those. [0]: https://www.postgresql.org/message-id/flat/9df8a3d3-13d2-116d-26ab-6a273c1ed38c%402ndquadrant.com -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
Some conflicts have emerged, so here is an updated patch. I have implemented/fixed the inlining of set-returning functions written in the new style, which was previously marked TODO in the patch. On 2020-08-28 07:33, Peter Eisentraut wrote: > On 2020-06-30 19:49, Peter Eisentraut wrote: >> This adds support for writing CREATE FUNCTION and CREATE PROCEDURE >> statements for language SQL with a function body that conforms to the >> SQL standard and is portable to other implementations. > > Here is a new patch. The only significant change is that pg_dump > support is now fixed. Per the discussion in [0], I have introduced a > new function pg_get_function_sqlbody() that just produces the formatted > SQL body, not the whole function definition. All the tests now pass. > As mentioned before, more tests are probably needed, so if reviewers > just want to play with this and find things that don't work, those could > be put into test cases, for example. > > As a thought, a couple of things could probably be separated from this > patch and considered separately: > > 1. making LANGUAGE SQL the default > > 2. the RETURN statement > > If reviewers think that would be sensible, I can prepare separate > patches for those. > > > [0]: > https://www.postgresql.org/message-id/flat/9df8a3d3-13d2-116d-26ab-6a273c1ed38c%402ndquadrant.com > -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
On Mon, Sep 07, 2020 at 08:00:08AM +0200, Peter Eisentraut wrote: > Some conflicts have emerged, so here is an updated patch. > > I have implemented/fixed the inlining of set-returning functions written in > the new style, which was previously marked TODO in the patch. The CF bot is telling that this patch fails to apply. Could you send a rebase? -- Michael
Attachment
On 2020-09-29 07:42, Michael Paquier wrote: > On Mon, Sep 07, 2020 at 08:00:08AM +0200, Peter Eisentraut wrote: >> Some conflicts have emerged, so here is an updated patch. >> >> I have implemented/fixed the inlining of set-returning functions written in >> the new style, which was previously marked TODO in the patch. > > The CF bot is telling that this patch fails to apply. Could you send > a rebase? Here is a rebase, no functionality changes. As indicated earlier, I'll also send some sub-patches as separate submissions. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
Here is another updated patch. I did some merging and some small fixes and introduced the pg_proc column prosqlbody to store the parsed function body separately from probin. Aside from one TODO left it seems feature-complete to me for now. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
Hi, I noticed that this patch fails on the cfbot. For this, I changed the status to: 'Waiting on Author'. Cheers, //Georgios The new status of this patch is: Waiting on Author
On 2020-11-10 16:21, Georgios Kokolatos wrote: > Hi, > > I noticed that this patch fails on the cfbot. > For this, I changed the status to: 'Waiting on Author'. > > Cheers, > //Georgios > > The new status of this patch is: Waiting on Author > Here is an updated patch to get it building again. -- Peter Eisentraut 2ndQuadrant, an EDB company https://www.2ndquadrant.com/
Attachment
On 2020-11-20 08:25, Peter Eisentraut wrote: > On 2020-11-10 16:21, Georgios Kokolatos wrote: >> Hi, >> >> I noticed that this patch fails on the cfbot. >> For this, I changed the status to: 'Waiting on Author'. >> >> Cheers, >> //Georgios >> >> The new status of this patch is: Waiting on Author >> > > Here is an updated patch to get it building again. Another updated patch to get things building again. I've also fixed the last TODO I had in there in qualifying function arguments as necessary in ruleutils.c. -- Peter Eisentraut 2ndQuadrant, an EDB company https://www.2ndquadrant.com/
Attachment
On 15.03.21 09:14, Julien Rouhaud wrote: > On Mon, Mar 15, 2021 at 04:03:44PM +0800, Julien Rouhaud wrote: >> On Mon, Mar 15, 2021 at 01:05:11AM -0500, Jaime Casanova wrote: >>> I found another problem when using CASE expressions: >>> >>> CREATE OR REPLACE FUNCTION foo_case() >>> RETURNS boolean >>> LANGUAGE SQL >>> BEGIN ATOMIC >>> select case when random() > 0.5 then true else false end; >>> END; >>> >>> apparently the END in the CASE expression is interpreted as the END of >>> the function >> >> I think that it's an issue in psql scanner. If you escape the semicolon or >> force a single query execution (say with psql -c), it works as expected. > > Applying the following diff (not sending a patch to avoid breaking the cfbot) > the issue and doesn't seem to break anything else: Right. Here is a new patch with that fix added and a small conflict resolved.
Attachment
On Fri, Mar 19, 2021 at 8:49 AM Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote: > > Right. Here is a new patch with that fix added and a small conflict > resolved. Great. It seems print_function_sqlbody() is not protected to avoid receiving a function that hasn't a standard sql body in src/backend/utils/adt/ruleutils.c:3292, but instead it has an assert that gets hit with something like this: CREATE FUNCTION foo() RETURNS int LANGUAGE SQL AS $$ SELECT 1 $$; SELECT pg_get_function_sqlbody('foo'::regproc); -- Jaime Casanova Director de Servicios Profesionales SYSTEMGUARDS - Consultores de PostgreSQL
On Tue, Mar 23, 2021 at 11:28:55PM -0500, Jaime Casanova wrote: > On Fri, Mar 19, 2021 at 8:49 AM Peter Eisentraut > <peter.eisentraut@2ndquadrant.com> wrote: > > > > Right. Here is a new patch with that fix added and a small conflict > > resolved. > > Great. > > It seems print_function_sqlbody() is not protected to avoid receiving > a function that hasn't a standard sql body in > src/backend/utils/adt/ruleutils.c:3292, but instead it has an assert > that gets hit with something like this: > > CREATE FUNCTION foo() RETURNS int LANGUAGE SQL AS $$ SELECT 1 $$; > SELECT pg_get_function_sqlbody('foo'::regproc); It would also be good to add a regression test checking that we can't define a function with both a prosrc and a prosqlbody. @@ -76,6 +77,7 @@ ProcedureCreate(const char *procedureName, Oid languageValidator, const char *prosrc, const char *probin, + Node *prosqlbody, char prokind, bool security_definer, bool isLeakProof, @@ -119,8 +121,6 @@ ProcedureCreate(const char *procedureName, /* * sanity checks */ - Assert(PointerIsValid(prosrc)); - parameterCount = parameterTypes->dim1; Shouldn't we still assert that we either have a valid procsrc or valid prosqlbody? No other comments apart from that!
On 31.03.21 12:12, Julien Rouhaud wrote: > On Tue, Mar 23, 2021 at 11:28:55PM -0500, Jaime Casanova wrote: >> On Fri, Mar 19, 2021 at 8:49 AM Peter Eisentraut >> <peter.eisentraut@2ndquadrant.com> wrote: >>> >>> Right. Here is a new patch with that fix added and a small conflict >>> resolved. >> >> Great. >> >> It seems print_function_sqlbody() is not protected to avoid receiving >> a function that hasn't a standard sql body in >> src/backend/utils/adt/ruleutils.c:3292, but instead it has an assert >> that gets hit with something like this: >> >> CREATE FUNCTION foo() RETURNS int LANGUAGE SQL AS $$ SELECT 1 $$; >> SELECT pg_get_function_sqlbody('foo'::regproc); fixed > It would also be good to add a regression test checking that we can't define a > function with both a prosrc and a prosqlbody. done > @@ -76,6 +77,7 @@ ProcedureCreate(const char *procedureName, > Oid languageValidator, > const char *prosrc, > const char *probin, > + Node *prosqlbody, > char prokind, > bool security_definer, > bool isLeakProof, > @@ -119,8 +121,6 @@ ProcedureCreate(const char *procedureName, > /* > * sanity checks > */ > - Assert(PointerIsValid(prosrc)); > - > parameterCount = parameterTypes->dim1; > > > Shouldn't we still assert that we either have a valid procsrc or valid > prosqlbody? fixed New patch attached.
Attachment
On Fri, Apr 02, 2021 at 02:25:15PM +0200, Peter Eisentraut wrote: > > New patch attached. Thanks, it all looks good to me. I just spot a few minor formatting issues: @@ -2968,6 +2973,13 @@ pg_get_functiondef(PG_FUNCTION_ARGS) } /* And finally the function definition ... */ + tmp = SysCacheGetAttr(PROCOID, proctup, Anum_pg_proc_prosqlbody, &isnull); + if (proc->prolang == SQLlanguageId && !isnull) + { + print_function_sqlbody(&buf, proctup); + } + else + { appendStringInfoString(&buf, "AS "); tmp = SysCacheGetAttr(PROCOID, proctup, Anum_pg_proc_probin, &isnull); @@ -2999,6 +3011,7 @@ pg_get_functiondef(PG_FUNCTION_ARGS) appendBinaryStringInfo(&buf, dq.data, dq.len); appendStringInfoString(&buf, prosrc); appendBinaryStringInfo(&buf, dq.data, dq.len); + } The curly braces could probably be removed for the if branch, and the code in the else branch isn't properly indented. Other occurences: + else + { + src = TextDatumGetCString(tmp); + + callback_arg.prosrc = src; + /* * Set up to handle parameters while parsing the function body. We need a * dummy FuncExpr node containing the already-simplified arguments to pass @@ -4317,6 +4337,7 @@ inline_function(Oid funcid, Oid result_type, Oid result_collid, querytree = transformTopLevelStmt(pstate, linitial(raw_parsetree_list)); free_parsestate(pstate); + } and + else + { + char *src; + + src = TextDatumGetCString(tmp); + + callback_arg.prosrc = src; + /* * Set up to handle parameters while parsing the function body. We can * use the FuncExpr just created as the input for @@ -4829,18 +4878,6 @@ inline_set_returning_function(PlannerInfo *root, RangeTblEntry *rte) (Node *) fexpr, fexpr->inputcollid); - /* and @@ -2968,6 +2973,13 @@ pg_get_functiondef(PG_FUNCTION_ARGS) } /* And finally the function definition ... */ + tmp = SysCacheGetAttr(PROCOID, proctup, Anum_pg_proc_prosqlbody, &isnull); + if (proc->prolang == SQLlanguageId && !isnull) + { + print_function_sqlbody(&buf, proctup); + } + else + { appendStringInfoString(&buf, "AS "); tmp = SysCacheGetAttr(PROCOID, proctup, Anum_pg_proc_probin, &isnull); @@ -2999,6 +3011,7 @@ pg_get_functiondef(PG_FUNCTION_ARGS) appendBinaryStringInfo(&buf, dq.data, dq.len); appendStringInfoString(&buf, prosrc); appendBinaryStringInfo(&buf, dq.data, dq.len); + } and @@ -12290,7 +12309,11 @@ dumpFunc(Archive *fout, const FuncInfo *finfo) * versions would set it to "-". There are no known cases in which prosrc * is unused, so the tests below for "-" are probably useless. */ - if (probin[0] != '\0' && strcmp(probin, "-") != 0) + if (prosqlbody) + { + appendPQExpBufferStr(asPart, prosqlbody); + } Are you planning to run pg_indent before committing or would that add too much noise? Anyway since it's only stylistic issues and the feature freeze is getting closer I'm marking the patch as ready for committer.
On 03.04.21 05:39, Julien Rouhaud wrote: > Are you planning to run pg_indent before committing or would that add too much > noise? Yeah, I figured I'd leave that for later, to not bloat the patch so much. > Anyway since it's only stylistic issues and the feature freeze is getting > closer I'm marking the patch as ready for committer. Committed. Thanks!
Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes: > Committed. Thanks! Buildfarm suggests this has some issues under force_parallel_mode. I'm wondering about missed fields in outfuncs/readfuncs, or the like. regards, tom lane
On Wed, Apr 07, 2021 at 04:22:17PM -0400, Tom Lane wrote: > Buildfarm suggests this has some issues under force_parallel_mode. > I'm wondering about missed fields in outfuncs/readfuncs, or the like. The problem looks a bit more fundamental to me, as there seems to be some confusion with the concept of what should be the query string when it comes to prosqlbody with a parallel run, as it replaces prosrc in some cases where the function uses SQL as language. If the buildfarm cannot be put back to green, could it be possible to revert this patch? -- Michael
Attachment
Michael Paquier <michael@paquier.xyz> writes: > On Wed, Apr 07, 2021 at 04:22:17PM -0400, Tom Lane wrote: >> Buildfarm suggests this has some issues under force_parallel_mode. >> I'm wondering about missed fields in outfuncs/readfuncs, or the like. > The problem looks a bit more fundamental to me, as there seems to be > some confusion with the concept of what should be the query string > when it comes to prosqlbody with a parallel run, as it replaces prosrc > in some cases where the function uses SQL as language. If the > buildfarm cannot be put back to green, could it be possible to revert > this patch? Andres pushed a stopgap fix. We might end up reverting the patch altogether for v14, but I don't want to be hasty. This should be enough to let people take advantage of the last few hours before feature freeze. regards, tom lane
Hi, On 2021-04-08 01:16:02 -0400, Tom Lane wrote: > Michael Paquier <michael@paquier.xyz> writes: > > On Wed, Apr 07, 2021 at 04:22:17PM -0400, Tom Lane wrote: > >> Buildfarm suggests this has some issues under force_parallel_mode. > >> I'm wondering about missed fields in outfuncs/readfuncs, or the like. > > > The problem looks a bit more fundamental to me, as there seems to be > > some confusion with the concept of what should be the query string > > when it comes to prosqlbody with a parallel run, as it replaces prosrc > > in some cases where the function uses SQL as language. If the > > buildfarm cannot be put back to green, could it be possible to revert > > this patch? > > Andres pushed a stopgap fix. Let's hope that it does fix it on the BF as well. One holdup was that check-world didn't succeed with force_parallel_mode=regress even after the fix - but that turned out to be the fault of commit 5fd9dfa5f50e4906c35133a414ebec5b6d518493 (HEAD) Author: Bruce Momjian <bruce@momjian.us> Date: 2021-04-07 13:06:47 -0400 Move pg_stat_statements query jumbling to core. et al. > We might end up reverting the patch altogether for v14, but I don't > want to be hasty. This should be enough to let people take advantage > of the last few hours before feature freeze. Yea, I think it'd be good to make that decision after a decent night of sleep or two. And an actual look at the issues the patch might (or might not) have. Independent of this patch, it might be a good idea to have ExecInitParallelPlan() be robust against NULL querystrings. Places like executor_errposition() are certainly trying to be... Greetings, Andres Freund
Andres Freund <andres@anarazel.de> writes: > Independent of this patch, it might be a good idea to have > ExecInitParallelPlan() be robust against NULL querystrings. Places like > executor_errposition() are certainly trying to be... FWIW, I think the long-term drift of things is definitely that we want to have the querystring available everywhere. Code like executor_errposition is from an earlier era before we were trying to enforce that. In particular, if the querystring is available in the leader and not the workers, then you will get different error reporting behavior in parallel query than non-parallel query, which is surely a bad thing. So IMO what you did here is definitely a short-term thing that we should be looking to revert. The question at hand is why Peter's patch broke this in the first place, and how hard it will be to fix it properly. I'm entirely on board with reverting the feature if that isn't readily fixable. regards, tom lane
On Wed, Apr 07, 2021 at 10:27:59PM -0700, Andres Freund wrote: > > One holdup was that > check-world didn't succeed with force_parallel_mode=regress even after > the fix - but that turned out to be the fault of > > commit 5fd9dfa5f50e4906c35133a414ebec5b6d518493 (HEAD) > Author: Bruce Momjian <bruce@momjian.us> > Date: 2021-04-07 13:06:47 -0400 > > Move pg_stat_statements query jumbling to core. > > et al. Yep, I'm on it!
Julien Rouhaud <rjuju123@gmail.com> writes: > On Wed, Apr 07, 2021 at 10:27:59PM -0700, Andres Freund wrote: >> One holdup was that >> check-world didn't succeed with force_parallel_mode=regress even after >> the fix - but that turned out to be the fault of >>>> Move pg_stat_statements query jumbling to core. > Yep, I'm on it! So far the buildfarm seems to be turning green after b3ee4c503 ... so I wonder what extra condition is needed to cause the failure Andres is seeing. regards, tom lane
Hi, On 2021-04-08 02:05:25 -0400, Tom Lane wrote: > So far the buildfarm seems to be turning green after b3ee4c503 ... > so I wonder what extra condition is needed to cause the failure > Andres is seeing. Nothing special, really. Surprised the BF doesn't see it: andres@awork3:~/build/postgres/dev-assert/vpath$ cat /tmp/test.conf force_parallel_mode=regress andres@awork3:~/build/postgres/dev-assert/vpath$ make -j48 -s && EXTRA_REGRESS_OPTS='--temp-config /tmp/test.conf' make -s-C contrib/pg_stat_statements/ check All of PostgreSQL successfully made. Ready to install. ... The differences that caused some tests to fail can be viewed in the file "/home/andres/build/postgres/dev-assert/vpath/contrib/pg_stat_statements/regression.diffs". A copy of the test summarythat you see above is saved in the file "/home/andres/build/postgres/dev-assert/vpath/contrib/pg_stat_statements/regression.out". ... andres@awork3:~/build/postgres/dev-assert/vpath$ head -n 30 /home/andres/build/postgres/dev-assert/vpath/contrib/pg_stat_statements/regression.diffs diff -du10 /home/andres/src/postgresql/contrib/pg_stat_statements/expected/pg_stat_statements.out /home/andres/build/postgres/dev-assert/vpath/contrib/pg_stat_statements/results/pg_stat_statements.out --- /home/andres/src/postgresql/contrib/pg_stat_statements/expected/pg_stat_statements.out 2021-04-06 09:08:42.688697932-0700 +++ /home/andres/build/postgres/dev-assert/vpath/contrib/pg_stat_statements/results/pg_stat_statements.out 2021-04-0723:30:26.876071024 -0700 @@ -118,37 +118,38 @@ ?column? | ?column? ----------+---------- 1 | test (1 row) DEALLOCATE pgss_test; SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C"; query | calls | rows ------------------------------------------------------------------------------+-------+------ PREPARE pgss_test (int) AS SELECT $1, $2 LIMIT $3 | 1 | 1 - SELECT $1 +| 4 | 4 + PREPARE pgss_test (int) AS SELECT $1, 'test' LIMIT 1; | 1 | 1 + SELECT $1 +| 8 | 8 +| | AS "text" | | - SELECT $1 + $2 | 2 | 2 - SELECT $1 + $2 + $3 AS "add" | 3 | 3 - SELECT $1 AS "float" | 1 | 1 - SELECT $1 AS "int" | 2 | 2 + SELECT $1 + $2 | 4 | 4 + SELECT $1 + $2 + $3 AS "add" | 6 | 6 + SELECT $1 AS "float" | 2 | 2 + SELECT $1 AS "int" | 4 | 4 SELECT $1 AS i UNION SELECT $2 ORDER BY i | 1 | 2 - SELECT $1 || $2 | 1 | 1 - SELECT pg_stat_statements_reset() | 1 | 1 Too tired to figure out why the BF doesn't see this. Perhaps the extra settings aren't used because it's scripted as an install check? Greetings, Andres Freund
Hi, On 2021-04-08 01:41:40 -0400, Tom Lane wrote: > Andres Freund <andres@anarazel.de> writes: > > Independent of this patch, it might be a good idea to have > > ExecInitParallelPlan() be robust against NULL querystrings. Places like > > executor_errposition() are certainly trying to be... > > FWIW, I think the long-term drift of things is definitely that > we want to have the querystring available everywhere. Code like > executor_errposition is from an earlier era before we were trying > to enforce that. In particular, if the querystring is available in > the leader and not the workers, then you will get different error > reporting behavior in parallel query than non-parallel query, which > is surely a bad thing. Yea, I think it's a sensible direction - but I think we should put the line in the sand earlier on / higher up than ExecInitParallelPlan(). Greetings, Andres Freund
On Wed, Apr 07, 2021 at 11:33:20PM -0700, Andres Freund wrote: > Hi, > > On 2021-04-08 02:05:25 -0400, Tom Lane wrote: > > So far the buildfarm seems to be turning green after b3ee4c503 ... > > so I wonder what extra condition is needed to cause the failure > > Andres is seeing. > > Nothing special, really. Surprised the BF doesn't see it: > > andres@awork3:~/build/postgres/dev-assert/vpath$ cat /tmp/test.conf > force_parallel_mode=regress > andres@awork3:~/build/postgres/dev-assert/vpath$ make -j48 -s && EXTRA_REGRESS_OPTS='--temp-config /tmp/test.conf' make-s -C contrib/pg_stat_statements/ check > All of PostgreSQL successfully made. Ready to install. > ... > The differences that caused some tests to fail can be viewed in the > file "/home/andres/build/postgres/dev-assert/vpath/contrib/pg_stat_statements/regression.diffs". A copy of the test summarythat you see > above is saved in the file "/home/andres/build/postgres/dev-assert/vpath/contrib/pg_stat_statements/regression.out". > ... Is think this is because the buildfarm client is running installcheck for the contribs rather than check, and pg_stat_statements/Makefile has: # Disabled because these tests require "shared_preload_libraries=pg_stat_statements", # which typical installcheck users do not have (e.g. buildfarm clients). NO_INSTALLCHECK = 1
Julien Rouhaud <rjuju123@gmail.com> writes: > On Wed, Apr 07, 2021 at 11:33:20PM -0700, Andres Freund wrote: >> Nothing special, really. Surprised the BF doesn't see it: > Is think this is because the buildfarm client is running installcheck for the > contribs rather than check, and pg_stat_statements/Makefile has: > # Disabled because these tests require "shared_preload_libraries=pg_stat_statements", > # which typical installcheck users do not have (e.g. buildfarm clients). > NO_INSTALLCHECK = 1 No, because if that were the explanation then we'd be getting no buildfarm coverage at all for for pg_stat_statements. Which aside from being awful contradicts the results at coverage.postgresql.org. I think Andres has the right idea that there's some more-subtle variation in the test conditions, but (yawn) too tired to look into it right now. regards, tom lane
On Wed, Apr 07, 2021 at 11:35:14PM -0700, Andres Freund wrote: > On 2021-04-08 01:41:40 -0400, Tom Lane wrote: >> Andres Freund <andres@anarazel.de> writes: >> FWIW, I think the long-term drift of things is definitely that >> we want to have the querystring available everywhere. Code like >> executor_errposition is from an earlier era before we were trying >> to enforce that. In particular, if the querystring is available in >> the leader and not the workers, then you will get different error >> reporting behavior in parallel query than non-parallel query, which >> is surely a bad thing. > > Yea, I think it's a sensible direction - but I think we should put the > line in the sand earlier on / higher up than ExecInitParallelPlan(). Indeed, I agree that enforcing the availability of querystring everywhere sounds like a sensible thing to do in terms of consistency, and that's my impression when I scanned the parallel execution code, and I don't really get why SQL function bodies should not bind by this rule. Would people object if I add an open item to track that? -- Michael
Attachment
On Thu, Apr 08, 2021 at 04:54:56PM +0900, Michael Paquier wrote: > On Wed, Apr 07, 2021 at 11:35:14PM -0700, Andres Freund wrote: > > On 2021-04-08 01:41:40 -0400, Tom Lane wrote: > >> Andres Freund <andres@anarazel.de> writes: > >> FWIW, I think the long-term drift of things is definitely that > >> we want to have the querystring available everywhere. Code like > >> executor_errposition is from an earlier era before we were trying > >> to enforce that. In particular, if the querystring is available in > >> the leader and not the workers, then you will get different error > >> reporting behavior in parallel query than non-parallel query, which > >> is surely a bad thing. > > > > Yea, I think it's a sensible direction - but I think we should put the > > line in the sand earlier on / higher up than ExecInitParallelPlan(). > > Indeed, I agree that enforcing the availability of querystring > everywhere sounds like a sensible thing to do in terms of consistency, > and that's my impression when I scanned the parallel execution code, > and I don't really get why SQL function bodies should not bind by this > rule. Would people object if I add an open item to track that? It makes sense, +1 for an open item.
On Thu, Apr 08, 2021 at 02:58:02AM -0400, Tom Lane wrote: > Julien Rouhaud <rjuju123@gmail.com> writes: > > On Wed, Apr 07, 2021 at 11:33:20PM -0700, Andres Freund wrote: > >> Nothing special, really. Surprised the BF doesn't see it: > > > Is think this is because the buildfarm client is running installcheck for the > > contribs rather than check, and pg_stat_statements/Makefile has: > > # Disabled because these tests require "shared_preload_libraries=pg_stat_statements", > > # which typical installcheck users do not have (e.g. buildfarm clients). > > NO_INSTALLCHECK = 1 > > No, because if that were the explanation then we'd be getting no > buildfarm coverage at all for for pg_stat_statements. Which aside > from being awful contradicts the results at coverage.postgresql.org. Is there any chance that coverage.postgresql.org isn't backed by the buildfarm client but a plain make check-world or something like that? > I think Andres has the right idea that there's some more-subtle > variation in the test conditions, but (yawn) too tired to look > into it right now. I tried to look at some force-parallel-mode animal, like mantid, and I don't see any evidence of pg_stat_statements being run by a "make check", and only a few contrib modules seem to have an explicit check phase. However, looking at https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=mantid&dt=2021-04-08%2007%3A07%3A05 I see https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=mantid&dt=2021-04-08%2007%3A07%3A05&stg=contrib-install-check-C: [...] make -C pg_stat_statements installcheck make[1]: Entering directory `/u1/tac/build-farm-11/buildroot/HEAD/pgsql.build/contrib/pg_stat_statements' make[1]: Nothing to be done for `installcheck'.
Julien Rouhaud <rjuju123@gmail.com> writes: > On Thu, Apr 08, 2021 at 02:58:02AM -0400, Tom Lane wrote: >> No, because if that were the explanation then we'd be getting no >> buildfarm coverage at all for for pg_stat_statements. Which aside >> from being awful contradicts the results at coverage.postgresql.org. > Is there any chance that coverage.postgresql.org isn't backed by the buildfarm > client but a plain make check-world or something like that? Hmm, I think you're right. Poking around in the log files from one of my own buildfarm animals, there's no evidence that pg_stat_statements is being tested at all. Needless to say, that's just horrid :-( I see that contrib/test_decoding also sets NO_INSTALLCHECK = 1, and the reason it gets tested is that the buildfarm script has a special module for that. I guess we need to clone that module, or maybe better, find a way to generalize it. There are also some src/test/modules modules that set NO_INSTALLCHECK, but apparently those do have coverage (modules-check is the step that runs their SQL tests, and then the TAP tests if any get broken out as separate buildfarm steps). regards, tom lane
On 4/8/21 2:40 AM, Julien Rouhaud wrote: > On Wed, Apr 07, 2021 at 11:33:20PM -0700, Andres Freund wrote: >> Hi, >> >> On 2021-04-08 02:05:25 -0400, Tom Lane wrote: >>> So far the buildfarm seems to be turning green after b3ee4c503 ... >>> so I wonder what extra condition is needed to cause the failure >>> Andres is seeing. >> Nothing special, really. Surprised the BF doesn't see it: >> >> andres@awork3:~/build/postgres/dev-assert/vpath$ cat /tmp/test.conf >> force_parallel_mode=regress >> andres@awork3:~/build/postgres/dev-assert/vpath$ make -j48 -s && EXTRA_REGRESS_OPTS='--temp-config /tmp/test.conf' make-s -C contrib/pg_stat_statements/ check >> All of PostgreSQL successfully made. Ready to install. >> ... >> The differences that caused some tests to fail can be viewed in the >> file "/home/andres/build/postgres/dev-assert/vpath/contrib/pg_stat_statements/regression.diffs". A copy of the test summarythat you see >> above is saved in the file "/home/andres/build/postgres/dev-assert/vpath/contrib/pg_stat_statements/regression.out". >> ... > Is think this is because the buildfarm client is running installcheck for the > contribs rather than check, and pg_stat_statements/Makefile has: > > # Disabled because these tests require "shared_preload_libraries=pg_stat_statements", > # which typical installcheck users do not have (e.g. buildfarm clients). > NO_INSTALLCHECK = 1 > > > Yeah, Julien is right, we run "make check" for these in src/test/modules but I missed contrib. I have fixed this on crake so we get some immediate coverage and a fix will be pushed to github shortly. cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
On 2021-Apr-08, Julien Rouhaud wrote: > On Thu, Apr 08, 2021 at 02:58:02AM -0400, Tom Lane wrote: > > No, because if that were the explanation then we'd be getting no > > buildfarm coverage at all for for pg_stat_statements. Which aside > > from being awful contradicts the results at coverage.postgresql.org. > > Is there any chance that coverage.postgresql.org isn't backed by the buildfarm > client but a plain make check-world or something like that? Yes, coverage.pg.org runs "make check-world". Maybe it would make sense to change that script, so that it runs the buildfarm's run_build.pl script instead of "make check-world". That would make coverage.pg.org report what the buildfarm actually tests ... it would have made this problem a bit more obvious. -- Álvaro Herrera 39°49'30"S 73°17'W
On Thu, Apr 08, 2021 at 12:21:05PM -0400, Tom Lane wrote: > I see that contrib/test_decoding also sets NO_INSTALLCHECK = 1, > and the reason it gets tested is that the buildfarm script has > a special module for that. I guess we need to clone that module, > or maybe better, find a way to generalize it. > > There are also some src/test/modules modules that set NO_INSTALLCHECK, > but apparently those do have coverage (modules-check is the step that > runs their SQL tests, and then the TAP tests if any get broken out > as separate buildfarm steps). FWIW, on Windows any module with NO_INSTALLCHECK does not get tested as we rely mostly on an installed server to do all the tests and avoid the performance impact of setting up a new server for each module's test. -- Michael
Attachment
Julien Rouhaud <rjuju123@gmail.com> writes: > On Thu, Apr 08, 2021 at 04:54:56PM +0900, Michael Paquier wrote: >> Indeed, I agree that enforcing the availability of querystring >> everywhere sounds like a sensible thing to do in terms of consistency, >> and that's my impression when I scanned the parallel execution code, >> and I don't really get why SQL function bodies should not bind by this >> rule. Would people object if I add an open item to track that? > It makes sense, +1 for an open item. So here's what I propose to do about this. 0001 attached reverts the patch's change to remove the NOT NULL constraint on pg_proc.prosrc. I think that was an extremely poor decision; it risks breaking non-core PLs, and for that matter I'm not sure the core PLs wouldn't crash on null prosrc. It is not any harder for the SQL-language-related code to condition its checks on not-null prosqlbody instead of null prosrc. Of course that then requires us to put something into prosrc for these newfangled functions, but in 0001 I just used an empty string. (This patch also adds an Assert to standard_ExecutorStart checking that some source text was provided, responding to Andres' point that we should be checking that upstream of parallel query. We should then revert b3ee4c503, but for simplicity I didn't include that here.) 0002 addresses a different missing-source-text problem, which is that the patch didn't bother to provide source text while running parse analysis on the SQL function body. That means no error cursors for problems; which might seem cosmetic on the toy example I added to the regression tests, but it won't be for people writing functions that are dozens or hundreds of lines long. Finally, 0003 might be a bit controversial: it changes the stored prosrc for new-style SQL functions to be the query text of the CREATE FUNCTION command. The main argument I can see being made against this is that it'll bloat the pg_proc entry. But I think that that's not a terribly reasonable concern, because the source text is going to be a good deal smaller than the nodeToString representation in just about every case. The real value of 0003 of course would be to get an error cursor at runtime, but I failed to create an example where that would happen today. Right now there are only three calls of executor_errposition, and all of them are for cases that are already rejected by the parser, so they're effectively unreachable. A scenario that seems more likely to be reachable is a failure reported during function inlining, but most of the reasons I can think of for that also seem unreachable given the already-parse-analyzed nature of the function body in these cases. Maybe I'm just under-caffeinated today. Another point here is that for any error cursor to appear, we need not only source text at hand but also token locations in the query tree nodes. Right now, since readfuncs.c intentionally discards those locations, we won't have that. There is not-normally-compiled logic to reload those location fields, though, and I think before too long we'll want to enable it in some mainstream cases --- notably parallel query's shipping of querytrees to workers. However, until it gets easier to reach cases where an error-with-location can be thrown from the executor, I don't feel a need to do that. I do have ambitions to make execution-time errors produce cursors in more cases, so I think this will come to fruition before long, but not in v14. One could make an argument, therefore, for holding off 0003 until there's more support for execution-time error cursors. I don't think we should though, for two reasons: 1. It'd be better to keep the pg_proc representation of new-style SQL functions stable across versions. 2. Storing the CREATE text means we'll capture comments associated with the function text, which is something that at least some people will complain about the loss of. Admittedly we have no way to re-integrate the comments into the de-parsed body, but some folks might be satisfied with grabbing the prosrc text. Thoughts? regards, tom lane diff --git a/src/backend/catalog/pg_proc.c b/src/backend/catalog/pg_proc.c index cd13e63852..478dbde3fe 100644 --- a/src/backend/catalog/pg_proc.c +++ b/src/backend/catalog/pg_proc.c @@ -121,7 +121,7 @@ ProcedureCreate(const char *procedureName, /* * sanity checks */ - Assert(PointerIsValid(prosrc) || PointerIsValid(prosqlbody)); + Assert(PointerIsValid(prosrc)); parameterCount = parameterTypes->dim1; if (parameterCount < 0 || parameterCount > FUNC_MAX_ARGS) @@ -336,10 +336,7 @@ ProcedureCreate(const char *procedureName, values[Anum_pg_proc_protrftypes - 1] = trftypes; else nulls[Anum_pg_proc_protrftypes - 1] = true; - if (prosrc) - values[Anum_pg_proc_prosrc - 1] = CStringGetTextDatum(prosrc); - else - nulls[Anum_pg_proc_prosrc - 1] = true; + values[Anum_pg_proc_prosrc - 1] = CStringGetTextDatum(prosrc); if (probin) values[Anum_pg_proc_probin - 1] = CStringGetTextDatum(probin); else @@ -874,26 +871,29 @@ fmgr_sql_validator(PG_FUNCTION_ARGS) /* Postpone body checks if !check_function_bodies */ if (check_function_bodies) { + tmp = SysCacheGetAttr(PROCOID, tuple, Anum_pg_proc_prosrc, &isnull); + if (isnull) + elog(ERROR, "null prosrc"); + + prosrc = TextDatumGetCString(tmp); + /* * Setup error traceback support for ereport(). */ callback_arg.proname = NameStr(proc->proname); - callback_arg.prosrc = NULL; + callback_arg.prosrc = prosrc; sqlerrcontext.callback = sql_function_parse_error_callback; sqlerrcontext.arg = (void *) &callback_arg; sqlerrcontext.previous = error_context_stack; error_context_stack = &sqlerrcontext; - tmp = SysCacheGetAttr(PROCOID, tuple, Anum_pg_proc_prosrc, &isnull); - if (isnull) + /* If we have prosqlbody, pay attention to that not prosrc */ + tmp = SysCacheGetAttr(PROCOID, tuple, Anum_pg_proc_prosqlbody, &isnull); + if (!isnull) { Node *n; - tmp = SysCacheGetAttr(PROCOID, tuple, Anum_pg_proc_prosqlbody, &isnull); - if (isnull) - elog(ERROR, "null prosrc and prosqlbody"); - n = stringToNode(TextDatumGetCString(tmp)); if (IsA(n, List)) querytree_list = castNode(List, n); @@ -902,10 +902,6 @@ fmgr_sql_validator(PG_FUNCTION_ARGS) } else { - prosrc = TextDatumGetCString(tmp); - - callback_arg.prosrc = prosrc; - /* * We can't do full prechecking of the function definition if there * are any polymorphic input types, because actual datatypes of @@ -1001,9 +997,6 @@ function_parse_error_transpose(const char *prosrc) int newerrposition; const char *queryText; - if (!prosrc) - return false; - /* * Nothing to do unless we are dealing with a syntax error that has a * cursor position. diff --git a/src/backend/commands/functioncmds.c b/src/backend/commands/functioncmds.c index 199029b7a8..dc317c83af 100644 --- a/src/backend/commands/functioncmds.c +++ b/src/backend/commands/functioncmds.c @@ -958,8 +958,17 @@ interpret_AS_clause(Oid languageOid, const char *languageName, *sql_body_out = (Node *) q; } + /* + * We must put something in prosrc. For the moment, just record an + * empty string. It might be useful to store the original text of the + * CREATE FUNCTION statement --- but to make actual use of that in + * error reports, we'd also have to adjust readfuncs.c to not throw + * away node location fields when reading prosqlbody. + */ + *prosrc_str_p = pstrdup(""); + + /* But we definitely don't need probin. */ *probin_str_p = NULL; - *prosrc_str_p = NULL; } else { diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c index b2e2df8773..2cf6dad768 100644 --- a/src/backend/executor/execMain.c +++ b/src/backend/executor/execMain.c @@ -195,6 +195,8 @@ standard_ExecutorStart(QueryDesc *queryDesc, int eflags) palloc0(nParamExec * sizeof(ParamExecData)); } + /* We now require all callers to provide sourceText */ + Assert(queryDesc->sourceText != NULL); estate->es_sourceText = queryDesc->sourceText; /* diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c index 642683843e..39580f7d57 100644 --- a/src/backend/executor/functions.c +++ b/src/backend/executor/functions.c @@ -667,6 +667,15 @@ init_sql_fcache(FunctionCallInfo fcinfo, Oid collation, bool lazyEvalOK) procedureTuple, Anum_pg_proc_prosrc, &isNull); + if (isNull) + elog(ERROR, "null prosrc for function %u", foid); + fcache->src = TextDatumGetCString(tmp); + + /* If we have prosqlbody, pay attention to that not prosrc. */ + tmp = SysCacheGetAttr(PROCOID, + procedureTuple, + Anum_pg_proc_prosqlbody, + &isNull); /* * Parse and rewrite the queries in the function text. Use sublists to @@ -678,18 +687,11 @@ init_sql_fcache(FunctionCallInfo fcinfo, Oid collation, bool lazyEvalOK) * plancache.c. */ queryTree_list = NIL; - if (isNull) + if (!isNull) { Node *n; List *stored_query_list; - tmp = SysCacheGetAttr(PROCOID, - procedureTuple, - Anum_pg_proc_prosqlbody, - &isNull); - if (isNull) - elog(ERROR, "null prosrc and prosqlbody for function %u", foid); - n = stringToNode(TextDatumGetCString(tmp)); if (IsA(n, List)) stored_query_list = linitial_node(List, castNode(List, n)); @@ -710,8 +712,6 @@ init_sql_fcache(FunctionCallInfo fcinfo, Oid collation, bool lazyEvalOK) { List *raw_parsetree_list; - fcache->src = TextDatumGetCString(tmp); - raw_parsetree_list = pg_parse_query(fcache->src); foreach(lc, raw_parsetree_list) diff --git a/src/backend/optimizer/util/clauses.c b/src/backend/optimizer/util/clauses.c index 526997327c..d9ad4efc5e 100644 --- a/src/backend/optimizer/util/clauses.c +++ b/src/backend/optimizer/util/clauses.c @@ -4317,32 +4317,37 @@ inline_function(Oid funcid, Oid result_type, Oid result_collid, ALLOCSET_DEFAULT_SIZES); oldcxt = MemoryContextSwitchTo(mycxt); + /* Fetch the function body */ + tmp = SysCacheGetAttr(PROCOID, + func_tuple, + Anum_pg_proc_prosrc, + &isNull); + if (isNull) + elog(ERROR, "null prosrc for function %u", funcid); + src = TextDatumGetCString(tmp); + /* * Setup error traceback support for ereport(). This is so that we can * finger the function that bad information came from. */ callback_arg.proname = NameStr(funcform->proname); - callback_arg.prosrc = NULL; + callback_arg.prosrc = src; sqlerrcontext.callback = sql_inline_error_callback; sqlerrcontext.arg = (void *) &callback_arg; sqlerrcontext.previous = error_context_stack; error_context_stack = &sqlerrcontext; - /* Fetch the function body */ + /* If we have prosqlbody, pay attention to that not prosrc */ tmp = SysCacheGetAttr(PROCOID, func_tuple, - Anum_pg_proc_prosrc, + Anum_pg_proc_prosqlbody, &isNull); - if (isNull) + if (!isNull) { Node *n; List *querytree_list; - tmp = SysCacheGetAttr(PROCOID, func_tuple, Anum_pg_proc_prosqlbody, &isNull); - if (isNull) - elog(ERROR, "null prosrc and prosqlbody for function %u", funcid); - n = stringToNode(TextDatumGetCString(tmp)); if (IsA(n, List)) querytree_list = linitial_node(List, castNode(List, n)); @@ -4354,10 +4359,6 @@ inline_function(Oid funcid, Oid result_type, Oid result_collid, } else { - src = TextDatumGetCString(tmp); - - callback_arg.prosrc = src; - /* * Set up to handle parameters while parsing the function body. We need a * dummy FuncExpr node containing the already-simplified arguments to pass @@ -4658,15 +4659,12 @@ sql_inline_error_callback(void *arg) int syntaxerrposition; /* If it's a syntax error, convert to internal syntax error report */ - if (callback_arg->prosrc) + syntaxerrposition = geterrposition(); + if (syntaxerrposition > 0) { - syntaxerrposition = geterrposition(); - if (syntaxerrposition > 0) - { - errposition(0); - internalerrposition(syntaxerrposition); - internalerrquery(callback_arg->prosrc); - } + errposition(0); + internalerrposition(syntaxerrposition); + internalerrquery(callback_arg->prosrc); } errcontext("SQL function \"%s\" during inlining", callback_arg->proname); @@ -4778,6 +4776,7 @@ inline_set_returning_function(PlannerInfo *root, RangeTblEntry *rte) Oid func_oid; HeapTuple func_tuple; Form_pg_proc funcform; + char *src; Datum tmp; bool isNull; MemoryContext oldcxt; @@ -4886,31 +4885,36 @@ inline_set_returning_function(PlannerInfo *root, RangeTblEntry *rte) ALLOCSET_DEFAULT_SIZES); oldcxt = MemoryContextSwitchTo(mycxt); + /* Fetch the function body */ + tmp = SysCacheGetAttr(PROCOID, + func_tuple, + Anum_pg_proc_prosrc, + &isNull); + if (isNull) + elog(ERROR, "null prosrc for function %u", func_oid); + src = TextDatumGetCString(tmp); + /* * Setup error traceback support for ereport(). This is so that we can * finger the function that bad information came from. */ callback_arg.proname = NameStr(funcform->proname); - callback_arg.prosrc = NULL; + callback_arg.prosrc = src; sqlerrcontext.callback = sql_inline_error_callback; sqlerrcontext.arg = (void *) &callback_arg; sqlerrcontext.previous = error_context_stack; error_context_stack = &sqlerrcontext; - /* Fetch the function body */ + /* If we have prosqlbody, pay attention to that not prosrc */ tmp = SysCacheGetAttr(PROCOID, func_tuple, - Anum_pg_proc_prosrc, + Anum_pg_proc_prosqlbody, &isNull); - if (isNull) + if (!isNull) { Node *n; - tmp = SysCacheGetAttr(PROCOID, func_tuple, Anum_pg_proc_prosqlbody, &isNull); - if (isNull) - elog(ERROR, "null prosrc and prosqlbody for function %u", func_oid); - n = stringToNode(TextDatumGetCString(tmp)); if (IsA(n, List)) querytree_list = linitial_node(List, castNode(List, n)); @@ -4927,12 +4931,6 @@ inline_set_returning_function(PlannerInfo *root, RangeTblEntry *rte) } else { - char *src; - - src = TextDatumGetCString(tmp); - - callback_arg.prosrc = src; - /* * Set up to handle parameters while parsing the function body. We can * use the FuncExpr just created as the input for diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c index d0ea489614..e1c65d8dfc 100644 --- a/src/bin/pg_dump/pg_dump.c +++ b/src/bin/pg_dump/pg_dump.c @@ -12247,7 +12247,7 @@ dumpFunc(Archive *fout, const FuncInfo *finfo) if (fout->remoteVersion >= 140000) appendPQExpBufferStr(query, - "CASE WHEN prosrc IS NULL AND lanname = 'sql' THEN pg_get_function_sqlbody(p.oid) END AS prosqlbody\n"); + "pg_get_function_sqlbody(p.oid) AS prosqlbody\n"); else appendPQExpBufferStr(query, "NULL AS prosqlbody\n"); diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c index fdc2a89085..400b683859 100644 --- a/src/bin/psql/describe.c +++ b/src/bin/psql/describe.c @@ -512,7 +512,7 @@ describeFunctions(const char *functypes, const char *func_pattern, gettext_noop("Language")); if (pset.sversion >= 140000) appendPQExpBuffer(&buf, - ",\n COALESCE(p.prosrc, pg_catalog.pg_get_function_sqlbody(p.oid)) as \"%s\"", + ",\n COALESCE(pg_catalog.pg_get_function_sqlbody(p.oid), p.prosrc) as \"%s\"", gettext_noop("Source code")); else appendPQExpBuffer(&buf, diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h index 8d58067d03..448d9898cb 100644 --- a/src/include/catalog/pg_proc.h +++ b/src/include/catalog/pg_proc.h @@ -112,7 +112,7 @@ CATALOG(pg_proc,1255,ProcedureRelationId) BKI_BOOTSTRAP BKI_ROWTYPE_OID(81,Proce Oid protrftypes[1] BKI_DEFAULT(_null_) BKI_LOOKUP(pg_type); /* procedure source text */ - text prosrc; + text prosrc BKI_FORCE_NOT_NULL; /* secondary procedure info (can be NULL) */ text probin BKI_DEFAULT(_null_); diff --git a/src/include/executor/functions.h b/src/include/executor/functions.h index dcb8e18437..b56ce26da0 100644 --- a/src/include/executor/functions.h +++ b/src/include/executor/functions.h @@ -17,9 +17,6 @@ #include "nodes/execnodes.h" #include "tcop/dest.h" -/* This struct is known only within executor/functions.c */ -typedef struct SQLFunctionParseInfo *SQLFunctionParseInfoPtr; - /* * Data structure needed by the parser callback hooks to resolve parameter * references during parsing of a SQL function's body. This is separate from @@ -35,6 +32,8 @@ typedef struct SQLFunctionParseInfo Oid collation; /* function's input collation, if known */ } SQLFunctionParseInfo; +typedef SQLFunctionParseInfo *SQLFunctionParseInfoPtr; + extern Datum fmgr_sql(PG_FUNCTION_ARGS); extern SQLFunctionParseInfoPtr prepare_sql_fn_parse_info(HeapTuple procedureTuple, diff --git a/src/test/regress/expected/create_function_3.out b/src/test/regress/expected/create_function_3.out index 477130e620..9f8733f75c 100644 --- a/src/test/regress/expected/create_function_3.out +++ b/src/test/regress/expected/create_function_3.out @@ -290,6 +290,12 @@ CREATE FUNCTION functest_S_xx(x anyarray) RETURNS anyelement LANGUAGE SQL RETURN x[1]; ERROR: SQL function with unquoted function body cannot have polymorphic arguments +-- check reporting of parse-analysis errors +CREATE FUNCTION functest_S_xx(x date) RETURNS boolean + LANGUAGE SQL + RETURN x > 1; +ERROR: operator does not exist: date > integer +HINT: No operator matches the given name and argument types. You might need to add explicit type casts. -- tricky parsing CREATE FUNCTION functest_S_15(x int) RETURNS boolean LANGUAGE SQL diff --git a/src/test/regress/sql/create_function_3.sql b/src/test/regress/sql/create_function_3.sql index 3575ecc693..9a03301621 100644 --- a/src/test/regress/sql/create_function_3.sql +++ b/src/test/regress/sql/create_function_3.sql @@ -191,6 +191,11 @@ CREATE FUNCTION functest_S_xx(x anyarray) RETURNS anyelement LANGUAGE SQL RETURN x[1]; +-- check reporting of parse-analysis errors +CREATE FUNCTION functest_S_xx(x date) RETURNS boolean + LANGUAGE SQL + RETURN x > 1; + -- tricky parsing CREATE FUNCTION functest_S_15(x int) RETURNS boolean LANGUAGE SQL diff --git a/src/backend/commands/functioncmds.c b/src/backend/commands/functioncmds.c index dc317c83af..e7cb5c65e9 100644 --- a/src/backend/commands/functioncmds.c +++ b/src/backend/commands/functioncmds.c @@ -852,7 +852,9 @@ static void interpret_AS_clause(Oid languageOid, const char *languageName, char *funcname, List *as, Node *sql_body_in, List *parameterTypes, List *inParameterNames, - char **prosrc_str_p, char **probin_str_p, Node **sql_body_out) + char **prosrc_str_p, char **probin_str_p, + Node **sql_body_out, + const char *queryString) { if (!sql_body_in && !as) ereport(ERROR, @@ -929,6 +931,7 @@ interpret_AS_clause(Oid languageOid, const char *languageName, Query *q; ParseState *pstate = make_parsestate(NULL); + pstate->p_sourcetext = queryString; sql_fn_parser_setup(pstate, pinfo); q = transformStmt(pstate, stmt); if (q->commandType == CMD_UTILITY) @@ -947,6 +950,7 @@ interpret_AS_clause(Oid languageOid, const char *languageName, Query *q; ParseState *pstate = make_parsestate(NULL); + pstate->p_sourcetext = queryString; sql_fn_parser_setup(pstate, pinfo); q = transformStmt(pstate, sql_body_in); if (q->commandType == CMD_UTILITY) @@ -954,6 +958,7 @@ interpret_AS_clause(Oid languageOid, const char *languageName, errcode(ERRCODE_FEATURE_NOT_SUPPORTED), errmsg("%s is not yet supported in unquoted SQL function body", GetCommandTagName(CreateCommandTag(q->utilityStmt)))); + free_parsestate(pstate); *sql_body_out = (Node *) q; } @@ -1220,7 +1225,8 @@ CreateFunction(ParseState *pstate, CreateFunctionStmt *stmt) interpret_AS_clause(languageOid, language, funcname, as_clause, stmt->sql_body, parameterTypes_list, inParameterNames_list, - &prosrc_str, &probin_str, &prosqlbody); + &prosrc_str, &probin_str, &prosqlbody, + pstate->p_sourcetext); /* * Set default values for COST and ROWS depending on other parameters; diff --git a/src/test/regress/expected/create_function_3.out b/src/test/regress/expected/create_function_3.out index 9f8733f75c..e11a7a2dad 100644 --- a/src/test/regress/expected/create_function_3.out +++ b/src/test/regress/expected/create_function_3.out @@ -295,6 +295,8 @@ CREATE FUNCTION functest_S_xx(x date) RETURNS boolean LANGUAGE SQL RETURN x > 1; ERROR: operator does not exist: date > integer +LINE 3: RETURN x > 1; + ^ HINT: No operator matches the given name and argument types. You might need to add explicit type casts. -- tricky parsing CREATE FUNCTION functest_S_15(x int) RETURNS boolean diff --git a/src/backend/commands/functioncmds.c b/src/backend/commands/functioncmds.c index e7cb5c65e9..73f9fe1f70 100644 --- a/src/backend/commands/functioncmds.c +++ b/src/backend/commands/functioncmds.c @@ -964,13 +964,13 @@ interpret_AS_clause(Oid languageOid, const char *languageName, } /* - * We must put something in prosrc. For the moment, just record an - * empty string. It might be useful to store the original text of the - * CREATE FUNCTION statement --- but to make actual use of that in - * error reports, we'd also have to adjust readfuncs.c to not throw - * away node location fields when reading prosqlbody. + * Use the text of the CREATE FUNCTION statement as prosrc. Note that + * this might be more text than strictly necessary, if CREATE FUNCTION + * is part of a multi-statement string; but trimming it would require + * adjusting the location fields in prosqlbody, which seems like way + * more trouble than the case is worth. */ - *prosrc_str_p = pstrdup(""); + *prosrc_str_p = pstrdup(queryString); /* But we definitely don't need probin. */ *probin_str_p = NULL;
On 4/9/21 12:09 PM, Tom Lane wrote: > One could make an argument, therefore, for holding off 0003 until > there's more support for execution-time error cursors. I don't > think we should though, for two reasons: > 1. It'd be better to keep the pg_proc representation of new-style > SQL functions stable across versions. > 2. Storing the CREATE text means we'll capture comments associated > with the function text, which is something that at least some > people will complain about the loss of. Admittedly we have no way > to re-integrate the comments into the de-parsed body, but some > folks might be satisfied with grabbing the prosrc text. > +many for storing query text. cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
On Fri, Apr 09, 2021 at 12:09:43PM -0400, Tom Lane wrote: > Finally, 0003 might be a bit controversial: it changes the stored > prosrc for new-style SQL functions to be the query text of the CREATE > FUNCTION command. The main argument I can see being made against this > is that it'll bloat the pg_proc entry. But I think that that's > not a terribly reasonable concern Such storage cost should be acceptable, but ... > The real value of 0003 of course would be to get an error cursor at > runtime A key benefit of $SUBJECT is the function body following DDL renames: create table foo (); insert into foo default values; create function count_it() returns int begin atomic return (select count(*) from foo); end; select count_it(); insert into foo default values; alter table foo rename to some_new_long_table_name; select count_it(); -- still works After the rename, any stored prosrc is obsolete. To show accurate error cursors, deparse prosqlbody and use that in place of prosrc.
Noah Misch <noah@leadboat.com> writes: > On Fri, Apr 09, 2021 at 12:09:43PM -0400, Tom Lane wrote: >> The real value of 0003 of course would be to get an error cursor at >> runtime > A key benefit of $SUBJECT is the function body following DDL renames: Agreed. But ... > After the rename, any stored prosrc is obsolete. To show accurate error > cursors, deparse prosqlbody and use that in place of prosrc. ... I'm not sure this conclusion follows. There are two problems with it: 1. I don't see an acceptably low-overhead way to mechanize it. Deparsing prosqlbody is unlikely to be safe in a post-error transaction, but surely we'd not want to expend that cost in advance on every use of a SQL function. Even ignoring that, the act of deparsing would not in itself tell you what offset to use. Should we deparse and then re-parse to get a new node tree with corrected token locations? 2. The reason we can get away with showing a fragment of a large query (or function body) in an error message is that the user is supposed to be able to correlate the display with what she wrote. That assumption falls to the ground if the display is based on a deconstruction that is virtually certain to have line breaks in different places, not to mention that the details of what is shown may be substantially different from the original. Still, I take your point that the original text may be less useful for this purpose than I was supposing. regards, tom lane
On Sat, Apr 10, 2021 at 10:52:15AM -0400, Tom Lane wrote: > Noah Misch <noah@leadboat.com> writes: > > On Fri, Apr 09, 2021 at 12:09:43PM -0400, Tom Lane wrote: > >> The real value of 0003 of course would be to get an error cursor at > >> runtime > > > A key benefit of $SUBJECT is the function body following DDL renames: > > Agreed. But ... > > > After the rename, any stored prosrc is obsolete. To show accurate error > > cursors, deparse prosqlbody and use that in place of prosrc. > > ... I'm not sure this conclusion follows. There are two problems with it: > > 1. I don't see an acceptably low-overhead way to mechanize it. > Deparsing prosqlbody is unlikely to be safe in a post-error transaction, > but surely we'd not want to expend that cost in advance on every use > of a SQL function. Even ignoring that, the act of deparsing would not > in itself tell you what offset to use. Should we deparse and then > re-parse to get a new node tree with corrected token locations? If you really want those error cursors, yes. (I feel we can continue to live without them; their absence is no more important than it was ten years ago.) One can envision several ways to cache that high-overhead work. Otherwise, the usual PostgreSQL answer would be to omit an error cursor, not show one that reflects an obsolete sense of the function. If the original CREATE FUNCTION query text were so valuable, I'd be arguing to preserve it across dump/reload. > 2. The reason we can get away with showing a fragment of a large query > (or function body) in an error message is that the user is supposed to > be able to correlate the display with what she wrote. That assumption > falls to the ground if the display is based on a deconstruction that is > virtually certain to have line breaks in different places, not to mention > that the details of what is shown may be substantially different from the > original. Preferences on this matter will be situation-dependent. If I do CREATE FUNCTION f() ...; SELECT f() all in one sitting, then it's fine for an error in the SELECT to show the function I wrote. If I'm calling a function defined years ago, I'm likely to compare the error report to "\sf foo" and not likely to compare it to a years-old record of the SQL statement. I think it's fine to expect users to consult "\sf foo" when the user is in doubt. > Still, I take your point that the original text may be less useful > for this purpose than I was supposing.
Based on the discussion so far, I've committed 0001 and 0002 but not 0003, and marked this open issue as closed. regards, tom lane
On Tue, Jun 30, 2020 at 02:51:38PM -0400, Tom Lane wrote: > The point remains that exposing the function body's dependencies will > constrain restore order far more than we are accustomed to see. It > might be possible to build examples that flat out can't be restored, > even granting that we teach pg_dump how to break dependency loops > by first creating the function with empty body and later redefining > it with the real body. (Admittedly, if that's possible then you > likely could make it happen with views too. But somehow it seems > more likely that people would create spaghetti dependencies for > functions than views.) Should we be okay releasing v14 without support for breaking function dependency loops, or does that call for an open item? -- example create function f() returns int language sql return 1; create function g() returns int language sql return f(); create or replace function f() returns int language sql return coalesce(2, g()); -- but when a view can break the cycle, pg_dump still does so create view v as select 1 as c; create function f() returns int language sql return coalesce(0, (select count(*) from v)); create or replace view v as select f() as c;
Noah Misch <noah@leadboat.com> writes: > Should we be okay releasing v14 without support for breaking function > dependency loops, or does that call for an open item? Oh! That should definitely be an open item. It doesn't seem that hard to do something similar to what we do for views, i.e. create a dummy function and replace it later. regards, tom lane
On Sun, Apr 18, 2021 at 03:08:44PM -0400, Tom Lane wrote: > Noah Misch <noah@leadboat.com> writes: > > Should we be okay releasing v14 without support for breaking function > > dependency loops, or does that call for an open item? > > Oh! That should definitely be an open item. It doesn't seem > that hard to do something similar to what we do for views, > i.e. create a dummy function and replace it later. I added https://wiki.postgresql.org/index.php?title=PostgreSQL_14_Open_Items&type=revision&diff=35926&oldid=35925
... BTW, a dependency loop is also possible without using this feature, by abusing default-value expressions: create function f1(x int, y int) returns int language sql as 'select $1 + $2'; create function f2(x int, y int default f1(1,2)) returns int language sql as 'select $1 + $2'; create or replace function f1(x int, y int default f2(11,12)) returns int language sql as 'select $1 + $2'; The actual use-case for that seems pretty thin, so we never bothered to worry about it before. But if we're going to build loop-breaking logic to handle function body dependencies, it should deal with this too. I think that all that's required is for the initial dummy function declaration to omit defaults as well as providing a dummy body. regards, tom lane
On Wed, Apr 7, 2021 at 3:55 PM Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote:
Committed. Thanks!
This commit break line continuation prompts for unbalanced parentheses in the psql binary. Skimming through this thread, I don't see that this is intentional or has been noticed before.
with psql -X
Before:
jjanes=# asdf (
jjanes(#
jjanes(#
Now:
jjanes=# asdf (
jjanes-#
jjanes-#
I've looked through the parts of the commit that change psql, but didn't see an obvious culprit.
Cheers,
Jeff
On 18.04.21 23:33, Tom Lane wrote: > ... BTW, a dependency loop is also possible without using this feature, > by abusing default-value expressions: > > create function f1(x int, y int) returns int language sql > as 'select $1 + $2'; > create function f2(x int, y int default f1(1,2)) returns int language sql > as 'select $1 + $2'; > create or replace function f1(x int, y int default f2(11,12)) returns int language sql > as 'select $1 + $2'; > > The actual use-case for that seems pretty thin, so we never bothered > to worry about it before. But if we're going to build loop-breaking > logic to handle function body dependencies, it should deal with this > too. I think that all that's required is for the initial dummy > function declaration to omit defaults as well as providing a dummy > body. I have studied this a bit. I'm not sure where the dummy function declaration should be created. The current dependency-breaking logic in pg_dump_sort.c doesn't appear to support injecting additional objects into the set of dumpable objects. So we would need to create it perhaps in dumpFunc() and then later set flags that indicate whether it will be required. Another option would be that we disallow this at creation time. It seems we could detect dependency loops using findDependentObjects(), so this might not be so difficult. The use case for recursive SQL functions is probably low, at least with the current limited set of control flow options in SQL. (And you can always use a quoted body to work around it.)
Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes: > On 18.04.21 23:33, Tom Lane wrote: >> The actual use-case for that seems pretty thin, so we never bothered >> to worry about it before. But if we're going to build loop-breaking >> logic to handle function body dependencies, it should deal with this >> too. I think that all that's required is for the initial dummy >> function declaration to omit defaults as well as providing a dummy >> body. > I have studied this a bit. I'm not sure where the dummy function > declaration should be created. The current dependency-breaking logic in > pg_dump_sort.c doesn't appear to support injecting additional objects > into the set of dumpable objects. So we would need to create it perhaps > in dumpFunc() and then later set flags that indicate whether it will be > required. Hmm, good point. The existing code that breaks loops involving views depends on the fact that the view relation and the view's ON SELECT rule are already treated as distinct objects within pg_dump. So we just need to mark the rule object to indicate whether to emit it or not. To make it work for functions, there would have to be a secondary object representing the function body (and the default expressions, I guess). That's kind of a lot of complication, and inefficiency, for a corner case that may never arise in practice. We've ignored the risk for default expressions, and AFAIR have yet to receive any field complaints about it. So maybe it's okay to do the same for SQL-style function bodies, at least for now. > Another option would be that we disallow this at creation time. Don't like that one much. The backend shouldn't be in the business of rejecting valid commands just because pg_dump might be unable to cope later. regards, tom lane
On 27.04.21 04:44, Justin Pryzby wrote: > On Thu, Apr 22, 2021 at 04:04:18PM -0400, Jeff Janes wrote: >> This commit break line continuation prompts for unbalanced parentheses in >> the psql binary. Skimming through this thread, I don't see that this is >> intentional or has been noticed before. >> >> with psql -X >> >> Before: >> >> jjanes=# asdf ( >> jjanes(# >> >> Now: >> >> jjanes=# asdf ( >> jjanes-# >> >> I've looked through the parts of the commit that change psql, but didn't >> see an obvious culprit. > > I haven't studied it in detail, but it probably needs something like this. Yeah, fixed like that.
On 27.04.21 18:16, Tom Lane wrote: > That's kind of a lot of complication, and inefficiency, for a corner case > that may never arise in practice. We've ignored the risk for default > expressions, and AFAIR have yet to receive any field complaints about it. > So maybe it's okay to do the same for SQL-style function bodies, at least > for now. > >> Another option would be that we disallow this at creation time. > > Don't like that one much. The backend shouldn't be in the business > of rejecting valid commands just because pg_dump might be unable > to cope later. Since this is listed as an open item, I want to clarify that I'm currently not planning to work on this, based on this discussion. Certainly something to look into sometime later, but it's not in my plans right now.
Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes: > On 27.04.21 18:16, Tom Lane wrote: >> That's kind of a lot of complication, and inefficiency, for a corner case >> that may never arise in practice. We've ignored the risk for default >> expressions, and AFAIR have yet to receive any field complaints about it. >> So maybe it's okay to do the same for SQL-style function bodies, at least >> for now. >>> Another option would be that we disallow this at creation time. >> Don't like that one much. The backend shouldn't be in the business >> of rejecting valid commands just because pg_dump might be unable >> to cope later. > Since this is listed as an open item, I want to clarify that I'm > currently not planning to work on this, based on this discussion. > Certainly something to look into sometime later, but it's not in my > plans right now. Right, I concur with moving it to the "won't fix" category. regards, tom lane
On Mon, May 10, 2021 at 11:09:43AM -0400, Tom Lane wrote: > Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes: > > On 27.04.21 18:16, Tom Lane wrote: > >> That's kind of a lot of complication, and inefficiency, for a corner case > >> that may never arise in practice. We've ignored the risk for default > >> expressions, and AFAIR have yet to receive any field complaints about it. > >> So maybe it's okay to do the same for SQL-style function bodies, at least > >> for now. > > >>> Another option would be that we disallow this at creation time. > > >> Don't like that one much. The backend shouldn't be in the business > >> of rejecting valid commands just because pg_dump might be unable > >> to cope later. > > > Since this is listed as an open item, I want to clarify that I'm > > currently not planning to work on this, based on this discussion. > > Certainly something to look into sometime later, but it's not in my > > plans right now. > > Right, I concur with moving it to the "won't fix" category. Works for me.
On Wed, Apr 07, 2021 at 09:55:40PM +0200, Peter Eisentraut wrote: > Committed. Thanks! I get a NULL pointer dereference if the function body has a doubled semicolon: create function f() returns int language sql begin atomic select 1;; end; Program received signal SIGSEGV, Segmentation fault. transformStmt (pstate=pstate@entry=0x2623978, parseTree=parseTree@entry=0x0) at analyze.c:297 297 switch (nodeTag(parseTree)) #0 transformStmt (pstate=pstate@entry=0x2623978, parseTree=parseTree@entry=0x0) at analyze.c:297 #1 0x00000000006132a4 in interpret_AS_clause (queryString=<optimized out>, sql_body_out=<synthetic pointer>, probin_str_p=<syntheticpointer>, prosrc_str_p=<synthetic pointer>, inParameterNames=<optimized out>, parameterTypes=<optimizedout>, sql_body_in=<optimized out>, as=<optimized out>, funcname=<optimized out>, languageName=<optimized out>, languageOid=14)at functioncmds.c:937 #2 CreateFunction (pstate=pstate@entry=0x26213e0, stmt=stmt@entry=0x25fd048) at functioncmds.c:1227 #3 0x0000000000813e23 in ProcessUtilitySlow (pstate=pstate@entry=0x26213e0, pstmt=pstmt@entry=0x25fd3b8, queryString=queryString@entry=0x25fc040"create function f() returns int language sql begin atomic select 1;; end;", context=context@entry=PROCESS_UTILITY_TOPLEVEL, params=params@entry=0x0, queryEnv=queryEnv@entry=0x0, qc=qc@entry=0x7fff4b715b70,dest=0x25fd4a8) at utility.c:1607 #4 0x0000000000812944 in standard_ProcessUtility (pstmt=0x25fd3b8, queryString=0x25fc040 "create function f() returns intlanguage sql begin atomic select 1;; end;", context=PROCESS_UTILITY_TOPLEVEL, params=0x0, queryEnv=0x0, dest=0x25fd4a8, qc=0x7fff4b715b70) at utility.c:1034 #5 0x0000000000810efe in PortalRunUtility (portal=portal@entry=0x265fb60, pstmt=0x25fd3b8, isTopLevel=isTopLevel@entry=true,setHoldSnapshot=setHoldSnapshot@entry=false, dest=0x25fd4a8, qc=0x7fff4b715b70) at pquery.c:1147 #6 0x0000000000811053 in PortalRunMulti (portal=portal@entry=0x265fb60, isTopLevel=isTopLevel@entry=true, setHoldSnapshot=setHoldSnapshot@entry=false,dest=dest@entry=0x25fd4a8, altdest=altdest@entry=0x25fd4a8, qc=qc@entry=0x7fff4b715b70)at pquery.c:1310 #7 0x00000000008115e4 in PortalRun (portal=portal@entry=0x265fb60, count=count@entry=9223372036854775807, isTopLevel=isTopLevel@entry=true,run_once=run_once@entry=true, dest=dest@entry=0x25fd4a8, altdest=altdest@entry=0x25fd4a8,qc=qc@entry=0x7fff4b715b70) at pquery.c:786 #8 0x000000000080d004 in exec_simple_query (query_string=0x25fc040 "create function f() returns int language sql begin atomicselect 1;; end;") at postgres.c:1214 #9 0x000000000080ee1f in PostgresMain (argc=argc@entry=1, argv=argv@entry=0x7fff4b716030, dbname=0x2627788 "test", username=<optimizedout>) at postgres.c:4486 #10 0x000000000048bc97 in BackendRun (port=<optimized out>, port=<optimized out>) at postmaster.c:4507 #11 BackendStartup (port=0x261f480) at postmaster.c:4229 #12 ServerLoop () at postmaster.c:1745 #13 0x000000000077c278 in PostmasterMain (argc=argc@entry=1, argv=argv@entry=0x25f6a00) at postmaster.c:1417 #14 0x000000000048d51e in main (argc=1, argv=0x25f6a00) at main.c:209 (gdb) p parseTree $1 = (Node *) 0x0
On Sat, Jun 05, 2021 at 09:44:18PM -0700, Noah Misch wrote: > On Wed, Apr 07, 2021 at 09:55:40PM +0200, Peter Eisentraut wrote: > > Committed. Thanks! > > I get a NULL pointer dereference if the function body has a doubled semicolon: > > create function f() returns int language sql begin atomic select 1;; end; You don't even need a statements to reproduce the problem, a body containing only semi-colon(s) will behave the same. Attached patch should fix the problem.
Attachment
On 06.06.21 09:32, Julien Rouhaud wrote: > On Sat, Jun 05, 2021 at 09:44:18PM -0700, Noah Misch wrote: >> I get a NULL pointer dereference if the function body has a doubled semicolon: >> >> create function f() returns int language sql begin atomic select 1;; end; > > You don't even need a statements to reproduce the problem, a body containing > only semi-colon(s) will behave the same. > > Attached patch should fix the problem. Your patch filters out empty statements at the parse transformation phase, so they are no longer present when you dump the body back out. So your edits in the test expected files don't fit. I suggest we just prohibit empty statements at the parse stage. I don't see a strong reason to allow them, and if we wanted to, we'd have to do more work, e.g., in ruleutils.c to print them back out correctly.
Attachment
On Mon, Jun 7, 2021 at 4:52 PM Peter Eisentraut <peter.eisentraut@enterprisedb.com> wrote: > > Your patch filters out empty statements at the parse transformation > phase, so they are no longer present when you dump the body back out. > So your edits in the test expected files don't fit. Oh, somehow the tests aren't failing here, I'm not sure what I did wrong. > I suggest we just prohibit empty statements at the parse stage. I don't > see a strong reason to allow them, and if we wanted to, we'd have to do > more work, e.g., in ruleutils.c to print them back out correctly. I always thought extraneous semicolons were tokens to be ignored, which happens to be internally implemented as empty statements, so deparsing them is not required, similar to deparsing extraneous whitespaces. If the spec says otherwise then I agree it's not worth implementing, but otherwise I'm not sure if it's really helpful to error out.
Julien Rouhaud <rjuju123@gmail.com> writes: > On Mon, Jun 7, 2021 at 4:52 PM Peter Eisentraut > <peter.eisentraut@enterprisedb.com> wrote: >> Your patch filters out empty statements at the parse transformation >> phase, so they are no longer present when you dump the body back out. >> So your edits in the test expected files don't fit. > Oh, somehow the tests aren't failing here, I'm not sure what I did wrong. Modulo getting the tests right ... >> I suggest we just prohibit empty statements at the parse stage. I don't >> see a strong reason to allow them, and if we wanted to, we'd have to do >> more work, e.g., in ruleutils.c to print them back out correctly. > I always thought extraneous semicolons were tokens to be ignored, ... I tend to agree with Julien's position here. It seems really ugly to prohibit empty statements just for implementation convenience. However, the way I'd handle it is to have the grammar remove them, which is what it does in other contexts. I don't think there's any need to preserve them in ruleutils output --- there's a lot of other normalization we do on the way to that, and this seems to fit in. BTW, is it just me, or does SQL:2021 fail to permit multiple statements in a procedure at all? After much searching, I found the BEGIN ATOMIC ... END syntax, but it's in <triggered SQL statement>, in other words the body of a trigger not a procedure. I cannot find any production that connects a <routine body> to that. There's an example showing use of BEGIN ATOMIC as a procedure statement, so they clearly *meant* to allow it, but it looks like somebody messed up the grammar. regards, tom lane
On Mon, Jun 7, 2021 at 11:27 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Julien Rouhaud <rjuju123@gmail.com> writes: > > On Mon, Jun 7, 2021 at 4:52 PM Peter Eisentraut > > <peter.eisentraut@enterprisedb.com> wrote: > >> Your patch filters out empty statements at the parse transformation > >> phase, so they are no longer present when you dump the body back out. > >> So your edits in the test expected files don't fit. > > > Oh, somehow the tests aren't failing here, I'm not sure what I did wrong. > > Modulo getting the tests right ... I can certainly accept that my patch broke the tests, but I just ran another make check-world and it passed without any problem. What am I missing?
I wrote: > ... I tend to agree with Julien's position here. It seems really ugly > to prohibit empty statements just for implementation convenience. > However, the way I'd handle it is to have the grammar remove them, > which is what it does in other contexts. Concretely, I think the right fix is per attached. Like Julien, I don't see any additional change in regression test outputs. Maybe Peter thinks there should be some? But I think the reverse-listing we get for functest_S_3a is fine. regards, tom lane diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y index 9ee90e3f13..52a254928f 100644 --- a/src/backend/parser/gram.y +++ b/src/backend/parser/gram.y @@ -7990,7 +7990,11 @@ opt_routine_body: routine_body_stmt_list: routine_body_stmt_list routine_body_stmt ';' { - $$ = lappend($1, $2); + /* As in stmtmulti, discard empty statements */ + if ($2 != NULL) + $$ = lappend($1, $2); + else + $$ = $1; } | /*EMPTY*/ { diff --git a/src/test/regress/expected/create_function_3.out b/src/test/regress/expected/create_function_3.out index 5b6bc5eddb..5955859bb5 100644 --- a/src/test/regress/expected/create_function_3.out +++ b/src/test/regress/expected/create_function_3.out @@ -267,7 +267,7 @@ CREATE FUNCTION functest_S_3() RETURNS boolean RETURN false; CREATE FUNCTION functest_S_3a() RETURNS boolean BEGIN ATOMIC - RETURN false; + ;;RETURN false;; END; CREATE FUNCTION functest_S_10(a text, b date) RETURNS boolean LANGUAGE SQL diff --git a/src/test/regress/sql/create_function_3.sql b/src/test/regress/sql/create_function_3.sql index 4b778999ed..6e8b838ff2 100644 --- a/src/test/regress/sql/create_function_3.sql +++ b/src/test/regress/sql/create_function_3.sql @@ -165,7 +165,7 @@ CREATE FUNCTION functest_S_3() RETURNS boolean RETURN false; CREATE FUNCTION functest_S_3a() RETURNS boolean BEGIN ATOMIC - RETURN false; + ;;RETURN false;; END; CREATE FUNCTION functest_S_10(a text, b date) RETURNS boolean
On 07.06.21 17:27, Tom Lane wrote: > ... I tend to agree with Julien's position here. It seems really ugly > to prohibit empty statements just for implementation convenience. > However, the way I'd handle it is to have the grammar remove them, > which is what it does in other contexts. I don't think there's any > need to preserve them in ruleutils output --- there's a lot of other > normalization we do on the way to that, and this seems to fit in. Ok, if that's what people prefer. > BTW, is it just me, or does SQL:2021 fail to permit multiple > statements in a procedure at all? After much searching, I found the > BEGIN ATOMIC ... END syntax, but it's in <triggered SQL statement>, > in other words the body of a trigger not a procedure. I cannot find > any production that connects a <routine body> to that. There's an > example showing use of BEGIN ATOMIC as a procedure statement, so > they clearly*meant* to allow it, but it looks like somebody messed > up the grammar. It's in the SQL/PSM part.
On Mon, Jun 07, 2021 at 03:24:33PM -0400, Tom Lane wrote: > > Concretely, I think the right fix is per attached. +1, I agree that this approach is better.
Julien Rouhaud <rjuju123@gmail.com> writes: > On Mon, Jun 07, 2021 at 03:24:33PM -0400, Tom Lane wrote: >> Concretely, I think the right fix is per attached. > +1, I agree that this approach is better. Pushed. regards, tom lane