Thread: SQL-standard function body

SQL-standard function body

From
Peter Eisentraut
Date:
This adds support for writing CREATE FUNCTION and CREATE PROCEDURE 
statements for language SQL with a function body that conforms to the 
SQL standard and is portable to other implementations.

Instead of the PostgreSQL-specific AS $$ string literal $$ syntax,
this allows writing out the SQL statements making up the body
unquoted, either as a single statement:

      CREATE FUNCTION add(a integer, b integer) RETURNS integer
          LANGUAGE SQL
          RETURN a + b;

or as a block

      CREATE PROCEDURE insert_data(a integer, b integer)
      LANGUAGE SQL
      BEGIN ATOMIC
        INSERT INTO tbl VALUES (a);
        INSERT INTO tbl VALUES (b);
      END;

The function body is parsed at function definition time and stored as
expression nodes in probin.  So at run time, no further parsing is
required.

However, this form does not support polymorphic arguments, because
there is no more parse analysis done at call time.

Dependencies between the function and the objects it uses are fully
tracked.

A new RETURN statement is introduced.  This can only be used inside
function bodies.  Internally, it is treated much like a SELECT
statement.

psql needs some new intelligence to keep track of function body
boundaries so that it doesn't send off statements when it sees
semicolons that are inside a function body.

Also, per SQL standard, LANGUAGE SQL is the default, so it does not
need to be specified anymore.

Note: Some parts of the patch look better under git diff -w (ignoring 
whitespace changes) because if/else blocks were introduced around 
existing code.

TODOs and discussion points:

- pg_dump is not yet supported.  As a consequence, the pg_upgrade
tests don't pass yet.  I'm thinking about changing pg_dump to use 
pg_get_functiondef here instead of coding everything by hand.  Some 
initial experimenting showed that this would be possible with minimal 
tweaking and it would surely be beneficial in the long run.

- The compiled function body is stored in the probin field of pg_proc. 
This matches the historical split similar to adsrc/adbin, consrc/conbin, 
but this has now been abandoned.  Also, this field should ideally be of 
type pg_node_tree, so reusing probin for that is probably not good. 
Seems like a new field might be best.

- More test coverage is needed.  Surprisingly, there wasn't actually any 
test AFAICT that just creates and SQL function and runs it.  Most of 
that code is tested incidentally, but there is very little or no 
targeted testing of this functionality.

- Some of the changes in pg_proc.c, functioncmds.c, and functions.c in 
particular were jammed in and could use some reorganization after the 
basic ideas are solidified.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

Re: SQL-standard function body

From
Robert Haas
Date:
On Tue, Jun 30, 2020 at 1:49 PM Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:
> This adds support for writing CREATE FUNCTION and CREATE PROCEDURE
> statements for language SQL with a function body that conforms to the
> SQL standard and is portable to other implementations.

With what other implementations is it compatible?

> The function body is parsed at function definition time and stored as
> expression nodes in probin.  So at run time, no further parsing is
> required.
>
> However, this form does not support polymorphic arguments, because
> there is no more parse analysis done at call time.
>
> Dependencies between the function and the objects it uses are fully
> tracked.
>
> A new RETURN statement is introduced.  This can only be used inside
> function bodies.  Internally, it is treated much like a SELECT
> statement.
>
> psql needs some new intelligence to keep track of function body
> boundaries so that it doesn't send off statements when it sees
> semicolons that are inside a function body.
>
> Also, per SQL standard, LANGUAGE SQL is the default, so it does not
> need to be specified anymore.

Hmm, this all seems like a pretty big semantic change. IIUC, right
now, a SQL function can only contain one statement, but it seems like
with this patch you can have a block in there with a bunch of
statements, sorta like plpgsql. But probably you don't have all of the
functionality of plpgsql available. Also, the fact that you're doing
parsing earlier means that e.g. creating a table and inserting into it
won't work. Maybe that's fine. But it almost seems like you are
inventing a whole new PL....

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: SQL-standard function body

From
Stephen Frost
Date:
Greetings,

* Robert Haas (robertmhaas@gmail.com) wrote:
> Hmm, this all seems like a pretty big semantic change. IIUC, right
> now, a SQL function can only contain one statement, but it seems like
> with this patch you can have a block in there with a bunch of
> statements, sorta like plpgsql.

From our docs:

CREATE FUNCTION tf1 (accountno integer, debit numeric) RETURNS numeric AS $$
    UPDATE bank
        SET balance = balance - debit
        WHERE accountno = tf1.accountno;
    SELECT 1;
$$ LANGUAGE SQL;

https://www.postgresql.org/docs/current/xfunc-sql.html

Haven't looked at the patch, tho if it adds support for something the
SQL standard defines, that generally seems like a positive to me.

Thanks,

Stephen

Attachment

Re: SQL-standard function body

From
Pavel Stehule
Date:


út 30. 6. 2020 v 19:58 odesílatel Robert Haas <robertmhaas@gmail.com> napsal:
On Tue, Jun 30, 2020 at 1:49 PM Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:
> This adds support for writing CREATE FUNCTION and CREATE PROCEDURE
> statements for language SQL with a function body that conforms to the
> SQL standard and is portable to other implementations.

With what other implementations is it compatible?

> The function body is parsed at function definition time and stored as
> expression nodes in probin.  So at run time, no further parsing is
> required.
>
> However, this form does not support polymorphic arguments, because
> there is no more parse analysis done at call time.
>
> Dependencies between the function and the objects it uses are fully
> tracked.
>
> A new RETURN statement is introduced.  This can only be used inside
> function bodies.  Internally, it is treated much like a SELECT
> statement.
>
> psql needs some new intelligence to keep track of function body
> boundaries so that it doesn't send off statements when it sees
> semicolons that are inside a function body.
>
> Also, per SQL standard, LANGUAGE SQL is the default, so it does not
> need to be specified anymore.

Hmm, this all seems like a pretty big semantic change. IIUC, right
now, a SQL function can only contain one statement, but it seems like
with this patch you can have a block in there with a bunch of
statements, sorta like plpgsql. But probably you don't have all of the
functionality of plpgsql available. Also, the fact that you're doing
parsing earlier means that e.g. creating a table and inserting into it
won't work. Maybe that's fine. But it almost seems like you are
inventing a whole new PL....

It is SQL/PSM and can be nice to have it.

I am a little bit afraid about performance - SQL functions doesn't use plan cache and simple expressions. Without inlining it can be too slow.

Regards

Pavel


--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: SQL-standard function body

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> On Tue, Jun 30, 2020 at 1:49 PM Peter Eisentraut
> <peter.eisentraut@2ndquadrant.com> wrote:
>> This adds support for writing CREATE FUNCTION and CREATE PROCEDURE
>> statements for language SQL with a function body that conforms to the
>> SQL standard and is portable to other implementations.

> With what other implementations is it compatible?

Yeah ... I'm sort of wondering exactly what this really accomplishes.
I think "portability" is a red herring unfortunately.

Tracking the dependencies of the function body sounds nice at first
glance, so it might be a feature.  But given our experiences with having
to use check_function_bodies = off to not have impossible dependency loops
in dump/restore, I rather wonder whether it'll be a net loss in practice.
IIUC, this implementation is flat out incapable of doing the equivalent of
check_function_bodies = off, and that sounds like trouble.

> Hmm, this all seems like a pretty big semantic change. IIUC, right
> now, a SQL function can only contain one statement,

Not true, you can have more.  However, it's nonetheless an enormous
semantic change, if only because the CREATE FUNCTION-time search_path
is now relevant instead of the execution-time path.  That *will*
break use-cases I've heard of, where the same function is applied
to different tables by adjusting the path.  It'd certainly be useful
from some perspectives (eg better security), but it's ... different.

Replicating the creation-time search path will be a big headache for
pg_dump, I bet.

> But it almost seems like you are
> inventing a whole new PL....

Yes.  Having this replace the existing SQL PL would be a disaster,
because there are use-cases this simply can't meet (even assuming
that we can fix the polymorphism problem, which seems a bit unlikely).
We'd need to treat it as a new PL type.

Perhaps this is useful enough to justify all the work involved,
but I'm not sure.

            regards, tom lane



Re: SQL-standard function body

From
Tom Lane
Date:
I wrote:
> Replicating the creation-time search path will be a big headache for
> pg_dump, I bet.

On further thought, we probably don't have to.  Re-parsing the function
body the same way is exactly the same problem as re-parsing a view or
matview body the same way.  I don't want to claim that that's a 100%
solved problem, but I've heard few complaints in that area lately.

The point remains that exposing the function body's dependencies will
constrain restore order far more than we are accustomed to see.  It
might be possible to build examples that flat out can't be restored,
even granting that we teach pg_dump how to break dependency loops
by first creating the function with empty body and later redefining
it with the real body.  (Admittedly, if that's possible then you
likely could make it happen with views too.  But somehow it seems
more likely that people would create spaghetti dependencies for
functions than views.)

            regards, tom lane



Re: SQL-standard function body

From
Andres Freund
Date:
Hi,

On 2020-06-30 19:49:04 +0200, Peter Eisentraut wrote:
> The function body is parsed at function definition time and stored as
> expression nodes in probin.  So at run time, no further parsing is
> required.

As raw parse tree or as a parse-analysed tree? I assume the latter?

Isn't a consequence of that that we'd get a lot more errors if any DDL
is done to tables involved in the query? In contrast to other languages
we'd not be able to handle column type changes etc, right?

Greetings,

Andres Freund



Re: SQL-standard function body

From
Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes:
> On 2020-06-30 19:49:04 +0200, Peter Eisentraut wrote:
>> The function body is parsed at function definition time and stored as
>> expression nodes in probin.  So at run time, no further parsing is
>> required.

> Isn't a consequence of that that we'd get a lot more errors if any DDL
> is done to tables involved in the query? In contrast to other languages
> we'd not be able to handle column type changes etc, right?

I suppose it'd act like column references in a view, ie the dependency
mechanisms would forbid you from changing/dropping any column mentioned
in one of these functions.

            regards, tom lane



Re: SQL-standard function body

From
Robert Haas
Date:
On Tue, Jun 30, 2020 at 2:51 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> On further thought, we probably don't have to.  Re-parsing the function
> body the same way is exactly the same problem as re-parsing a view or
> matview body the same way.  I don't want to claim that that's a 100%
> solved problem, but I've heard few complaints in that area lately.
>
> The point remains that exposing the function body's dependencies will
> constrain restore order far more than we are accustomed to see.  It
> might be possible to build examples that flat out can't be restored,
> even granting that we teach pg_dump how to break dependency loops
> by first creating the function with empty body and later redefining
> it with the real body.  (Admittedly, if that's possible then you
> likely could make it happen with views too.  But somehow it seems
> more likely that people would create spaghetti dependencies for
> functions than views.)

In my experience, there's certainly demand for some kind of mode where
plpgsql functions get checked at function definition time, rather than
at execution time. The model we have is advantageous not only because
it simplifies dump and reload, but also because it handles cases where
the table is created on the fly properly. However, it also means that
you can have silly mistakes in your function definitions that you
don't find out about until runtime, and in my experience, people don't
like that behavior much at all. So I don't think that it's a bad idea
on principle, or anything like that, but the details seem like they
need a lot of thought. The dump and restore issues need to be
considered, but also, what about things like IF and WHILE? People are
going to want those constructs with these new semantics, too.

I actually don't have a very clear idea of what the standard has to
say about SQL-language functions. Does it just say it's a list of
statements, or does it involve variables and control-flow constructs
and stuff like that, too? If we go that direction with this, then
we're actually going to end up with two different implementations of
what's now plpgsql, or something. But if we don't, then I'm not sure
how far this takes us. I'm not saying it's bad, but the comment "I
love the early binding but where's my IF statement" seems like an
inevitable one.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: SQL-standard function body

From
Pavel Stehule
Date:


st 1. 7. 2020 v 15:37 odesílatel Robert Haas <robertmhaas@gmail.com> napsal:
On Tue, Jun 30, 2020 at 2:51 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> On further thought, we probably don't have to.  Re-parsing the function
> body the same way is exactly the same problem as re-parsing a view or
> matview body the same way.  I don't want to claim that that's a 100%
> solved problem, but I've heard few complaints in that area lately.
>
> The point remains that exposing the function body's dependencies will
> constrain restore order far more than we are accustomed to see.  It
> might be possible to build examples that flat out can't be restored,
> even granting that we teach pg_dump how to break dependency loops
> by first creating the function with empty body and later redefining
> it with the real body.  (Admittedly, if that's possible then you
> likely could make it happen with views too.  But somehow it seems
> more likely that people would create spaghetti dependencies for
> functions than views.)

In my experience, there's certainly demand for some kind of mode where
plpgsql functions get checked at function definition time, rather than
at execution time. The model we have is advantageous not only because
it simplifies dump and reload, but also because it handles cases where
the table is created on the fly properly. However, it also means that
you can have silly mistakes in your function definitions that you
don't find out about until runtime, and in my experience, people don't
like that behavior much at all. So I don't think that it's a bad idea
on principle, or anything like that, but the details seem like they
need a lot of thought. The dump and restore issues need to be
considered, but also, what about things like IF and WHILE? People are
going to want those constructs with these new semantics, too.

plpgsql_check can be integrated to upstream.




I actually don't have a very clear idea of what the standard has to
say about SQL-language functions. Does it just say it's a list of
statements, or does it involve variables and control-flow constructs
and stuff like that, too? If we go that direction with this, then
we're actually going to end up with two different implementations of
what's now plpgsql, or something. But if we don't, then I'm not sure
how far this takes us. I'm not saying it's bad, but the comment "I
love the early binding but where's my IF statement" seems like an
inevitable one.

The standard SQL/PSM is a full functionality language with variables, conditional statements, exception handlings, ..


Unfortunately a basic implementation integrated into the main SQL parser can be pretty hard work. First issue can be SET statement implementation.

Regards

Pavel



--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: SQL-standard function body

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> In my experience, there's certainly demand for some kind of mode where
> plpgsql functions get checked at function definition time, rather than
> at execution time.

Yeah, absolutely agreed.  But I'm afraid this proposal takes us too
far in the other direction: with this, you *must* have a 100% parseable
and semantically valid function body, every time all the time.

So far as plpgsql is concerned, I could see extending the validator
to run parse analysis (not just raw parsing) on all SQL statements in
the body.  This wouldn't happen of course with check_function_bodies off,
so it wouldn't affect dump/reload.  But likely there would still be
demand for more fine-grained control over it ... or maybe it could
stop doing analysis as soon as it finds a DDL command?

            regards, tom lane



Re: SQL-standard function body

From
Pavel Stehule
Date:


st 1. 7. 2020 v 16:14 odesílatel Tom Lane <tgl@sss.pgh.pa.us> napsal:
Robert Haas <robertmhaas@gmail.com> writes:
> In my experience, there's certainly demand for some kind of mode where
> plpgsql functions get checked at function definition time, rather than
> at execution time.

Yeah, absolutely agreed.  But I'm afraid this proposal takes us too
far in the other direction: with this, you *must* have a 100% parseable
and semantically valid function body, every time all the time.

So far as plpgsql is concerned, I could see extending the validator
to run parse analysis (not just raw parsing) on all SQL statements in
the body.  This wouldn't happen of course with check_function_bodies off,
so it wouldn't affect dump/reload.  But likely there would still be
demand for more fine-grained control over it ... or maybe it could
stop doing analysis as soon as it finds a DDL command?

This simple analysis stops on first record type usage. PLpgSQL allows some dynamic work that increases the complexity of static analysis.

Regards

Pavel


                        regards, tom lane


Re: SQL-standard function body

From
Bruce Momjian
Date:
On Wed, Jul  1, 2020 at 10:14:10AM -0400, Tom Lane wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
> > In my experience, there's certainly demand for some kind of mode where
> > plpgsql functions get checked at function definition time, rather than
> > at execution time.
> 
> Yeah, absolutely agreed.  But I'm afraid this proposal takes us too
> far in the other direction: with this, you *must* have a 100% parseable
> and semantically valid function body, every time all the time.
> 
> So far as plpgsql is concerned, I could see extending the validator
> to run parse analysis (not just raw parsing) on all SQL statements in
> the body.  This wouldn't happen of course with check_function_bodies off,
> so it wouldn't affect dump/reload.  But likely there would still be
> demand for more fine-grained control over it ... or maybe it could
> stop doing analysis as soon as it finds a DDL command?

Is the SQL-standard function body verified as preventing function
inlining?  That seems to be a major downside.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EnterpriseDB                             https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee




Re: SQL-standard function body

From
Tom Lane
Date:
Bruce Momjian <bruce@momjian.us> writes:
> Is the SQL-standard function body verified as preventing function
> inlining?  That seems to be a major downside.

I see no reason why that would make any difference.  There might
be more code to be written than is in the patch, but in principle
inlining should not care whether the function is pre-parsed or not.

            regards, tom lane



Re: SQL-standard function body

From
Vik Fearing
Date:
On 7/1/20 3:36 PM, Robert Haas wrote:
> I actually don't have a very clear idea of what the standard has to
> say about SQL-language functions. Does it just say it's a list of
> statements, or does it involve variables and control-flow constructs
> and stuff like that, too?


It's either a single sql statement, or a collection of them between
"begin atomic" and "end".  There are no variables or flow control
constructs or anything like that, just as there are no such things
outside of a function.

(There are a few statements that are not allowed, such as COMMIT.)
-- 
Vik Fearing



Re: SQL-standard function body

From
Pavel Stehule
Date:


st 1. 7. 2020 v 20:19 odesílatel Vik Fearing <vik@postgresfriends.org> napsal:
On 7/1/20 3:36 PM, Robert Haas wrote:
> I actually don't have a very clear idea of what the standard has to
> say about SQL-language functions. Does it just say it's a list of
> statements, or does it involve variables and control-flow constructs
> and stuff like that, too?


It's either a single sql statement, or a collection of them between
"begin atomic" and "end".  There are no variables or flow control
constructs or anything like that, just as there are no such things
outside of a function.

What is the source of this comment? Maybe we are speaking (and thinking) about different languages.

I thought the language of SQL functions (ANSI/SQL) is SQL/PSM.

Regards

Pavel



(There are a few statements that are not allowed, such as COMMIT.)
--
Vik Fearing


Re: SQL-standard function body

From
Vik Fearing
Date:
On 7/1/20 9:32 PM, Pavel Stehule wrote:
> st 1. 7. 2020 v 20:19 odesílatel Vik Fearing <vik@postgresfriends.org>
> napsal:
> 
>> On 7/1/20 3:36 PM, Robert Haas wrote:
>>> I actually don't have a very clear idea of what the standard has to
>>> say about SQL-language functions. Does it just say it's a list of
>>> statements, or does it involve variables and control-flow constructs
>>> and stuff like that, too?
>>
>>
>> It's either a single sql statement, or a collection of them between
>> "begin atomic" and "end".  There are no variables or flow control
>> constructs or anything like that, just as there are no such things
>> outside of a function.
>>
> 
> What is the source of this comment?


The SQL Standard.


> Maybe we are speaking (and thinking)
> about different languages.


I think so, yes.


> I thought the language of SQL functions (ANSI/SQL) is SQL/PSM.


That is something else entirely, and not at all what Peter's patch is about.
-- 
Vik Fearing



Re: SQL-standard function body

From
Pavel Stehule
Date:


st 1. 7. 2020 v 22:31 odesílatel Vik Fearing <vik@postgresfriends.org> napsal:
On 7/1/20 9:32 PM, Pavel Stehule wrote:
> st 1. 7. 2020 v 20:19 odesílatel Vik Fearing <vik@postgresfriends.org>
> napsal:
>
>> On 7/1/20 3:36 PM, Robert Haas wrote:
>>> I actually don't have a very clear idea of what the standard has to
>>> say about SQL-language functions. Does it just say it's a list of
>>> statements, or does it involve variables and control-flow constructs
>>> and stuff like that, too?
>>
>>
>> It's either a single sql statement, or a collection of them between
>> "begin atomic" and "end".  There are no variables or flow control
>> constructs or anything like that, just as there are no such things
>> outside of a function.
>>
>
> What is the source of this comment?


The SQL Standard.

The SQL Standard is really big, and is very possible so I miss this part. Can you send me a link?

Regards

Pavel



> Maybe we are speaking (and thinking)
> about different languages.


I think so, yes.


> I thought the language of SQL functions (ANSI/SQL) is SQL/PSM.


That is something else entirely, and not at all what Peter's patch is about.
--
Vik Fearing

Re: SQL-standard function body

From
Vik Fearing
Date:
On 7/1/20 10:34 PM, Pavel Stehule wrote:
> st 1. 7. 2020 v 22:31 odesílatel Vik Fearing <vik@postgresfriends.org>
> napsal:
> 
>> On 7/1/20 9:32 PM, Pavel Stehule wrote:
>>> st 1. 7. 2020 v 20:19 odesílatel Vik Fearing <vik@postgresfriends.org>
>>> napsal:
>>>
>>>> On 7/1/20 3:36 PM, Robert Haas wrote:
>>>>> I actually don't have a very clear idea of what the standard has to
>>>>> say about SQL-language functions. Does it just say it's a list of
>>>>> statements, or does it involve variables and control-flow constructs
>>>>> and stuff like that, too?
>>>>
>>>>
>>>> It's either a single sql statement, or a collection of them between
>>>> "begin atomic" and "end".  There are no variables or flow control
>>>> constructs or anything like that, just as there are no such things
>>>> outside of a function.
>>>>
>>>
>>> What is the source of this comment?
>>
>>
>> The SQL Standard.
>>
> 
> The SQL Standard is really big, and is very possible so I miss this part.
> Can you send me a link?


ISO/IEC 9075-2:2016 Section 11.60 <SQL-invoked routine>
-- 
Vik Fearing



Re: SQL-standard function body

From
Pavel Stehule
Date:


st 1. 7. 2020 v 22:54 odesílatel Vik Fearing <vik@postgresfriends.org> napsal:
On 7/1/20 10:34 PM, Pavel Stehule wrote:
> st 1. 7. 2020 v 22:31 odesílatel Vik Fearing <vik@postgresfriends.org>
> napsal:
>
>> On 7/1/20 9:32 PM, Pavel Stehule wrote:
>>> st 1. 7. 2020 v 20:19 odesílatel Vik Fearing <vik@postgresfriends.org>
>>> napsal:
>>>
>>>> On 7/1/20 3:36 PM, Robert Haas wrote:
>>>>> I actually don't have a very clear idea of what the standard has to
>>>>> say about SQL-language functions. Does it just say it's a list of
>>>>> statements, or does it involve variables and control-flow constructs
>>>>> and stuff like that, too?
>>>>
>>>>
>>>> It's either a single sql statement, or a collection of them between
>>>> "begin atomic" and "end".  There are no variables or flow control
>>>> constructs or anything like that, just as there are no such things
>>>> outside of a function.
>>>>
>>>
>>> What is the source of this comment?
>>
>>
>> The SQL Standard.
>>
>
> The SQL Standard is really big, and is very possible so I miss this part.
> Can you send me a link?


ISO/IEC 9075-2:2016 Section 11.60 <SQL-invoked routine>

Thank you

Pavel

--
Vik Fearing

Re: SQL-standard function body

From
Thomas Munro
Date:
On Wed, Jul 1, 2020 at 5:58 AM Robert Haas <robertmhaas@gmail.com> wrote:
> On Tue, Jun 30, 2020 at 1:49 PM Peter Eisentraut
> <peter.eisentraut@2ndquadrant.com> wrote:
> > This adds support for writing CREATE FUNCTION and CREATE PROCEDURE
> > statements for language SQL with a function body that conforms to the
> > SQL standard and is portable to other implementations.
>
> With what other implementations is it compatible?

Judging by the Wikipedia article[1], it sounds like at least DB2 and
MySQL/MariaDB are purposely striving for conformance.  When I worked
with DB2 a few years back I preferred to use their standard-conforming
PL stuff (as opposed to their be-more-like-Oracle PL/SQL mode), and I
always hoped that PostgreSQL would eventually understand the same
syntax; admittedly, anyone who has ever worked on large applications
that support multiple RDBMSs knows that there's more than just surface
syntax to worry about, but it still seems like a pretty solid plan to
conform to the standard that's in our name, so +1 from me on the
general direction (I didn't look at the patch).

[1] https://en.wikipedia.org/wiki/SQL/PSM



Re: SQL-standard function body

From
Tom Lane
Date:
Thomas Munro <thomas.munro@gmail.com> writes:
> On Wed, Jul 1, 2020 at 5:58 AM Robert Haas <robertmhaas@gmail.com> wrote:
>> With what other implementations is it compatible?

> Judging by the Wikipedia article[1], it sounds like at least DB2 and
> MySQL/MariaDB are purposely striving for conformance.
> [1] https://en.wikipedia.org/wiki/SQL/PSM

but ... but ... but ... that's about SQL/PSM, which is not this.

Having said that, I wonder whether this work could be repurposed
to be the start of a real SQL/PSM implementation.  There's other
stuff in SQL/PSM, but a big part of it is routines that are written
with syntax like this.

            regards, tom lane



Re: SQL-standard function body

From
Pavel Stehule
Date:


st 1. 7. 2020 v 22:54 odesílatel Vik Fearing <vik@postgresfriends.org> napsal:
On 7/1/20 10:34 PM, Pavel Stehule wrote:
> st 1. 7. 2020 v 22:31 odesílatel Vik Fearing <vik@postgresfriends.org>
> napsal:
>
>> On 7/1/20 9:32 PM, Pavel Stehule wrote:
>>> st 1. 7. 2020 v 20:19 odesílatel Vik Fearing <vik@postgresfriends.org>
>>> napsal:
>>>
>>>> On 7/1/20 3:36 PM, Robert Haas wrote:
>>>>> I actually don't have a very clear idea of what the standard has to
>>>>> say about SQL-language functions. Does it just say it's a list of
>>>>> statements, or does it involve variables and control-flow constructs
>>>>> and stuff like that, too?
>>>>
>>>>
>>>> It's either a single sql statement, or a collection of them between
>>>> "begin atomic" and "end".  There are no variables or flow control
>>>> constructs or anything like that, just as there are no such things
>>>> outside of a function.
>>>>
>>>
>>> What is the source of this comment?
>>
>>
>> The SQL Standard.
>>
>
> The SQL Standard is really big, and is very possible so I miss this part.
> Can you send me a link?


ISO/IEC 9075-2:2016 Section 11.60 <SQL-invoked routine>

I am looking there, and it looks like a subset of SQL/PSM or better - SQL/PSM is extending this. But this part is a little bit strange, because it doesn't introduce its own variables, but it is working with the concept of host variables and is a little bit messy (for me). Looks like it is introduced for usage in triggers. If we support triggers without trigger functions, then it has sense. Without it - It is hard for me to imagine a use case for this reduced language.

Regards

Pavel



--
Vik Fearing

Re: SQL-standard function body

From
Thomas Munro
Date:
On Wed, Jul 1, 2020 at 5:49 AM Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:
> - More test coverage is needed.  Surprisingly, there wasn't actually any
> test AFAICT that just creates and SQL function and runs it.  Most of
> that code is tested incidentally, but there is very little or no
> targeted testing of this functionality.

FYI cfbot showed a sign of some kind of error_context_stack corruption
while running "DROP TABLE functest3 CASCADE;".



Re: SQL-standard function body

From
Tom Lane
Date:
Thomas Munro <thomas.munro@gmail.com> writes:
> On Wed, Jul 1, 2020 at 5:49 AM Peter Eisentraut
> <peter.eisentraut@2ndquadrant.com> wrote:
>> - More test coverage is needed.  Surprisingly, there wasn't actually any
>> test AFAICT that just creates and SQL function and runs it.  Most of
>> that code is tested incidentally, but there is very little or no
>> targeted testing of this functionality.

> FYI cfbot showed a sign of some kind of error_context_stack corruption
> while running "DROP TABLE functest3 CASCADE;".

BTW, it occurs to me after answering bug #16534 that
contrib/earthdistance's SQL functions would be great candidates for this
new syntax.  Binding their references at creation time is really exactly
what we want.

I still feel that we can't just replace the existing implementation,
though, as that would kill too many use-cases where late binding is
helpful.

            regards, tom lane



Re: SQL-standard function body

From
Peter Eisentraut
Date:
On 2020-06-30 19:49, Peter Eisentraut wrote:
> This adds support for writing CREATE FUNCTION and CREATE PROCEDURE
> statements for language SQL with a function body that conforms to the
> SQL standard and is portable to other implementations.

Here is a new patch.  The only significant change is that pg_dump 
support is now fixed.  Per the discussion in [0], I have introduced a 
new function pg_get_function_sqlbody() that just produces the formatted 
SQL body, not the whole function definition.  All the tests now pass. 
As mentioned before, more tests are probably needed, so if reviewers 
just want to play with this and find things that don't work, those could 
be put into test cases, for example.

As a thought, a couple of things could probably be separated from this 
patch and considered separately:

1. making LANGUAGE SQL the default

2. the RETURN statement

If reviewers think that would be sensible, I can prepare separate 
patches for those.


[0]: 
https://www.postgresql.org/message-id/flat/9df8a3d3-13d2-116d-26ab-6a273c1ed38c%402ndquadrant.com

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

Re: SQL-standard function body

From
Peter Eisentraut
Date:
Some conflicts have emerged, so here is an updated patch.

I have implemented/fixed the inlining of set-returning functions written 
in the new style, which was previously marked TODO in the patch.


On 2020-08-28 07:33, Peter Eisentraut wrote:
> On 2020-06-30 19:49, Peter Eisentraut wrote:
>> This adds support for writing CREATE FUNCTION and CREATE PROCEDURE
>> statements for language SQL with a function body that conforms to the
>> SQL standard and is portable to other implementations.
> 
> Here is a new patch.  The only significant change is that pg_dump
> support is now fixed.  Per the discussion in [0], I have introduced a
> new function pg_get_function_sqlbody() that just produces the formatted
> SQL body, not the whole function definition.  All the tests now pass.
> As mentioned before, more tests are probably needed, so if reviewers
> just want to play with this and find things that don't work, those could
> be put into test cases, for example.
> 
> As a thought, a couple of things could probably be separated from this
> patch and considered separately:
> 
> 1. making LANGUAGE SQL the default
> 
> 2. the RETURN statement
> 
> If reviewers think that would be sensible, I can prepare separate
> patches for those.
> 
> 
> [0]:
> https://www.postgresql.org/message-id/flat/9df8a3d3-13d2-116d-26ab-6a273c1ed38c%402ndquadrant.com
> 


-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

Re: SQL-standard function body

From
Michael Paquier
Date:
On Mon, Sep 07, 2020 at 08:00:08AM +0200, Peter Eisentraut wrote:
> Some conflicts have emerged, so here is an updated patch.
>
> I have implemented/fixed the inlining of set-returning functions written in
> the new style, which was previously marked TODO in the patch.

The CF bot is telling that this patch fails to apply.  Could you send
a rebase?
--
Michael

Attachment

Re: SQL-standard function body

From
Peter Eisentraut
Date:
On 2020-09-29 07:42, Michael Paquier wrote:
> On Mon, Sep 07, 2020 at 08:00:08AM +0200, Peter Eisentraut wrote:
>> Some conflicts have emerged, so here is an updated patch.
>>
>> I have implemented/fixed the inlining of set-returning functions written in
>> the new style, which was previously marked TODO in the patch.
> 
> The CF bot is telling that this patch fails to apply.  Could you send
> a rebase?

Here is a rebase, no functionality changes.

As indicated earlier, I'll also send some sub-patches as separate 
submissions.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

Re: SQL-standard function body

From
Peter Eisentraut
Date:
Here is another updated patch.  I did some merging and some small fixes 
and introduced the pg_proc column prosqlbody to store the parsed 
function body separately from probin.  Aside from one TODO left it seems 
feature-complete to me for now.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

Re: SQL-standard function body

From
Georgios Kokolatos
Date:
Hi,

I noticed that this patch fails on the cfbot.
For this, I changed the status to: 'Waiting on Author'.

Cheers,
//Georgios

The new status of this patch is: Waiting on Author

Re: SQL-standard function body

From
Peter Eisentraut
Date:
On 2020-11-10 16:21, Georgios Kokolatos wrote:
> Hi,
> 
> I noticed that this patch fails on the cfbot.
> For this, I changed the status to: 'Waiting on Author'.
> 
> Cheers,
> //Georgios
> 
> The new status of this patch is: Waiting on Author
> 

Here is an updated patch to get it building again.

-- 
Peter Eisentraut
2ndQuadrant, an EDB company
https://www.2ndquadrant.com/

Attachment

Re: SQL-standard function body

From
Peter Eisentraut
Date:
On 2020-11-20 08:25, Peter Eisentraut wrote:
> On 2020-11-10 16:21, Georgios Kokolatos wrote:
>> Hi,
>>
>> I noticed that this patch fails on the cfbot.
>> For this, I changed the status to: 'Waiting on Author'.
>>
>> Cheers,
>> //Georgios
>>
>> The new status of this patch is: Waiting on Author
>>
> 
> Here is an updated patch to get it building again.

Another updated patch to get things building again.  I've also fixed the 
last TODO I had in there in qualifying function arguments as necessary 
in ruleutils.c.

-- 
Peter Eisentraut
2ndQuadrant, an EDB company
https://www.2ndquadrant.com/

Attachment

Re: SQL-standard function body

From
Peter Eisentraut
Date:
On 15.03.21 09:14, Julien Rouhaud wrote:
> On Mon, Mar 15, 2021 at 04:03:44PM +0800, Julien Rouhaud wrote:
>> On Mon, Mar 15, 2021 at 01:05:11AM -0500, Jaime Casanova wrote:
>>> I found another problem when using CASE expressions:
>>>
>>> CREATE OR REPLACE FUNCTION foo_case()
>>> RETURNS boolean
>>> LANGUAGE SQL
>>> BEGIN ATOMIC
>>>      select case when random() > 0.5 then true else false end;
>>> END;
>>>
>>> apparently the END in the CASE expression is interpreted as the END of
>>> the function
>>
>> I think that it's an issue in psql scanner.  If you escape the semicolon or
>> force a single query execution (say with psql -c), it works as expected.
> 
> Applying the following diff (not sending a patch to avoid breaking the cfbot)
> the issue and doesn't seem to break anything else:

Right.  Here is a new patch with that fix added and a small conflict 
resolved.

Attachment

Re: SQL-standard function body

From
Jaime Casanova
Date:
On Fri, Mar 19, 2021 at 8:49 AM Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:
>
> Right.  Here is a new patch with that fix added and a small conflict
> resolved.

Great.

It seems print_function_sqlbody() is not protected to avoid receiving
a function that hasn't a standard sql body in
src/backend/utils/adt/ruleutils.c:3292, but instead it has an assert
that gets hit with something like this:

CREATE FUNCTION foo() RETURNS int LANGUAGE SQL AS $$ SELECT 1 $$;
SELECT pg_get_function_sqlbody('foo'::regproc);

--
Jaime Casanova
Director de Servicios Profesionales
SYSTEMGUARDS - Consultores de PostgreSQL



Re: SQL-standard function body

From
Julien Rouhaud
Date:
On Tue, Mar 23, 2021 at 11:28:55PM -0500, Jaime Casanova wrote:
> On Fri, Mar 19, 2021 at 8:49 AM Peter Eisentraut
> <peter.eisentraut@2ndquadrant.com> wrote:
> >
> > Right.  Here is a new patch with that fix added and a small conflict
> > resolved.
> 
> Great.
> 
> It seems print_function_sqlbody() is not protected to avoid receiving
> a function that hasn't a standard sql body in
> src/backend/utils/adt/ruleutils.c:3292, but instead it has an assert
> that gets hit with something like this:
> 
> CREATE FUNCTION foo() RETURNS int LANGUAGE SQL AS $$ SELECT 1 $$;
> SELECT pg_get_function_sqlbody('foo'::regproc);

It would also be good to add a regression test checking that we can't define a
function with both a prosrc and a prosqlbody.



@@ -76,6 +77,7 @@ ProcedureCreate(const char *procedureName,
                Oid languageValidator,
                const char *prosrc,
                const char *probin,
+               Node *prosqlbody,
                char prokind,
                bool security_definer,
                bool isLeakProof,
@@ -119,8 +121,6 @@ ProcedureCreate(const char *procedureName,
    /*
     * sanity checks
     */
-   Assert(PointerIsValid(prosrc));
-
    parameterCount = parameterTypes->dim1;


Shouldn't we still assert that we either have a valid procsrc or valid
prosqlbody?

No other comments apart from that!



Re: SQL-standard function body

From
Peter Eisentraut
Date:
On 31.03.21 12:12, Julien Rouhaud wrote:
> On Tue, Mar 23, 2021 at 11:28:55PM -0500, Jaime Casanova wrote:
>> On Fri, Mar 19, 2021 at 8:49 AM Peter Eisentraut
>> <peter.eisentraut@2ndquadrant.com> wrote:
>>>
>>> Right.  Here is a new patch with that fix added and a small conflict
>>> resolved.
>>
>> Great.
>>
>> It seems print_function_sqlbody() is not protected to avoid receiving
>> a function that hasn't a standard sql body in
>> src/backend/utils/adt/ruleutils.c:3292, but instead it has an assert
>> that gets hit with something like this:
>>
>> CREATE FUNCTION foo() RETURNS int LANGUAGE SQL AS $$ SELECT 1 $$;
>> SELECT pg_get_function_sqlbody('foo'::regproc);

fixed

> It would also be good to add a regression test checking that we can't define a
> function with both a prosrc and a prosqlbody.

done

> @@ -76,6 +77,7 @@ ProcedureCreate(const char *procedureName,
>                  Oid languageValidator,
>                  const char *prosrc,
>                  const char *probin,
> +               Node *prosqlbody,
>                  char prokind,
>                  bool security_definer,
>                  bool isLeakProof,
> @@ -119,8 +121,6 @@ ProcedureCreate(const char *procedureName,
>      /*
>       * sanity checks
>       */
> -   Assert(PointerIsValid(prosrc));
> -
>      parameterCount = parameterTypes->dim1;
> 
> 
> Shouldn't we still assert that we either have a valid procsrc or valid
> prosqlbody?

fixed

New patch attached.

Attachment

Re: SQL-standard function body

From
Julien Rouhaud
Date:
On Fri, Apr 02, 2021 at 02:25:15PM +0200, Peter Eisentraut wrote:
> 
> New patch attached.

Thanks, it all looks good to me.  I just spot a few minor formatting issues:

@@ -2968,6 +2973,13 @@ pg_get_functiondef(PG_FUNCTION_ARGS)
    }

    /* And finally the function definition ... */
+   tmp = SysCacheGetAttr(PROCOID, proctup, Anum_pg_proc_prosqlbody, &isnull);
+   if (proc->prolang == SQLlanguageId && !isnull)
+   {
+       print_function_sqlbody(&buf, proctup);
+   }
+   else
+   {
    appendStringInfoString(&buf, "AS ");

    tmp = SysCacheGetAttr(PROCOID, proctup, Anum_pg_proc_probin, &isnull);
@@ -2999,6 +3011,7 @@ pg_get_functiondef(PG_FUNCTION_ARGS)
    appendBinaryStringInfo(&buf, dq.data, dq.len);
    appendStringInfoString(&buf, prosrc);
    appendBinaryStringInfo(&buf, dq.data, dq.len);
+   }

The curly braces could probably be removed for the if branch, and the code in
the else branch isn't properly indented.

Other occurences:

+   else
+   {
+   src = TextDatumGetCString(tmp);
+
+   callback_arg.prosrc = src;
+
    /*
     * Set up to handle parameters while parsing the function body.  We need a
     * dummy FuncExpr node containing the already-simplified arguments to pass
@@ -4317,6 +4337,7 @@ inline_function(Oid funcid, Oid result_type, Oid result_collid,
    querytree = transformTopLevelStmt(pstate, linitial(raw_parsetree_list));

    free_parsestate(pstate);
+   }

and

+   else
+   {
+   char       *src;
+
+   src = TextDatumGetCString(tmp);
+
+   callback_arg.prosrc = src;
+
    /*
     * Set up to handle parameters while parsing the function body.  We can
     * use the FuncExpr just created as the input for
@@ -4829,18 +4878,6 @@ inline_set_returning_function(PlannerInfo *root, RangeTblEntry *rte)
                                      (Node *) fexpr,
                                      fexpr->inputcollid);

-   /*

and

@@ -2968,6 +2973,13 @@ pg_get_functiondef(PG_FUNCTION_ARGS)
    }

    /* And finally the function definition ... */
+   tmp = SysCacheGetAttr(PROCOID, proctup, Anum_pg_proc_prosqlbody, &isnull);
+   if (proc->prolang == SQLlanguageId && !isnull)
+   {
+       print_function_sqlbody(&buf, proctup);
+   }
+   else
+   {
    appendStringInfoString(&buf, "AS ");

    tmp = SysCacheGetAttr(PROCOID, proctup, Anum_pg_proc_probin, &isnull);
@@ -2999,6 +3011,7 @@ pg_get_functiondef(PG_FUNCTION_ARGS)
    appendBinaryStringInfo(&buf, dq.data, dq.len);
    appendStringInfoString(&buf, prosrc);
    appendBinaryStringInfo(&buf, dq.data, dq.len);
+   }

and

@@ -12290,7 +12309,11 @@ dumpFunc(Archive *fout, const FuncInfo *finfo)
     * versions would set it to "-".  There are no known cases in which prosrc
     * is unused, so the tests below for "-" are probably useless.
     */
-   if (probin[0] != '\0' && strcmp(probin, "-") != 0)
+   if (prosqlbody)
+   {
+       appendPQExpBufferStr(asPart, prosqlbody);
+   }


Are you planning to run pg_indent before committing or would that add too much
noise?

Anyway since it's only stylistic issues and the feature freeze is getting
closer I'm marking the patch as ready for committer.



Re: SQL-standard function body

From
Peter Eisentraut
Date:
On 03.04.21 05:39, Julien Rouhaud wrote:
> Are you planning to run pg_indent before committing or would that add too much
> noise?

Yeah, I figured I'd leave that for later, to not bloat the patch so much.

> Anyway since it's only stylistic issues and the feature freeze is getting
> closer I'm marking the patch as ready for committer.

Committed.  Thanks!



Re: SQL-standard function body

From
Tom Lane
Date:
Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:
> Committed.  Thanks!

Buildfarm suggests this has some issues under force_parallel_mode.
I'm wondering about missed fields in outfuncs/readfuncs, or the like.

            regards, tom lane



Re: SQL-standard function body

From
Michael Paquier
Date:
On Wed, Apr 07, 2021 at 04:22:17PM -0400, Tom Lane wrote:
> Buildfarm suggests this has some issues under force_parallel_mode.
> I'm wondering about missed fields in outfuncs/readfuncs, or the like.

The problem looks a bit more fundamental to me, as there seems to be
some confusion with the concept of what should be the query string
when it comes to prosqlbody with a parallel run, as it replaces prosrc
in some cases where the function uses SQL as language.  If the
buildfarm cannot be put back to green, could it be possible to revert
this patch?
--
Michael

Attachment

Re: SQL-standard function body

From
Tom Lane
Date:
Michael Paquier <michael@paquier.xyz> writes:
> On Wed, Apr 07, 2021 at 04:22:17PM -0400, Tom Lane wrote:
>> Buildfarm suggests this has some issues under force_parallel_mode.
>> I'm wondering about missed fields in outfuncs/readfuncs, or the like.

> The problem looks a bit more fundamental to me, as there seems to be
> some confusion with the concept of what should be the query string 
> when it comes to prosqlbody with a parallel run, as it replaces prosrc
> in some cases where the function uses SQL as language.  If the
> buildfarm cannot be put back to green, could it be possible to revert
> this patch?

Andres pushed a stopgap fix.  We might end up reverting the patch
altogether for v14, but I don't want to be hasty.  This should be enough
to let people take advantage of the last few hours before feature freeze.

            regards, tom lane



Re: SQL-standard function body

From
Andres Freund
Date:
Hi,

On 2021-04-08 01:16:02 -0400, Tom Lane wrote:
> Michael Paquier <michael@paquier.xyz> writes:
> > On Wed, Apr 07, 2021 at 04:22:17PM -0400, Tom Lane wrote:
> >> Buildfarm suggests this has some issues under force_parallel_mode.
> >> I'm wondering about missed fields in outfuncs/readfuncs, or the like.
>
> > The problem looks a bit more fundamental to me, as there seems to be
> > some confusion with the concept of what should be the query string
> > when it comes to prosqlbody with a parallel run, as it replaces prosrc
> > in some cases where the function uses SQL as language.  If the
> > buildfarm cannot be put back to green, could it be possible to revert
> > this patch?
>
> Andres pushed a stopgap fix.

Let's hope that it does fix it on the BF as well. One holdup was that
check-world didn't succeed with force_parallel_mode=regress even after
the fix - but that turned out to be the fault of

commit 5fd9dfa5f50e4906c35133a414ebec5b6d518493 (HEAD)
Author: Bruce Momjian <bruce@momjian.us>
Date:   2021-04-07 13:06:47 -0400

    Move pg_stat_statements query jumbling to core.

et al.


> We might end up reverting the patch altogether for v14, but I don't
> want to be hasty.  This should be enough to let people take advantage
> of the last few hours before feature freeze.

Yea, I think it'd be good to make that decision after a decent night of
sleep or two. And an actual look at the issues the patch might (or might
not) have.


Independent of this patch, it might be a good idea to have
ExecInitParallelPlan() be robust against NULL querystrings. Places like
executor_errposition() are certainly trying to be...

Greetings,

Andres Freund



Re: SQL-standard function body

From
Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes:
> Independent of this patch, it might be a good idea to have
> ExecInitParallelPlan() be robust against NULL querystrings. Places like
> executor_errposition() are certainly trying to be...

FWIW, I think the long-term drift of things is definitely that
we want to have the querystring available everywhere.  Code like
executor_errposition is from an earlier era before we were trying
to enforce that.  In particular, if the querystring is available in
the leader and not the workers, then you will get different error
reporting behavior in parallel query than non-parallel query, which
is surely a bad thing.

So IMO what you did here is definitely a short-term thing that
we should be looking to revert.  The question at hand is why
Peter's patch broke this in the first place, and how hard it
will be to fix it properly.  I'm entirely on board with reverting
the feature if that isn't readily fixable.

            regards, tom lane



Re: SQL-standard function body

From
Julien Rouhaud
Date:
On Wed, Apr 07, 2021 at 10:27:59PM -0700, Andres Freund wrote:
> 
> One holdup was that
> check-world didn't succeed with force_parallel_mode=regress even after
> the fix - but that turned out to be the fault of
> 
> commit 5fd9dfa5f50e4906c35133a414ebec5b6d518493 (HEAD)
> Author: Bruce Momjian <bruce@momjian.us>
> Date:   2021-04-07 13:06:47 -0400
> 
>     Move pg_stat_statements query jumbling to core.
> 
> et al.

Yep, I'm on it!



Re: SQL-standard function body

From
Tom Lane
Date:
Julien Rouhaud <rjuju123@gmail.com> writes:
> On Wed, Apr 07, 2021 at 10:27:59PM -0700, Andres Freund wrote:
>> One holdup was that
>> check-world didn't succeed with force_parallel_mode=regress even after
>> the fix - but that turned out to be the fault of

>>>> Move pg_stat_statements query jumbling to core.

> Yep, I'm on it!

So far the buildfarm seems to be turning green after b3ee4c503 ...
so I wonder what extra condition is needed to cause the failure
Andres is seeing.

            regards, tom lane



Re: SQL-standard function body

From
Andres Freund
Date:
Hi,

On 2021-04-08 02:05:25 -0400, Tom Lane wrote:
> So far the buildfarm seems to be turning green after b3ee4c503 ...
> so I wonder what extra condition is needed to cause the failure
> Andres is seeing.

Nothing special, really. Surprised the BF doesn't see it:

andres@awork3:~/build/postgres/dev-assert/vpath$ cat /tmp/test.conf
force_parallel_mode=regress
andres@awork3:~/build/postgres/dev-assert/vpath$ make -j48 -s && EXTRA_REGRESS_OPTS='--temp-config /tmp/test.conf' make
-s-C contrib/pg_stat_statements/ check
 
All of PostgreSQL successfully made. Ready to install.
...
The differences that caused some tests to fail can be viewed in the
file "/home/andres/build/postgres/dev-assert/vpath/contrib/pg_stat_statements/regression.diffs".  A copy of the test
summarythat you see
 
above is saved in the file "/home/andres/build/postgres/dev-assert/vpath/contrib/pg_stat_statements/regression.out".
...

andres@awork3:~/build/postgres/dev-assert/vpath$ head -n 30
/home/andres/build/postgres/dev-assert/vpath/contrib/pg_stat_statements/regression.diffs
diff -du10 /home/andres/src/postgresql/contrib/pg_stat_statements/expected/pg_stat_statements.out
/home/andres/build/postgres/dev-assert/vpath/contrib/pg_stat_statements/results/pg_stat_statements.out
--- /home/andres/src/postgresql/contrib/pg_stat_statements/expected/pg_stat_statements.out    2021-04-06
09:08:42.688697932-0700
 
+++ /home/andres/build/postgres/dev-assert/vpath/contrib/pg_stat_statements/results/pg_stat_statements.out
2021-04-0723:30:26.876071024 -0700
 
@@ -118,37 +118,38 @@
  ?column? | ?column?
 ----------+----------
         1 | test
 (1 row)

 DEALLOCATE pgss_test;
 SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
                                     query                                     | calls | rows
 ------------------------------------------------------------------------------+-------+------
  PREPARE pgss_test (int) AS SELECT $1, $2 LIMIT $3                            |     1 |    1
- SELECT $1                                                                   +|     4 |    4
+ PREPARE pgss_test (int) AS SELECT $1, 'test' LIMIT 1;                        |     1 |    1
+ SELECT $1                                                                   +|     8 |    8
                                                                              +|       |
    AS "text"                                                                  |       |
- SELECT $1 + $2                                                               |     2 |    2
- SELECT $1 + $2 + $3 AS "add"                                                 |     3 |    3
- SELECT $1 AS "float"                                                         |     1 |    1
- SELECT $1 AS "int"                                                           |     2 |    2
+ SELECT $1 + $2                                                               |     4 |    4
+ SELECT $1 + $2 + $3 AS "add"                                                 |     6 |    6
+ SELECT $1 AS "float"                                                         |     2 |    2
+ SELECT $1 AS "int"                                                           |     4 |    4
  SELECT $1 AS i UNION SELECT $2 ORDER BY i                                    |     1 |    2
- SELECT $1 || $2                                                              |     1 |    1
- SELECT pg_stat_statements_reset()                                            |     1 |    1


Too tired to figure out why the BF doesn't see this. Perhaps the extra
settings aren't used because it's scripted as an install check?

Greetings,

Andres Freund



Re: SQL-standard function body

From
Andres Freund
Date:
Hi,

On 2021-04-08 01:41:40 -0400, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > Independent of this patch, it might be a good idea to have
> > ExecInitParallelPlan() be robust against NULL querystrings. Places like
> > executor_errposition() are certainly trying to be...
> 
> FWIW, I think the long-term drift of things is definitely that
> we want to have the querystring available everywhere.  Code like
> executor_errposition is from an earlier era before we were trying
> to enforce that.  In particular, if the querystring is available in
> the leader and not the workers, then you will get different error
> reporting behavior in parallel query than non-parallel query, which
> is surely a bad thing.

Yea, I think it's a sensible direction - but I think we should put the
line in the sand earlier on / higher up than ExecInitParallelPlan().

Greetings,

Andres Freund



Re: SQL-standard function body

From
Julien Rouhaud
Date:
On Wed, Apr 07, 2021 at 11:33:20PM -0700, Andres Freund wrote:
> Hi,
> 
> On 2021-04-08 02:05:25 -0400, Tom Lane wrote:
> > So far the buildfarm seems to be turning green after b3ee4c503 ...
> > so I wonder what extra condition is needed to cause the failure
> > Andres is seeing.
> 
> Nothing special, really. Surprised the BF doesn't see it:
> 
> andres@awork3:~/build/postgres/dev-assert/vpath$ cat /tmp/test.conf
> force_parallel_mode=regress
> andres@awork3:~/build/postgres/dev-assert/vpath$ make -j48 -s && EXTRA_REGRESS_OPTS='--temp-config /tmp/test.conf'
make-s -C contrib/pg_stat_statements/ check
 
> All of PostgreSQL successfully made. Ready to install.
> ...
> The differences that caused some tests to fail can be viewed in the
> file "/home/andres/build/postgres/dev-assert/vpath/contrib/pg_stat_statements/regression.diffs".  A copy of the test
summarythat you see
 
> above is saved in the file "/home/andres/build/postgres/dev-assert/vpath/contrib/pg_stat_statements/regression.out".
> ...

Is think this is because the buildfarm client is running installcheck for the
contribs rather than check, and pg_stat_statements/Makefile has:

# Disabled because these tests require "shared_preload_libraries=pg_stat_statements",
# which typical installcheck users do not have (e.g. buildfarm clients).
NO_INSTALLCHECK = 1




Re: SQL-standard function body

From
Tom Lane
Date:
Julien Rouhaud <rjuju123@gmail.com> writes:
> On Wed, Apr 07, 2021 at 11:33:20PM -0700, Andres Freund wrote:
>> Nothing special, really. Surprised the BF doesn't see it:

> Is think this is because the buildfarm client is running installcheck for the
> contribs rather than check, and pg_stat_statements/Makefile has:
> # Disabled because these tests require "shared_preload_libraries=pg_stat_statements",
> # which typical installcheck users do not have (e.g. buildfarm clients).
> NO_INSTALLCHECK = 1

No, because if that were the explanation then we'd be getting no
buildfarm coverage at all for for pg_stat_statements.  Which aside
from being awful contradicts the results at coverage.postgresql.org.

I think Andres has the right idea that there's some more-subtle
variation in the test conditions, but (yawn) too tired to look
into it right now.

            regards, tom lane



Re: SQL-standard function body

From
Michael Paquier
Date:
On Wed, Apr 07, 2021 at 11:35:14PM -0700, Andres Freund wrote:
> On 2021-04-08 01:41:40 -0400, Tom Lane wrote:
>> Andres Freund <andres@anarazel.de> writes:
>> FWIW, I think the long-term drift of things is definitely that
>> we want to have the querystring available everywhere.  Code like
>> executor_errposition is from an earlier era before we were trying
>> to enforce that.  In particular, if the querystring is available in
>> the leader and not the workers, then you will get different error
>> reporting behavior in parallel query than non-parallel query, which
>> is surely a bad thing.
>
> Yea, I think it's a sensible direction - but I think we should put the
> line in the sand earlier on / higher up than ExecInitParallelPlan().

Indeed, I agree that enforcing the availability of querystring
everywhere sounds like a sensible thing to do in terms of consistency,
and that's my impression when I scanned the parallel execution code,
and I don't really get why SQL function bodies should not bind by this
rule.  Would people object if I add an open item to track that?
--
Michael

Attachment

Re: SQL-standard function body

From
Julien Rouhaud
Date:
On Thu, Apr 08, 2021 at 04:54:56PM +0900, Michael Paquier wrote:
> On Wed, Apr 07, 2021 at 11:35:14PM -0700, Andres Freund wrote:
> > On 2021-04-08 01:41:40 -0400, Tom Lane wrote:
> >> Andres Freund <andres@anarazel.de> writes:
> >> FWIW, I think the long-term drift of things is definitely that
> >> we want to have the querystring available everywhere.  Code like
> >> executor_errposition is from an earlier era before we were trying
> >> to enforce that.  In particular, if the querystring is available in
> >> the leader and not the workers, then you will get different error
> >> reporting behavior in parallel query than non-parallel query, which
> >> is surely a bad thing.
> > 
> > Yea, I think it's a sensible direction - but I think we should put the
> > line in the sand earlier on / higher up than ExecInitParallelPlan().
> 
> Indeed, I agree that enforcing the availability of querystring
> everywhere sounds like a sensible thing to do in terms of consistency,
> and that's my impression when I scanned the parallel execution code,
> and I don't really get why SQL function bodies should not bind by this
> rule.  Would people object if I add an open item to track that?

It makes sense, +1 for an open item.



Re: SQL-standard function body

From
Julien Rouhaud
Date:
On Thu, Apr 08, 2021 at 02:58:02AM -0400, Tom Lane wrote:
> Julien Rouhaud <rjuju123@gmail.com> writes:
> > On Wed, Apr 07, 2021 at 11:33:20PM -0700, Andres Freund wrote:
> >> Nothing special, really. Surprised the BF doesn't see it:
> 
> > Is think this is because the buildfarm client is running installcheck for the
> > contribs rather than check, and pg_stat_statements/Makefile has:
> > # Disabled because these tests require "shared_preload_libraries=pg_stat_statements",
> > # which typical installcheck users do not have (e.g. buildfarm clients).
> > NO_INSTALLCHECK = 1
> 
> No, because if that were the explanation then we'd be getting no
> buildfarm coverage at all for for pg_stat_statements.  Which aside
> from being awful contradicts the results at coverage.postgresql.org.

Is there any chance that coverage.postgresql.org isn't backed by the buildfarm
client but a plain make check-world or something like that?

> I think Andres has the right idea that there's some more-subtle
> variation in the test conditions, but (yawn) too tired to look
> into it right now.

I tried to look at some force-parallel-mode animal, like mantid, and I don't
see any evidence of pg_stat_statements being run by a "make check", and only a
few contrib modules seem to have an explicit check phase.  However, looking at
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=mantid&dt=2021-04-08%2007%3A07%3A05
I see

https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=mantid&dt=2021-04-08%2007%3A07%3A05&stg=contrib-install-check-C:

[...]
make -C pg_stat_statements installcheck
make[1]: Entering directory `/u1/tac/build-farm-11/buildroot/HEAD/pgsql.build/contrib/pg_stat_statements'
make[1]: Nothing to be done for `installcheck'.



Re: SQL-standard function body

From
Tom Lane
Date:
Julien Rouhaud <rjuju123@gmail.com> writes:
> On Thu, Apr 08, 2021 at 02:58:02AM -0400, Tom Lane wrote:
>> No, because if that were the explanation then we'd be getting no
>> buildfarm coverage at all for for pg_stat_statements.  Which aside
>> from being awful contradicts the results at coverage.postgresql.org.

> Is there any chance that coverage.postgresql.org isn't backed by the buildfarm
> client but a plain make check-world or something like that?

Hmm, I think you're right.  Poking around in the log files from one
of my own buildfarm animals, there's no evidence that pg_stat_statements
is being tested at all.  Needless to say, that's just horrid :-(

I see that contrib/test_decoding also sets NO_INSTALLCHECK = 1,
and the reason it gets tested is that the buildfarm script has
a special module for that.  I guess we need to clone that module,
or maybe better, find a way to generalize it.

There are also some src/test/modules modules that set NO_INSTALLCHECK,
but apparently those do have coverage (modules-check is the step that
runs their SQL tests, and then the TAP tests if any get broken out
as separate buildfarm steps).

            regards, tom lane



Re: SQL-standard function body

From
Andrew Dunstan
Date:
On 4/8/21 2:40 AM, Julien Rouhaud wrote:
> On Wed, Apr 07, 2021 at 11:33:20PM -0700, Andres Freund wrote:
>> Hi,
>>
>> On 2021-04-08 02:05:25 -0400, Tom Lane wrote:
>>> So far the buildfarm seems to be turning green after b3ee4c503 ...
>>> so I wonder what extra condition is needed to cause the failure
>>> Andres is seeing.
>> Nothing special, really. Surprised the BF doesn't see it:
>>
>> andres@awork3:~/build/postgres/dev-assert/vpath$ cat /tmp/test.conf
>> force_parallel_mode=regress
>> andres@awork3:~/build/postgres/dev-assert/vpath$ make -j48 -s && EXTRA_REGRESS_OPTS='--temp-config /tmp/test.conf'
make-s -C contrib/pg_stat_statements/ check
 
>> All of PostgreSQL successfully made. Ready to install.
>> ...
>> The differences that caused some tests to fail can be viewed in the
>> file "/home/andres/build/postgres/dev-assert/vpath/contrib/pg_stat_statements/regression.diffs".  A copy of the test
summarythat you see
 
>> above is saved in the file
"/home/andres/build/postgres/dev-assert/vpath/contrib/pg_stat_statements/regression.out".
>> ...
> Is think this is because the buildfarm client is running installcheck for the
> contribs rather than check, and pg_stat_statements/Makefile has:
>
> # Disabled because these tests require "shared_preload_libraries=pg_stat_statements",
> # which typical installcheck users do not have (e.g. buildfarm clients).
> NO_INSTALLCHECK = 1
>
>
>


Yeah, Julien is right, we run "make check" for these in src/test/modules
but I missed contrib. I have fixed this on crake so we get some
immediate coverage and a fix will be pushed to github shortly.


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com




Re: SQL-standard function body

From
Alvaro Herrera
Date:
On 2021-Apr-08, Julien Rouhaud wrote:

> On Thu, Apr 08, 2021 at 02:58:02AM -0400, Tom Lane wrote:

> > No, because if that were the explanation then we'd be getting no
> > buildfarm coverage at all for for pg_stat_statements.  Which aside
> > from being awful contradicts the results at coverage.postgresql.org.
> 
> Is there any chance that coverage.postgresql.org isn't backed by the buildfarm
> client but a plain make check-world or something like that?

Yes, coverage.pg.org runs "make check-world".

Maybe it would make sense to change that script, so that it runs the
buildfarm's run_build.pl script instead of "make check-world".  That
would make coverage.pg.org report what the buildfarm actually tests ...
it would have made this problem a bit more obvious.

-- 
Álvaro Herrera                            39°49'30"S 73°17'W



Re: SQL-standard function body

From
Michael Paquier
Date:
On Thu, Apr 08, 2021 at 12:21:05PM -0400, Tom Lane wrote:
> I see that contrib/test_decoding also sets NO_INSTALLCHECK = 1,
> and the reason it gets tested is that the buildfarm script has
> a special module for that.  I guess we need to clone that module,
> or maybe better, find a way to generalize it.
>
> There are also some src/test/modules modules that set NO_INSTALLCHECK,
> but apparently those do have coverage (modules-check is the step that
> runs their SQL tests, and then the TAP tests if any get broken out
> as separate buildfarm steps).

FWIW, on Windows any module with NO_INSTALLCHECK does not get tested
as we rely mostly on an installed server to do all the tests and avoid
the performance impact of setting up a new server for each module's
test.
--
Michael

Attachment

Re: SQL-standard function body

From
Tom Lane
Date:
Julien Rouhaud <rjuju123@gmail.com> writes:
> On Thu, Apr 08, 2021 at 04:54:56PM +0900, Michael Paquier wrote:
>> Indeed, I agree that enforcing the availability of querystring
>> everywhere sounds like a sensible thing to do in terms of consistency,
>> and that's my impression when I scanned the parallel execution code,
>> and I don't really get why SQL function bodies should not bind by this
>> rule.  Would people object if I add an open item to track that?

> It makes sense, +1 for an open item.

So here's what I propose to do about this.

0001 attached reverts the patch's change to remove the NOT NULL
constraint on pg_proc.prosrc.  I think that was an extremely poor
decision; it risks breaking non-core PLs, and for that matter I'm not
sure the core PLs wouldn't crash on null prosrc.  It is not any harder
for the SQL-language-related code to condition its checks on not-null
prosqlbody instead of null prosrc.  Of course that then requires us to
put something into prosrc for these newfangled functions, but in 0001
I just used an empty string.  (This patch also adds an Assert to
standard_ExecutorStart checking that some source text was provided,
responding to Andres' point that we should be checking that upstream
of parallel query.  We should then revert b3ee4c503, but for simplicity
I didn't include that here.)

0002 addresses a different missing-source-text problem, which is that
the patch didn't bother to provide source text while running parse
analysis on the SQL function body.  That means no error cursors for
problems; which might seem cosmetic on the toy example I added to the
regression tests, but it won't be for people writing functions that
are dozens or hundreds of lines long.

Finally, 0003 might be a bit controversial: it changes the stored
prosrc for new-style SQL functions to be the query text of the CREATE
FUNCTION command.  The main argument I can see being made against this
is that it'll bloat the pg_proc entry.  But I think that that's
not a terribly reasonable concern, because the source text is going
to be a good deal smaller than the nodeToString representation in
just about every case.

The real value of 0003 of course would be to get an error cursor at
runtime, but I failed to create an example where that would happen
today.  Right now there are only three calls of executor_errposition,
and all of them are for cases that are already rejected by the parser,
so they're effectively unreachable.  A scenario that seems more likely
to be reachable is a failure reported during function inlining, but
most of the reasons I can think of for that also seem unreachable given
the already-parse-analyzed nature of the function body in these cases.
Maybe I'm just under-caffeinated today.

Another point here is that for any error cursor to appear, we need
not only source text at hand but also token locations in the query
tree nodes.  Right now, since readfuncs.c intentionally discards
those locations, we won't have that.  There is not-normally-compiled
logic to reload those location fields, though, and I think before too
long we'll want to enable it in some mainstream cases --- notably
parallel query's shipping of querytrees to workers.  However, until
it gets easier to reach cases where an error-with-location can be
thrown from the executor, I don't feel a need to do that.

I do have ambitions to make execution-time errors produce cursors
in more cases, so I think this will come to fruition before long,
but not in v14.

One could make an argument, therefore, for holding off 0003 until
there's more support for execution-time error cursors.  I don't
think we should though, for two reasons:
1. It'd be better to keep the pg_proc representation of new-style
SQL functions stable across versions.
2. Storing the CREATE text means we'll capture comments associated
with the function text, which is something that at least some
people will complain about the loss of.  Admittedly we have no way
to re-integrate the comments into the de-parsed body, but some
folks might be satisfied with grabbing the prosrc text.

Thoughts?

            regards, tom lane

diff --git a/src/backend/catalog/pg_proc.c b/src/backend/catalog/pg_proc.c
index cd13e63852..478dbde3fe 100644
--- a/src/backend/catalog/pg_proc.c
+++ b/src/backend/catalog/pg_proc.c
@@ -121,7 +121,7 @@ ProcedureCreate(const char *procedureName,
     /*
      * sanity checks
      */
-    Assert(PointerIsValid(prosrc) || PointerIsValid(prosqlbody));
+    Assert(PointerIsValid(prosrc));

     parameterCount = parameterTypes->dim1;
     if (parameterCount < 0 || parameterCount > FUNC_MAX_ARGS)
@@ -336,10 +336,7 @@ ProcedureCreate(const char *procedureName,
         values[Anum_pg_proc_protrftypes - 1] = trftypes;
     else
         nulls[Anum_pg_proc_protrftypes - 1] = true;
-    if (prosrc)
-        values[Anum_pg_proc_prosrc - 1] = CStringGetTextDatum(prosrc);
-    else
-        nulls[Anum_pg_proc_prosrc - 1] = true;
+    values[Anum_pg_proc_prosrc - 1] = CStringGetTextDatum(prosrc);
     if (probin)
         values[Anum_pg_proc_probin - 1] = CStringGetTextDatum(probin);
     else
@@ -874,26 +871,29 @@ fmgr_sql_validator(PG_FUNCTION_ARGS)
     /* Postpone body checks if !check_function_bodies */
     if (check_function_bodies)
     {
+        tmp = SysCacheGetAttr(PROCOID, tuple, Anum_pg_proc_prosrc, &isnull);
+        if (isnull)
+            elog(ERROR, "null prosrc");
+
+        prosrc = TextDatumGetCString(tmp);
+
         /*
          * Setup error traceback support for ereport().
          */
         callback_arg.proname = NameStr(proc->proname);
-        callback_arg.prosrc = NULL;
+        callback_arg.prosrc = prosrc;

         sqlerrcontext.callback = sql_function_parse_error_callback;
         sqlerrcontext.arg = (void *) &callback_arg;
         sqlerrcontext.previous = error_context_stack;
         error_context_stack = &sqlerrcontext;

-        tmp = SysCacheGetAttr(PROCOID, tuple, Anum_pg_proc_prosrc, &isnull);
-        if (isnull)
+        /* If we have prosqlbody, pay attention to that not prosrc */
+        tmp = SysCacheGetAttr(PROCOID, tuple, Anum_pg_proc_prosqlbody, &isnull);
+        if (!isnull)
         {
             Node       *n;

-            tmp = SysCacheGetAttr(PROCOID, tuple, Anum_pg_proc_prosqlbody, &isnull);
-            if (isnull)
-                elog(ERROR, "null prosrc and prosqlbody");
-
             n = stringToNode(TextDatumGetCString(tmp));
             if (IsA(n, List))
                 querytree_list = castNode(List, n);
@@ -902,10 +902,6 @@ fmgr_sql_validator(PG_FUNCTION_ARGS)
         }
         else
         {
-            prosrc = TextDatumGetCString(tmp);
-
-            callback_arg.prosrc = prosrc;
-
             /*
              * We can't do full prechecking of the function definition if there
              * are any polymorphic input types, because actual datatypes of
@@ -1001,9 +997,6 @@ function_parse_error_transpose(const char *prosrc)
     int            newerrposition;
     const char *queryText;

-    if (!prosrc)
-        return false;
-
     /*
      * Nothing to do unless we are dealing with a syntax error that has a
      * cursor position.
diff --git a/src/backend/commands/functioncmds.c b/src/backend/commands/functioncmds.c
index 199029b7a8..dc317c83af 100644
--- a/src/backend/commands/functioncmds.c
+++ b/src/backend/commands/functioncmds.c
@@ -958,8 +958,17 @@ interpret_AS_clause(Oid languageOid, const char *languageName,
             *sql_body_out = (Node *) q;
         }

+        /*
+         * We must put something in prosrc.  For the moment, just record an
+         * empty string.  It might be useful to store the original text of the
+         * CREATE FUNCTION statement --- but to make actual use of that in
+         * error reports, we'd also have to adjust readfuncs.c to not throw
+         * away node location fields when reading prosqlbody.
+         */
+        *prosrc_str_p = pstrdup("");
+
+        /* But we definitely don't need probin. */
         *probin_str_p = NULL;
-        *prosrc_str_p = NULL;
     }
     else
     {
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index b2e2df8773..2cf6dad768 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -195,6 +195,8 @@ standard_ExecutorStart(QueryDesc *queryDesc, int eflags)
             palloc0(nParamExec * sizeof(ParamExecData));
     }

+    /* We now require all callers to provide sourceText */
+    Assert(queryDesc->sourceText != NULL);
     estate->es_sourceText = queryDesc->sourceText;

     /*
diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index 642683843e..39580f7d57 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -667,6 +667,15 @@ init_sql_fcache(FunctionCallInfo fcinfo, Oid collation, bool lazyEvalOK)
                           procedureTuple,
                           Anum_pg_proc_prosrc,
                           &isNull);
+    if (isNull)
+        elog(ERROR, "null prosrc for function %u", foid);
+    fcache->src = TextDatumGetCString(tmp);
+
+    /* If we have prosqlbody, pay attention to that not prosrc. */
+    tmp = SysCacheGetAttr(PROCOID,
+                          procedureTuple,
+                          Anum_pg_proc_prosqlbody,
+                          &isNull);

     /*
      * Parse and rewrite the queries in the function text.  Use sublists to
@@ -678,18 +687,11 @@ init_sql_fcache(FunctionCallInfo fcinfo, Oid collation, bool lazyEvalOK)
      * plancache.c.
      */
     queryTree_list = NIL;
-    if (isNull)
+    if (!isNull)
     {
         Node       *n;
         List       *stored_query_list;

-        tmp = SysCacheGetAttr(PROCOID,
-                              procedureTuple,
-                              Anum_pg_proc_prosqlbody,
-                              &isNull);
-        if (isNull)
-            elog(ERROR, "null prosrc and prosqlbody for function %u", foid);
-
         n = stringToNode(TextDatumGetCString(tmp));
         if (IsA(n, List))
             stored_query_list = linitial_node(List, castNode(List, n));
@@ -710,8 +712,6 @@ init_sql_fcache(FunctionCallInfo fcinfo, Oid collation, bool lazyEvalOK)
     {
         List       *raw_parsetree_list;

-        fcache->src = TextDatumGetCString(tmp);
-
         raw_parsetree_list = pg_parse_query(fcache->src);

         foreach(lc, raw_parsetree_list)
diff --git a/src/backend/optimizer/util/clauses.c b/src/backend/optimizer/util/clauses.c
index 526997327c..d9ad4efc5e 100644
--- a/src/backend/optimizer/util/clauses.c
+++ b/src/backend/optimizer/util/clauses.c
@@ -4317,32 +4317,37 @@ inline_function(Oid funcid, Oid result_type, Oid result_collid,
                                   ALLOCSET_DEFAULT_SIZES);
     oldcxt = MemoryContextSwitchTo(mycxt);

+    /* Fetch the function body */
+    tmp = SysCacheGetAttr(PROCOID,
+                          func_tuple,
+                          Anum_pg_proc_prosrc,
+                          &isNull);
+    if (isNull)
+        elog(ERROR, "null prosrc for function %u", funcid);
+    src = TextDatumGetCString(tmp);
+
     /*
      * Setup error traceback support for ereport().  This is so that we can
      * finger the function that bad information came from.
      */
     callback_arg.proname = NameStr(funcform->proname);
-    callback_arg.prosrc = NULL;
+    callback_arg.prosrc = src;

     sqlerrcontext.callback = sql_inline_error_callback;
     sqlerrcontext.arg = (void *) &callback_arg;
     sqlerrcontext.previous = error_context_stack;
     error_context_stack = &sqlerrcontext;

-    /* Fetch the function body */
+    /* If we have prosqlbody, pay attention to that not prosrc */
     tmp = SysCacheGetAttr(PROCOID,
                           func_tuple,
-                          Anum_pg_proc_prosrc,
+                          Anum_pg_proc_prosqlbody,
                           &isNull);
-    if (isNull)
+    if (!isNull)
     {
         Node       *n;
         List       *querytree_list;

-        tmp = SysCacheGetAttr(PROCOID, func_tuple, Anum_pg_proc_prosqlbody, &isNull);
-        if (isNull)
-            elog(ERROR, "null prosrc and prosqlbody for function %u", funcid);
-
         n = stringToNode(TextDatumGetCString(tmp));
         if (IsA(n, List))
             querytree_list = linitial_node(List, castNode(List, n));
@@ -4354,10 +4359,6 @@ inline_function(Oid funcid, Oid result_type, Oid result_collid,
     }
     else
     {
-    src = TextDatumGetCString(tmp);
-
-    callback_arg.prosrc = src;
-
     /*
      * Set up to handle parameters while parsing the function body.  We need a
      * dummy FuncExpr node containing the already-simplified arguments to pass
@@ -4658,15 +4659,12 @@ sql_inline_error_callback(void *arg)
     int            syntaxerrposition;

     /* If it's a syntax error, convert to internal syntax error report */
-    if (callback_arg->prosrc)
+    syntaxerrposition = geterrposition();
+    if (syntaxerrposition > 0)
     {
-        syntaxerrposition = geterrposition();
-        if (syntaxerrposition > 0)
-        {
-            errposition(0);
-            internalerrposition(syntaxerrposition);
-            internalerrquery(callback_arg->prosrc);
-        }
+        errposition(0);
+        internalerrposition(syntaxerrposition);
+        internalerrquery(callback_arg->prosrc);
     }

     errcontext("SQL function \"%s\" during inlining", callback_arg->proname);
@@ -4778,6 +4776,7 @@ inline_set_returning_function(PlannerInfo *root, RangeTblEntry *rte)
     Oid            func_oid;
     HeapTuple    func_tuple;
     Form_pg_proc funcform;
+    char       *src;
     Datum        tmp;
     bool        isNull;
     MemoryContext oldcxt;
@@ -4886,31 +4885,36 @@ inline_set_returning_function(PlannerInfo *root, RangeTblEntry *rte)
                                   ALLOCSET_DEFAULT_SIZES);
     oldcxt = MemoryContextSwitchTo(mycxt);

+    /* Fetch the function body */
+    tmp = SysCacheGetAttr(PROCOID,
+                          func_tuple,
+                          Anum_pg_proc_prosrc,
+                          &isNull);
+    if (isNull)
+        elog(ERROR, "null prosrc for function %u", func_oid);
+    src = TextDatumGetCString(tmp);
+
     /*
      * Setup error traceback support for ereport().  This is so that we can
      * finger the function that bad information came from.
      */
     callback_arg.proname = NameStr(funcform->proname);
-    callback_arg.prosrc = NULL;
+    callback_arg.prosrc = src;

     sqlerrcontext.callback = sql_inline_error_callback;
     sqlerrcontext.arg = (void *) &callback_arg;
     sqlerrcontext.previous = error_context_stack;
     error_context_stack = &sqlerrcontext;

-    /* Fetch the function body */
+    /* If we have prosqlbody, pay attention to that not prosrc */
     tmp = SysCacheGetAttr(PROCOID,
                           func_tuple,
-                          Anum_pg_proc_prosrc,
+                          Anum_pg_proc_prosqlbody,
                           &isNull);
-    if (isNull)
+    if (!isNull)
     {
         Node       *n;

-        tmp = SysCacheGetAttr(PROCOID, func_tuple, Anum_pg_proc_prosqlbody, &isNull);
-        if (isNull)
-            elog(ERROR, "null prosrc and prosqlbody for function %u", func_oid);
-
         n = stringToNode(TextDatumGetCString(tmp));
         if (IsA(n, List))
             querytree_list = linitial_node(List, castNode(List, n));
@@ -4927,12 +4931,6 @@ inline_set_returning_function(PlannerInfo *root, RangeTblEntry *rte)
     }
     else
     {
-    char       *src;
-
-    src = TextDatumGetCString(tmp);
-
-    callback_arg.prosrc = src;
-
     /*
      * Set up to handle parameters while parsing the function body.  We can
      * use the FuncExpr just created as the input for
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index d0ea489614..e1c65d8dfc 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -12247,7 +12247,7 @@ dumpFunc(Archive *fout, const FuncInfo *finfo)

     if (fout->remoteVersion >= 140000)
         appendPQExpBufferStr(query,
-                             "CASE WHEN prosrc IS NULL AND lanname = 'sql' THEN pg_get_function_sqlbody(p.oid) END AS
prosqlbody\n");
+                             "pg_get_function_sqlbody(p.oid) AS prosqlbody\n");
     else
         appendPQExpBufferStr(query,
                              "NULL AS prosqlbody\n");
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index fdc2a89085..400b683859 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -512,7 +512,7 @@ describeFunctions(const char *functypes, const char *func_pattern,
                           gettext_noop("Language"));
         if (pset.sversion >= 140000)
             appendPQExpBuffer(&buf,
-                              ",\n COALESCE(p.prosrc, pg_catalog.pg_get_function_sqlbody(p.oid)) as \"%s\"",
+                              ",\n COALESCE(pg_catalog.pg_get_function_sqlbody(p.oid), p.prosrc) as \"%s\"",
                               gettext_noop("Source code"));
         else
             appendPQExpBuffer(&buf,
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index 8d58067d03..448d9898cb 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -112,7 +112,7 @@ CATALOG(pg_proc,1255,ProcedureRelationId) BKI_BOOTSTRAP BKI_ROWTYPE_OID(81,Proce
     Oid            protrftypes[1] BKI_DEFAULT(_null_) BKI_LOOKUP(pg_type);

     /* procedure source text */
-    text        prosrc;
+    text        prosrc BKI_FORCE_NOT_NULL;

     /* secondary procedure info (can be NULL) */
     text        probin BKI_DEFAULT(_null_);
diff --git a/src/include/executor/functions.h b/src/include/executor/functions.h
index dcb8e18437..b56ce26da0 100644
--- a/src/include/executor/functions.h
+++ b/src/include/executor/functions.h
@@ -17,9 +17,6 @@
 #include "nodes/execnodes.h"
 #include "tcop/dest.h"

-/* This struct is known only within executor/functions.c */
-typedef struct SQLFunctionParseInfo *SQLFunctionParseInfoPtr;
-
 /*
  * Data structure needed by the parser callback hooks to resolve parameter
  * references during parsing of a SQL function's body.  This is separate from
@@ -35,6 +32,8 @@ typedef struct SQLFunctionParseInfo
     Oid            collation;        /* function's input collation, if known */
 }            SQLFunctionParseInfo;

+typedef SQLFunctionParseInfo *SQLFunctionParseInfoPtr;
+
 extern Datum fmgr_sql(PG_FUNCTION_ARGS);

 extern SQLFunctionParseInfoPtr prepare_sql_fn_parse_info(HeapTuple procedureTuple,
diff --git a/src/test/regress/expected/create_function_3.out b/src/test/regress/expected/create_function_3.out
index 477130e620..9f8733f75c 100644
--- a/src/test/regress/expected/create_function_3.out
+++ b/src/test/regress/expected/create_function_3.out
@@ -290,6 +290,12 @@ CREATE FUNCTION functest_S_xx(x anyarray) RETURNS anyelement
     LANGUAGE SQL
     RETURN x[1];
 ERROR:  SQL function with unquoted function body cannot have polymorphic arguments
+-- check reporting of parse-analysis errors
+CREATE FUNCTION functest_S_xx(x date) RETURNS boolean
+    LANGUAGE SQL
+    RETURN x > 1;
+ERROR:  operator does not exist: date > integer
+HINT:  No operator matches the given name and argument types. You might need to add explicit type casts.
 -- tricky parsing
 CREATE FUNCTION functest_S_15(x int) RETURNS boolean
 LANGUAGE SQL
diff --git a/src/test/regress/sql/create_function_3.sql b/src/test/regress/sql/create_function_3.sql
index 3575ecc693..9a03301621 100644
--- a/src/test/regress/sql/create_function_3.sql
+++ b/src/test/regress/sql/create_function_3.sql
@@ -191,6 +191,11 @@ CREATE FUNCTION functest_S_xx(x anyarray) RETURNS anyelement
     LANGUAGE SQL
     RETURN x[1];

+-- check reporting of parse-analysis errors
+CREATE FUNCTION functest_S_xx(x date) RETURNS boolean
+    LANGUAGE SQL
+    RETURN x > 1;
+
 -- tricky parsing
 CREATE FUNCTION functest_S_15(x int) RETURNS boolean
 LANGUAGE SQL
diff --git a/src/backend/commands/functioncmds.c b/src/backend/commands/functioncmds.c
index dc317c83af..e7cb5c65e9 100644
--- a/src/backend/commands/functioncmds.c
+++ b/src/backend/commands/functioncmds.c
@@ -852,7 +852,9 @@ static void
 interpret_AS_clause(Oid languageOid, const char *languageName,
                     char *funcname, List *as, Node *sql_body_in,
                     List *parameterTypes, List *inParameterNames,
-                    char **prosrc_str_p, char **probin_str_p, Node **sql_body_out)
+                    char **prosrc_str_p, char **probin_str_p,
+                    Node **sql_body_out,
+                    const char *queryString)
 {
     if (!sql_body_in && !as)
         ereport(ERROR,
@@ -929,6 +931,7 @@ interpret_AS_clause(Oid languageOid, const char *languageName,
                 Query       *q;
                 ParseState *pstate = make_parsestate(NULL);

+                pstate->p_sourcetext = queryString;
                 sql_fn_parser_setup(pstate, pinfo);
                 q = transformStmt(pstate, stmt);
                 if (q->commandType == CMD_UTILITY)
@@ -947,6 +950,7 @@ interpret_AS_clause(Oid languageOid, const char *languageName,
             Query       *q;
             ParseState *pstate = make_parsestate(NULL);

+            pstate->p_sourcetext = queryString;
             sql_fn_parser_setup(pstate, pinfo);
             q = transformStmt(pstate, sql_body_in);
             if (q->commandType == CMD_UTILITY)
@@ -954,6 +958,7 @@ interpret_AS_clause(Oid languageOid, const char *languageName,
                         errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
                         errmsg("%s is not yet supported in unquoted SQL function body",
                                GetCommandTagName(CreateCommandTag(q->utilityStmt))));
+            free_parsestate(pstate);

             *sql_body_out = (Node *) q;
         }
@@ -1220,7 +1225,8 @@ CreateFunction(ParseState *pstate, CreateFunctionStmt *stmt)

     interpret_AS_clause(languageOid, language, funcname, as_clause, stmt->sql_body,
                         parameterTypes_list, inParameterNames_list,
-                        &prosrc_str, &probin_str, &prosqlbody);
+                        &prosrc_str, &probin_str, &prosqlbody,
+                        pstate->p_sourcetext);

     /*
      * Set default values for COST and ROWS depending on other parameters;
diff --git a/src/test/regress/expected/create_function_3.out b/src/test/regress/expected/create_function_3.out
index 9f8733f75c..e11a7a2dad 100644
--- a/src/test/regress/expected/create_function_3.out
+++ b/src/test/regress/expected/create_function_3.out
@@ -295,6 +295,8 @@ CREATE FUNCTION functest_S_xx(x date) RETURNS boolean
     LANGUAGE SQL
     RETURN x > 1;
 ERROR:  operator does not exist: date > integer
+LINE 3:     RETURN x > 1;
+                     ^
 HINT:  No operator matches the given name and argument types. You might need to add explicit type casts.
 -- tricky parsing
 CREATE FUNCTION functest_S_15(x int) RETURNS boolean
diff --git a/src/backend/commands/functioncmds.c b/src/backend/commands/functioncmds.c
index e7cb5c65e9..73f9fe1f70 100644
--- a/src/backend/commands/functioncmds.c
+++ b/src/backend/commands/functioncmds.c
@@ -964,13 +964,13 @@ interpret_AS_clause(Oid languageOid, const char *languageName,
         }

         /*
-         * We must put something in prosrc.  For the moment, just record an
-         * empty string.  It might be useful to store the original text of the
-         * CREATE FUNCTION statement --- but to make actual use of that in
-         * error reports, we'd also have to adjust readfuncs.c to not throw
-         * away node location fields when reading prosqlbody.
+         * Use the text of the CREATE FUNCTION statement as prosrc.  Note that
+         * this might be more text than strictly necessary, if CREATE FUNCTION
+         * is part of a multi-statement string; but trimming it would require
+         * adjusting the location fields in prosqlbody, which seems like way
+         * more trouble than the case is worth.
          */
-        *prosrc_str_p = pstrdup("");
+        *prosrc_str_p = pstrdup(queryString);

         /* But we definitely don't need probin. */
         *probin_str_p = NULL;

Re: SQL-standard function body

From
Andrew Dunstan
Date:
On 4/9/21 12:09 PM, Tom Lane wrote:
> One could make an argument, therefore, for holding off 0003 until
> there's more support for execution-time error cursors.  I don't
> think we should though, for two reasons:
> 1. It'd be better to keep the pg_proc representation of new-style
> SQL functions stable across versions.
> 2. Storing the CREATE text means we'll capture comments associated
> with the function text, which is something that at least some
> people will complain about the loss of.  Admittedly we have no way
> to re-integrate the comments into the de-parsed body, but some
> folks might be satisfied with grabbing the prosrc text.
>


+many for storing query text.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com




Re: SQL-standard function body

From
Noah Misch
Date:
On Fri, Apr 09, 2021 at 12:09:43PM -0400, Tom Lane wrote:
> Finally, 0003 might be a bit controversial: it changes the stored
> prosrc for new-style SQL functions to be the query text of the CREATE
> FUNCTION command.  The main argument I can see being made against this
> is that it'll bloat the pg_proc entry.  But I think that that's
> not a terribly reasonable concern

Such storage cost should be acceptable, but ...

> The real value of 0003 of course would be to get an error cursor at
> runtime

A key benefit of $SUBJECT is the function body following DDL renames:

create table foo ();
insert into foo default values;
create function count_it() returns int begin atomic return (select count(*) from foo); end;
select count_it();
insert into foo default values;
alter table foo rename to some_new_long_table_name;
select count_it();  -- still works

After the rename, any stored prosrc is obsolete.  To show accurate error
cursors, deparse prosqlbody and use that in place of prosrc.



Re: SQL-standard function body

From
Tom Lane
Date:
Noah Misch <noah@leadboat.com> writes:
> On Fri, Apr 09, 2021 at 12:09:43PM -0400, Tom Lane wrote:
>> The real value of 0003 of course would be to get an error cursor at
>> runtime

> A key benefit of $SUBJECT is the function body following DDL renames:

Agreed.  But ...

> After the rename, any stored prosrc is obsolete.  To show accurate error
> cursors, deparse prosqlbody and use that in place of prosrc.

... I'm not sure this conclusion follows.  There are two problems with it:

1. I don't see an acceptably low-overhead way to mechanize it.
Deparsing prosqlbody is unlikely to be safe in a post-error transaction,
but surely we'd not want to expend that cost in advance on every use
of a SQL function.  Even ignoring that, the act of deparsing would not
in itself tell you what offset to use.  Should we deparse and then
re-parse to get a new node tree with corrected token locations?

2. The reason we can get away with showing a fragment of a large query
(or function body) in an error message is that the user is supposed to
be able to correlate the display with what she wrote.  That assumption
falls to the ground if the display is based on a deconstruction that is
virtually certain to have line breaks in different places, not to mention
that the details of what is shown may be substantially different from the
original.

Still, I take your point that the original text may be less useful
for this purpose than I was supposing.

            regards, tom lane



Re: SQL-standard function body

From
Noah Misch
Date:
On Sat, Apr 10, 2021 at 10:52:15AM -0400, Tom Lane wrote:
> Noah Misch <noah@leadboat.com> writes:
> > On Fri, Apr 09, 2021 at 12:09:43PM -0400, Tom Lane wrote:
> >> The real value of 0003 of course would be to get an error cursor at
> >> runtime
> 
> > A key benefit of $SUBJECT is the function body following DDL renames:
> 
> Agreed.  But ...
> 
> > After the rename, any stored prosrc is obsolete.  To show accurate error
> > cursors, deparse prosqlbody and use that in place of prosrc.
> 
> ... I'm not sure this conclusion follows.  There are two problems with it:
> 
> 1. I don't see an acceptably low-overhead way to mechanize it.
> Deparsing prosqlbody is unlikely to be safe in a post-error transaction,
> but surely we'd not want to expend that cost in advance on every use
> of a SQL function.  Even ignoring that, the act of deparsing would not
> in itself tell you what offset to use.  Should we deparse and then
> re-parse to get a new node tree with corrected token locations?

If you really want those error cursors, yes.  (I feel we can continue to live
without them; their absence is no more important than it was ten years ago.)
One can envision several ways to cache that high-overhead work.  Otherwise,
the usual PostgreSQL answer would be to omit an error cursor, not show one
that reflects an obsolete sense of the function.

If the original CREATE FUNCTION query text were so valuable, I'd be arguing to
preserve it across dump/reload.

> 2. The reason we can get away with showing a fragment of a large query
> (or function body) in an error message is that the user is supposed to
> be able to correlate the display with what she wrote.  That assumption
> falls to the ground if the display is based on a deconstruction that is
> virtually certain to have line breaks in different places, not to mention
> that the details of what is shown may be substantially different from the
> original.

Preferences on this matter will be situation-dependent.  If I do CREATE
FUNCTION f() ...; SELECT f() all in one sitting, then it's fine for an error
in the SELECT to show the function I wrote.  If I'm calling a function defined
years ago, I'm likely to compare the error report to "\sf foo" and not likely
to compare it to a years-old record of the SQL statement.  I think it's fine
to expect users to consult "\sf foo" when the user is in doubt.

> Still, I take your point that the original text may be less useful
> for this purpose than I was supposing.



Re: SQL-standard function body

From
Tom Lane
Date:
Based on the discussion so far, I've committed 0001 and 0002 but not 0003,
and marked this open issue as closed.

            regards, tom lane



Re: SQL-standard function body

From
Noah Misch
Date:
On Tue, Jun 30, 2020 at 02:51:38PM -0400, Tom Lane wrote:
> The point remains that exposing the function body's dependencies will
> constrain restore order far more than we are accustomed to see.  It
> might be possible to build examples that flat out can't be restored,
> even granting that we teach pg_dump how to break dependency loops
> by first creating the function with empty body and later redefining
> it with the real body.  (Admittedly, if that's possible then you
> likely could make it happen with views too.  But somehow it seems
> more likely that people would create spaghetti dependencies for
> functions than views.)

Should we be okay releasing v14 without support for breaking function
dependency loops, or does that call for an open item?

-- example
create function f() returns int language sql return 1;
create function g() returns int language sql return f();
create or replace function f() returns int language sql return coalesce(2, g());

-- but when a view can break the cycle, pg_dump still does so
create view v as select 1 as c;
create function f() returns int language sql return coalesce(0, (select count(*) from v));
create or replace view v as select f() as c;



Re: SQL-standard function body

From
Tom Lane
Date:
Noah Misch <noah@leadboat.com> writes:
> Should we be okay releasing v14 without support for breaking function
> dependency loops, or does that call for an open item?

Oh!  That should definitely be an open item.  It doesn't seem
that hard to do something similar to what we do for views,
i.e. create a dummy function and replace it later.

            regards, tom lane



Re: SQL-standard function body

From
Justin Pryzby
Date:
On Sun, Apr 18, 2021 at 03:08:44PM -0400, Tom Lane wrote:
> Noah Misch <noah@leadboat.com> writes:
> > Should we be okay releasing v14 without support for breaking function
> > dependency loops, or does that call for an open item?
> 
> Oh!  That should definitely be an open item.  It doesn't seem
> that hard to do something similar to what we do for views,
> i.e. create a dummy function and replace it later.

I added
https://wiki.postgresql.org/index.php?title=PostgreSQL_14_Open_Items&type=revision&diff=35926&oldid=35925



Re: SQL-standard function body

From
Tom Lane
Date:
... BTW, a dependency loop is also possible without using this feature,
by abusing default-value expressions:

create function f1(x int, y int) returns int language sql
as 'select $1 + $2';
create function f2(x int, y int default f1(1,2)) returns int language sql
as 'select $1 + $2';
create or replace function f1(x int, y int default f2(11,12)) returns int language sql
as 'select $1 + $2';

The actual use-case for that seems pretty thin, so we never bothered
to worry about it before.  But if we're going to build loop-breaking
logic to handle function body dependencies, it should deal with this
too.  I think that all that's required is for the initial dummy
function declaration to omit defaults as well as providing a dummy
body.

            regards, tom lane



Re: SQL-standard function body

From
Jeff Janes
Date:
On Wed, Apr 7, 2021 at 3:55 PM Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote:

Committed.  Thanks!


This commit break line continuation prompts for unbalanced parentheses in the psql binary.  Skimming through this thread, I don't see that this is intentional or has been noticed before.

with psql -X

Before:

jjanes=# asdf (
jjanes(# 

Now:

jjanes=# asdf (
jjanes-# 

I've looked through the parts of the commit that change psql, but didn't see an obvious culprit.

Cheers,

Jeff

Re: SQL-standard function body

From
Peter Eisentraut
Date:
On 18.04.21 23:33, Tom Lane wrote:
> ... BTW, a dependency loop is also possible without using this feature,
> by abusing default-value expressions:
> 
> create function f1(x int, y int) returns int language sql
> as 'select $1 + $2';
> create function f2(x int, y int default f1(1,2)) returns int language sql
> as 'select $1 + $2';
> create or replace function f1(x int, y int default f2(11,12)) returns int language sql
> as 'select $1 + $2';
> 
> The actual use-case for that seems pretty thin, so we never bothered
> to worry about it before.  But if we're going to build loop-breaking
> logic to handle function body dependencies, it should deal with this
> too.  I think that all that's required is for the initial dummy
> function declaration to omit defaults as well as providing a dummy
> body.

I have studied this a bit.  I'm not sure where the dummy function 
declaration should be created.  The current dependency-breaking logic in 
pg_dump_sort.c doesn't appear to support injecting additional objects 
into the set of dumpable objects.  So we would need to create it perhaps 
in dumpFunc() and then later set flags that indicate whether it will be 
required.

Another option would be that we disallow this at creation time.  It 
seems we could detect dependency loops using findDependentObjects(), so 
this might not be so difficult.  The use case for recursive SQL 
functions is probably low, at least with the current limited set of 
control flow options in SQL.  (And you can always use a quoted body to 
work around it.)




Re: SQL-standard function body

From
Tom Lane
Date:
Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes:
> On 18.04.21 23:33, Tom Lane wrote:
>> The actual use-case for that seems pretty thin, so we never bothered
>> to worry about it before.  But if we're going to build loop-breaking
>> logic to handle function body dependencies, it should deal with this
>> too.  I think that all that's required is for the initial dummy
>> function declaration to omit defaults as well as providing a dummy
>> body.

> I have studied this a bit.  I'm not sure where the dummy function 
> declaration should be created.  The current dependency-breaking logic in 
> pg_dump_sort.c doesn't appear to support injecting additional objects 
> into the set of dumpable objects.  So we would need to create it perhaps 
> in dumpFunc() and then later set flags that indicate whether it will be 
> required.

Hmm, good point.  The existing code that breaks loops involving views
depends on the fact that the view relation and the view's ON SELECT
rule are already treated as distinct objects within pg_dump.  So we
just need to mark the rule object to indicate whether to emit it or
not.  To make it work for functions, there would have to be a secondary
object representing the function body (and the default expressions,
I guess).

That's kind of a lot of complication, and inefficiency, for a corner case
that may never arise in practice.  We've ignored the risk for default
expressions, and AFAIR have yet to receive any field complaints about it.
So maybe it's okay to do the same for SQL-style function bodies, at least
for now.

> Another option would be that we disallow this at creation time.

Don't like that one much.  The backend shouldn't be in the business
of rejecting valid commands just because pg_dump might be unable
to cope later.

            regards, tom lane



Re: SQL-standard function body

From
Peter Eisentraut
Date:
On 27.04.21 04:44, Justin Pryzby wrote:
> On Thu, Apr 22, 2021 at 04:04:18PM -0400, Jeff Janes wrote:
>> This commit break line continuation prompts for unbalanced parentheses in
>> the psql binary.  Skimming through this thread, I don't see that this is
>> intentional or has been noticed before.
>>
>> with psql -X
>>
>> Before:
>>
>> jjanes=# asdf (
>> jjanes(#
>>
>> Now:
>>
>> jjanes=# asdf (
>> jjanes-#
>>
>> I've looked through the parts of the commit that change psql, but didn't
>> see an obvious culprit.
> 
> I haven't studied it in detail, but it probably needs something like this.

Yeah, fixed like that.



Re: SQL-standard function body

From
Peter Eisentraut
Date:
On 27.04.21 18:16, Tom Lane wrote:
> That's kind of a lot of complication, and inefficiency, for a corner case
> that may never arise in practice.  We've ignored the risk for default
> expressions, and AFAIR have yet to receive any field complaints about it.
> So maybe it's okay to do the same for SQL-style function bodies, at least
> for now.
> 
>> Another option would be that we disallow this at creation time.
> 
> Don't like that one much.  The backend shouldn't be in the business
> of rejecting valid commands just because pg_dump might be unable
> to cope later.

Since this is listed as an open item, I want to clarify that I'm 
currently not planning to work on this, based on this discussion. 
Certainly something to look into sometime later, but it's not in my 
plans right now.



Re: SQL-standard function body

From
Tom Lane
Date:
Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes:
> On 27.04.21 18:16, Tom Lane wrote:
>> That's kind of a lot of complication, and inefficiency, for a corner case
>> that may never arise in practice.  We've ignored the risk for default
>> expressions, and AFAIR have yet to receive any field complaints about it.
>> So maybe it's okay to do the same for SQL-style function bodies, at least
>> for now.

>>> Another option would be that we disallow this at creation time.

>> Don't like that one much.  The backend shouldn't be in the business
>> of rejecting valid commands just because pg_dump might be unable
>> to cope later.

> Since this is listed as an open item, I want to clarify that I'm 
> currently not planning to work on this, based on this discussion. 
> Certainly something to look into sometime later, but it's not in my 
> plans right now.

Right, I concur with moving it to the "won't fix" category.

            regards, tom lane



Re: SQL-standard function body

From
Noah Misch
Date:
On Mon, May 10, 2021 at 11:09:43AM -0400, Tom Lane wrote:
> Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes:
> > On 27.04.21 18:16, Tom Lane wrote:
> >> That's kind of a lot of complication, and inefficiency, for a corner case
> >> that may never arise in practice.  We've ignored the risk for default
> >> expressions, and AFAIR have yet to receive any field complaints about it.
> >> So maybe it's okay to do the same for SQL-style function bodies, at least
> >> for now.
> 
> >>> Another option would be that we disallow this at creation time.
> 
> >> Don't like that one much.  The backend shouldn't be in the business
> >> of rejecting valid commands just because pg_dump might be unable
> >> to cope later.
> 
> > Since this is listed as an open item, I want to clarify that I'm 
> > currently not planning to work on this, based on this discussion. 
> > Certainly something to look into sometime later, but it's not in my 
> > plans right now.
> 
> Right, I concur with moving it to the "won't fix" category.

Works for me.



Re: SQL-standard function body

From
Noah Misch
Date:
On Wed, Apr 07, 2021 at 09:55:40PM +0200, Peter Eisentraut wrote:
> Committed.  Thanks!

I get a NULL pointer dereference if the function body has a doubled semicolon:

  create function f() returns int language sql begin atomic select 1;; end;

Program received signal SIGSEGV, Segmentation fault.
transformStmt (pstate=pstate@entry=0x2623978, parseTree=parseTree@entry=0x0) at analyze.c:297
297             switch (nodeTag(parseTree))
#0  transformStmt (pstate=pstate@entry=0x2623978, parseTree=parseTree@entry=0x0) at analyze.c:297
#1  0x00000000006132a4 in interpret_AS_clause (queryString=<optimized out>, sql_body_out=<synthetic pointer>,
probin_str_p=<syntheticpointer>, prosrc_str_p=<synthetic pointer>, inParameterNames=<optimized out>,
parameterTypes=<optimizedout>,
 
    sql_body_in=<optimized out>, as=<optimized out>, funcname=<optimized out>, languageName=<optimized out>,
languageOid=14)at functioncmds.c:937
 
#2  CreateFunction (pstate=pstate@entry=0x26213e0, stmt=stmt@entry=0x25fd048) at functioncmds.c:1227
#3  0x0000000000813e23 in ProcessUtilitySlow (pstate=pstate@entry=0x26213e0, pstmt=pstmt@entry=0x25fd3b8,
queryString=queryString@entry=0x25fc040"create function f() returns int language sql begin atomic select 1;; end;",
 
    context=context@entry=PROCESS_UTILITY_TOPLEVEL, params=params@entry=0x0, queryEnv=queryEnv@entry=0x0,
qc=qc@entry=0x7fff4b715b70,dest=0x25fd4a8) at utility.c:1607
 
#4  0x0000000000812944 in standard_ProcessUtility (pstmt=0x25fd3b8, queryString=0x25fc040 "create function f() returns
intlanguage sql begin atomic select 1;; end;", context=PROCESS_UTILITY_TOPLEVEL, params=0x0, queryEnv=0x0,
dest=0x25fd4a8,
    qc=0x7fff4b715b70) at utility.c:1034
#5  0x0000000000810efe in PortalRunUtility (portal=portal@entry=0x265fb60, pstmt=0x25fd3b8,
isTopLevel=isTopLevel@entry=true,setHoldSnapshot=setHoldSnapshot@entry=false, dest=0x25fd4a8, qc=0x7fff4b715b70) at
pquery.c:1147
#6  0x0000000000811053 in PortalRunMulti (portal=portal@entry=0x265fb60, isTopLevel=isTopLevel@entry=true,
setHoldSnapshot=setHoldSnapshot@entry=false,dest=dest@entry=0x25fd4a8, altdest=altdest@entry=0x25fd4a8,
qc=qc@entry=0x7fff4b715b70)at pquery.c:1310
 
#7  0x00000000008115e4 in PortalRun (portal=portal@entry=0x265fb60, count=count@entry=9223372036854775807,
isTopLevel=isTopLevel@entry=true,run_once=run_once@entry=true, dest=dest@entry=0x25fd4a8,
altdest=altdest@entry=0x25fd4a8,qc=qc@entry=0x7fff4b715b70)
 
    at pquery.c:786
#8  0x000000000080d004 in exec_simple_query (query_string=0x25fc040 "create function f() returns int language sql begin
atomicselect 1;; end;") at postgres.c:1214
 
#9  0x000000000080ee1f in PostgresMain (argc=argc@entry=1, argv=argv@entry=0x7fff4b716030, dbname=0x2627788 "test",
username=<optimizedout>) at postgres.c:4486
 
#10 0x000000000048bc97 in BackendRun (port=<optimized out>, port=<optimized out>) at postmaster.c:4507
#11 BackendStartup (port=0x261f480) at postmaster.c:4229
#12 ServerLoop () at postmaster.c:1745
#13 0x000000000077c278 in PostmasterMain (argc=argc@entry=1, argv=argv@entry=0x25f6a00) at postmaster.c:1417
#14 0x000000000048d51e in main (argc=1, argv=0x25f6a00) at main.c:209
(gdb) p parseTree
$1 = (Node *) 0x0



Re: SQL-standard function body

From
Julien Rouhaud
Date:
On Sat, Jun 05, 2021 at 09:44:18PM -0700, Noah Misch wrote:
> On Wed, Apr 07, 2021 at 09:55:40PM +0200, Peter Eisentraut wrote:
> > Committed.  Thanks!
> 
> I get a NULL pointer dereference if the function body has a doubled semicolon:
> 
>   create function f() returns int language sql begin atomic select 1;; end;

You don't even need a statements to reproduce the problem, a body containing
only semi-colon(s) will behave the same.

Attached patch should fix the problem.

Attachment

Re: SQL-standard function body

From
Peter Eisentraut
Date:
On 06.06.21 09:32, Julien Rouhaud wrote:
> On Sat, Jun 05, 2021 at 09:44:18PM -0700, Noah Misch wrote:
>> I get a NULL pointer dereference if the function body has a doubled semicolon:
>>
>>    create function f() returns int language sql begin atomic select 1;; end;
> 
> You don't even need a statements to reproduce the problem, a body containing
> only semi-colon(s) will behave the same.
> 
> Attached patch should fix the problem.

Your patch filters out empty statements at the parse transformation 
phase, so they are no longer present when you dump the body back out. 
So your edits in the test expected files don't fit.

I suggest we just prohibit empty statements at the parse stage.  I don't 
see a strong reason to allow them, and if we wanted to, we'd have to do 
more work, e.g., in ruleutils.c to print them back out correctly.

Attachment

Re: SQL-standard function body

From
Julien Rouhaud
Date:
On Mon, Jun 7, 2021 at 4:52 PM Peter Eisentraut
<peter.eisentraut@enterprisedb.com> wrote:
>
> Your patch filters out empty statements at the parse transformation
> phase, so they are no longer present when you dump the body back out.
> So your edits in the test expected files don't fit.

Oh, somehow the tests aren't failing here, I'm not sure what I did wrong.

> I suggest we just prohibit empty statements at the parse stage.  I don't
> see a strong reason to allow them, and if we wanted to, we'd have to do
> more work, e.g., in ruleutils.c to print them back out correctly.

I always thought extraneous semicolons were tokens to be ignored,
which happens to be internally implemented as empty statements, so
deparsing them is not required, similar to deparsing extraneous
whitespaces.  If the spec says otherwise then I agree it's not worth
implementing, but otherwise I'm not sure if it's really helpful to
error out.



Re: SQL-standard function body

From
Tom Lane
Date:
Julien Rouhaud <rjuju123@gmail.com> writes:
> On Mon, Jun 7, 2021 at 4:52 PM Peter Eisentraut
> <peter.eisentraut@enterprisedb.com> wrote:
>> Your patch filters out empty statements at the parse transformation
>> phase, so they are no longer present when you dump the body back out.
>> So your edits in the test expected files don't fit.

> Oh, somehow the tests aren't failing here, I'm not sure what I did wrong.

Modulo getting the tests right ...

>> I suggest we just prohibit empty statements at the parse stage.  I don't
>> see a strong reason to allow them, and if we wanted to, we'd have to do
>> more work, e.g., in ruleutils.c to print them back out correctly.

> I always thought extraneous semicolons were tokens to be ignored,

... I tend to agree with Julien's position here.  It seems really ugly
to prohibit empty statements just for implementation convenience.
However, the way I'd handle it is to have the grammar remove them,
which is what it does in other contexts.  I don't think there's any
need to preserve them in ruleutils output --- there's a lot of other
normalization we do on the way to that, and this seems to fit in.

BTW, is it just me, or does SQL:2021 fail to permit multiple
statements in a procedure at all?  After much searching, I found the
BEGIN ATOMIC ... END syntax, but it's in <triggered SQL statement>,
in other words the body of a trigger not a procedure.  I cannot find
any production that connects a <routine body> to that.  There's an
example showing use of BEGIN ATOMIC as a procedure statement, so
they clearly *meant* to allow it, but it looks like somebody messed
up the grammar.

            regards, tom lane



Re: SQL-standard function body

From
Julien Rouhaud
Date:
On Mon, Jun 7, 2021 at 11:27 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Julien Rouhaud <rjuju123@gmail.com> writes:
> > On Mon, Jun 7, 2021 at 4:52 PM Peter Eisentraut
> > <peter.eisentraut@enterprisedb.com> wrote:
> >> Your patch filters out empty statements at the parse transformation
> >> phase, so they are no longer present when you dump the body back out.
> >> So your edits in the test expected files don't fit.
>
> > Oh, somehow the tests aren't failing here, I'm not sure what I did wrong.
>
> Modulo getting the tests right ...

I can certainly accept that my patch broke the tests, but I just ran
another make check-world and it passed without any problem.  What am I
missing?



Re: SQL-standard function body

From
Tom Lane
Date:
I wrote:
> ... I tend to agree with Julien's position here.  It seems really ugly
> to prohibit empty statements just for implementation convenience.
> However, the way I'd handle it is to have the grammar remove them,
> which is what it does in other contexts.

Concretely, I think the right fix is per attached.

Like Julien, I don't see any additional change in regression test outputs.
Maybe Peter thinks there should be some?  But I think the reverse-listing
we get for functest_S_3a is fine.

            regards, tom lane

diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 9ee90e3f13..52a254928f 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -7990,7 +7990,11 @@ opt_routine_body:
 routine_body_stmt_list:
             routine_body_stmt_list routine_body_stmt ';'
                 {
-                    $$ = lappend($1, $2);
+                    /* As in stmtmulti, discard empty statements */
+                    if ($2 != NULL)
+                        $$ = lappend($1, $2);
+                    else
+                        $$ = $1;
                 }
             | /*EMPTY*/
                 {
diff --git a/src/test/regress/expected/create_function_3.out b/src/test/regress/expected/create_function_3.out
index 5b6bc5eddb..5955859bb5 100644
--- a/src/test/regress/expected/create_function_3.out
+++ b/src/test/regress/expected/create_function_3.out
@@ -267,7 +267,7 @@ CREATE FUNCTION functest_S_3() RETURNS boolean
     RETURN false;
 CREATE FUNCTION functest_S_3a() RETURNS boolean
     BEGIN ATOMIC
-        RETURN false;
+        ;;RETURN false;;
     END;
 CREATE FUNCTION functest_S_10(a text, b date) RETURNS boolean
     LANGUAGE SQL
diff --git a/src/test/regress/sql/create_function_3.sql b/src/test/regress/sql/create_function_3.sql
index 4b778999ed..6e8b838ff2 100644
--- a/src/test/regress/sql/create_function_3.sql
+++ b/src/test/regress/sql/create_function_3.sql
@@ -165,7 +165,7 @@ CREATE FUNCTION functest_S_3() RETURNS boolean
     RETURN false;
 CREATE FUNCTION functest_S_3a() RETURNS boolean
     BEGIN ATOMIC
-        RETURN false;
+        ;;RETURN false;;
     END;

 CREATE FUNCTION functest_S_10(a text, b date) RETURNS boolean

Re: SQL-standard function body

From
Peter Eisentraut
Date:
On 07.06.21 17:27, Tom Lane wrote:
> ... I tend to agree with Julien's position here.  It seems really ugly
> to prohibit empty statements just for implementation convenience.
> However, the way I'd handle it is to have the grammar remove them,
> which is what it does in other contexts.  I don't think there's any
> need to preserve them in ruleutils output --- there's a lot of other
> normalization we do on the way to that, and this seems to fit in.

Ok, if that's what people prefer.

> BTW, is it just me, or does SQL:2021 fail to permit multiple
> statements in a procedure at all?  After much searching, I found the
> BEGIN ATOMIC ... END syntax, but it's in <triggered SQL statement>,
> in other words the body of a trigger not a procedure.  I cannot find
> any production that connects a <routine body> to that.  There's an
> example showing use of BEGIN ATOMIC as a procedure statement, so
> they clearly*meant*  to allow it, but it looks like somebody messed
> up the grammar.

It's in the SQL/PSM part.



Re: SQL-standard function body

From
Julien Rouhaud
Date:
On Mon, Jun 07, 2021 at 03:24:33PM -0400, Tom Lane wrote:
> 
> Concretely, I think the right fix is per attached.

+1, I agree that this approach is better.



Re: SQL-standard function body

From
Tom Lane
Date:
Julien Rouhaud <rjuju123@gmail.com> writes:
> On Mon, Jun 07, 2021 at 03:24:33PM -0400, Tom Lane wrote:
>> Concretely, I think the right fix is per attached.

> +1, I agree that this approach is better.

Pushed.

            regards, tom lane