Thread: [HACKERS] merging some features from plpgsql2 project

[HACKERS] merging some features from plpgsql2 project

From

Pavel Stehule

Date:

27 December 2016, 10:54:27

I reread ideas described on page https://github.com/trustly/plpgsql2

Some points are well and can be benefit for PlpgSQL.

First I describe my initial position. I am strongly against introduction "new" language - plpgsql2 or new plpgsql, or any else. The trust of developers to us is important and introduction of any not compatible or different feature has to have really big reason. PostgreSQL is conservative environment, and PLpgSQL should not be a exception. More - I have not any information from my customers, colleagues about missing features in this language. If there is some gaps, then it is in outer environment - IDE, deployment, testing,

I understand so we have to do some break compatibility changes, but the changes should be smoothly with possibility to simply identify necessary changes, but better don't do it - and use other possibility.

i lost hope so plpgsql_check can be integrated to core. It is living outside well - only preparing dll for MSWindows is annoyance. But some checks from plpgsql_check can be implemented in core as extra_checks, and some checks from plpgsql2 can be implemented in plpgsql_check.

Points from plpgsql2:
* SELECT .. INTO vs. TOO_MANY_ROWS - can be implemented as extra check
* SELECT .. INTO and the number of result columns - good extra check too

* EXECUTE and FOUND - this is incompatible change, extra check can be used (test on unset variable). I see solution in leaving FOUND variable and introduction of some new without this issue - ROW_COUNT maybe (this is another possible incompatible change, but with higher benefit - maybe we can introduce some aliasing, PRAGMA clause, default PRAGMAs, ..).

* SELECT .. INTO and := - looks bizarre, but I see clean benefit and I can accept it

* The OUT namespace and OUT parameter visibility - I don't like it - not in this form - can we introduce some form of namespace aliasing? The arguments are in function name named namespace already.

Now, we doesn't use plpgsql_extra_check much and it is pity.

I thing so real question is support some macros, that can help with code maintenance for different PostgreSQL versions. PostGIS code is nice example what we are missing.

Regards

Pavel

Re: [HACKERS] merging some features from plpgsql2 project

From

Pavel Stehule

Date:

27 December 2016, 10:58:47

2016-12-27 8:54 GMT+01:00 Pavel Stehule <pavel.stehule@gmail.com>:

Hi

I reread ideas described on page https://github.com/trustly/plpgsql2

Some points are well and can be benefit for PlpgSQL.

First I describe my initial position. I am strongly against introduction "new" language - plpgsql2 or new plpgsql, or any else. The trust of developers to us is important and introduction of any not compatible or different feature has to have really big reason. PostgreSQL is conservative environment, and PLpgSQL should not be a exception. More - I have not any information from my customers, colleagues about missing features in this language. If there is some gaps, then it is in outer environment - IDE, deployment, testing,

I forgot - the big plpgsql issue are too weak expressions on left part of assignment statements.

Pavel

Re: [HACKERS] merging some features from plpgsql2 project

From

Merlin Moncure

Date:

28 December 2016, 01:56:43

On Tue, Dec 27, 2016 at 1:54 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
> Hi
>
> I reread ideas described on page https://github.com/trustly/plpgsql2
>
> Some points are well and can be benefit for PlpgSQL.
>
> First I describe my initial position. I am strongly against introduction
> "new" language - plpgsql2 or new plpgsql, or any else. The trust of
> developers to us is important and introduction of any not compatible or
> different feature has to have really big reason. PostgreSQL is conservative
> environment, and PLpgSQL should not be a exception. More - I have not any
> information from my customers, colleagues about missing features in this
> language.  If there is some gaps, then it is in outer environment - IDE,
> deployment, testing,

Breaking language compatibility is a really big deal.  There has to be
a lot of benefits to the effort and you have to make translation from
plpgsql1 to plpgsql2 really simple.  You have made some good points on
the rationale but not nearly enough to justify implementation fork. So
basically I agree.  Having said that, If you don't mind I'd like to
run with the topic (which I'm loosely interpreting as, "Things I'd
like to do in SQL/PLPGSQL and can't").

#1 problem with plpgsql in my point of view is that the language and
grammar are not supersets of sql.  A lot of PLPGSQL keywords (EXECUTE,
BEGIN, INTO, END) have incompatible meanings with our SQL
implementation.  IMNSHO, SQL ought to give the same behavior inside or
outside of plpgsql.  It doesn't, and this is one of the reasons why
plpgsql may not be a good candidate for stored procedure
implementation.

#2 problem with plpgsql is after function entry it's too late to do
things like set transaction isolation level and change certain kinds
of variables (like statement_timeout).  This is very obnoxious, I
can't wrap the database in an API 100%; the application has to manage
things that really should be controlled in SQL.

#3 problem with plpgsql is complete lack of inlining.  inlining
function calls in postgres is a black art even for very trivial cases.
This makes it hard for us to write quick things and in the worst case
causes endless duplications of simple expressions.

In short I guess the issue is that we don't have stored procedures and
I don't see an easy path to getting there with the current language.
There are a lot of other little annoyances but most of them can be
solved without a compatibility break.

It would be pretty neat if postgres SQL implementation could directly
incorporate limited flow control and command execution.  For example,
CREATE my_proc(Done OUT BOOL) RETURNS BOOL AS
$$ BEGIN; SET transaction_isolation = 'serializable'; SELECT some_plpgsql_func_returning_bool(); COMMIT;
$$;
CALL my_proc() UNTIL Done;

Key points here are:
*) my_proc is in native SQL (not plpgsql), and run outside of snapshot
*) CALL is invocation into stored procedure.  I extended it in similar
fashion as pl/sql CALL
(https://docs.oracle.com/cd/B19306_01/server.102/b14200/statements_4008.htm)
but anything will do for syntaxs as long as you get arbitrary control
of procedure lifetime external to snapshot and transaction
*) simple addition of UNTIL gets us out of the debate for best 'stored
procedure language'.   Keeping things to pure SQL really simplifies
things since we already have statement parsing at tcop level.  We just
need some special handling for CALL.
*) In my usage of plpgsql maybe 80% of database cases are covered
purely in language but maybe 20% of cases need support from
application typically where threading and transaction management is
involved.  With the above it would be more like 95% would be covered
and if you extended CALL to something like:

CALL my_proc() IN BACKGROUND UNTIL Done;

..where "IN BACKGOUND" moved execution to a background worker one
could do just about everything in SQL in tasks that do nothing but
read and write to the database that today need significant support
from outside language (primarily bash for me).

With respect to stuff you mentioned, like smarter handling of INTO,
are you really sure you need to break compatibility for that?

merlin

Re: [HACKERS] merging some features from plpgsql2 project

From

Pavel Stehule

Date:

28 December 2016, 02:31:54

2016-12-27 23:56 GMT+01:00 Merlin Moncure <mmoncure@gmail.com>:

On Tue, Dec 27, 2016 at 1:54 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
> Hi
>
> I reread ideas described on page https://github.com/trustly/plpgsql2
>
> Some points are well and can be benefit for PlpgSQL.
>
> First I describe my initial position. I am strongly against introduction
> "new" language - plpgsql2 or new plpgsql, or any else. The trust of
> developers to us is important and introduction of any not compatible or
> different feature has to have really big reason. PostgreSQL is conservative
> environment, and PLpgSQL should not be a exception. More - I have not any
> information from my customers, colleagues about missing features in this
> language. If there is some gaps, then it is in outer environment - IDE,
> deployment, testing,

Breaking language compatibility is a really big deal. There has to be
a lot of benefits to the effort and you have to make translation from
plpgsql1 to plpgsql2 really simple. You have made some good points on
the rationale but not nearly enough to justify implementation fork. So
basically I agree. Having said that, If you don't mind I'd like to
run with the topic (which I'm loosely interpreting as, "Things I'd
like to do in SQL/PLPGSQL and can't").

#1 problem with plpgsql in my point of view is that the language and
grammar are not supersets of sql. A lot of PLPGSQL keywords (EXECUTE,
BEGIN, INTO, END) have incompatible meanings with our SQL
implementation. IMNSHO, SQL ought to give the same behavior inside or
outside of plpgsql. It doesn't, and this is one of the reasons why
plpgsql may not be a good candidate for stored procedure
implementation.

There is little bit cleaner language for this purpose - SQL/PSM. But it is hard to switch main language without big lost of reputation. I am not sure about benefit.

#2 problem with plpgsql is after function entry it's too late to do
things like set transaction isolation level and change certain kinds
of variables (like statement_timeout). This is very obnoxious, I
can't wrap the database in an API 100%; the application has to manage
things that really should be controlled in SQL.

It is long story about implementation procedures - it is not related to PLpgSQL - the language is not a issue.

#3 problem with plpgsql is complete lack of inlining. inlining
function calls in postgres is a black art even for very trivial cases.
This makes it hard for us to write quick things and in the worst case
causes endless duplications of simple expressions.

In short I guess the issue is that we don't have stored procedures and
I don't see an easy path to getting there with the current language.
There are a lot of other little annoyances but most of them can be
solved without a compatibility break.

I don't think so implementation of procedures will be simple, but I don't see any issue in PLpgSQL.

It would be pretty neat if postgres SQL implementation could directly
incorporate limited flow control and command execution. For example,
CREATE my_proc(Done OUT BOOL) RETURNS BOOL AS
$$
BEGIN;
SET transaction_isolation = 'serializable';
SELECT some_plpgsql_func_returning_bool();
COMMIT;
$$;
CALL my_proc() UNTIL Done;

Key points here are:
*) my_proc is in native SQL (not plpgsql), and run outside of snapshot
*) CALL is invocation into stored procedure. I extended it in similar
fashion as pl/sql CALL
(https://docs.oracle.com/cd/B19306_01/server.102/b14200/statements_4008.htm)
but anything will do for syntaxs as long as you get arbitrary control
of procedure lifetime external to snapshot and transaction
*) simple addition of UNTIL gets us out of the debate for best 'stored
procedure language'. Keeping things to pure SQL really simplifies
things since we already have statement parsing at tcop level. We just
need some special handling for CALL.
*) In my usage of plpgsql maybe 80% of database cases are covered
purely in language but maybe 20% of cases need support from
application typically where threading and transaction management is
involved. With the above it would be more like 95% would be covered
and if you extended CALL to something like:

It is similar to my older proposals of stored procedures.

CALL my_proc() IN BACKGROUND UNTIL Done;

..where "IN BACKGOUND" moved execution to a background worker one
could do just about everything in SQL in tasks that do nothing but
read and write to the database that today need significant support
from outside language (primarily bash for me).

With respect to stuff you mentioned, like smarter handling of INTO,
are you really sure you need to break compatibility for that?

I didn't propose any compatibility break.

Can we talk about another proposals separately, please. Stored procedures, batch processing, different language are different topic.

Regards

Pavel

merlin

Re: [HACKERS] merging some features from plpgsql2 project

From

Jim Nasby

Date:

28 December 2016, 07:09:53

On 12/27/16 4:56 PM, Merlin Moncure wrote:
> On Tue, Dec 27, 2016 at 1:54 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>> First I describe my initial position. I am strongly against introduction
>> "new" language - plpgsql2 or new plpgsql, or any else. The trust of
>> developers to us is important and introduction of any not compatible or
>> different feature has to have really big reason. PostgreSQL is conservative
>> environment, and PLpgSQL should not be a exception. More - I have not any

Which is why this is an external fork of plpgsql.

** The real problem is that we have no mechanism for allowing a PL's 
language/syntax/API to move forward without massive backwards 
compatibility problems. **

This is NOT unique to plpgsql. plpython (for one) definitely has some 
stupidity that will require an API break to fix.

A secondary issue is the lack of a blessed collection of extensions. If 
we had that we could maintain some of this stuff outside of the core 
release schedule, as well as provide more room for people to run 
experimental versions of extensions if they desired. If we had this then 
perhaps plpgsql_check would become a viable answer to some of this 
(though IMHO plpgsql_check is just a work-around for our lack of dealing 
with API compatibility).

>> information from my customers, colleagues about missing features in this
>> language.  If there is some gaps, then it is in outer environment - IDE,
>> deployment, testing,

I'm honestly surprised (even shocked) that you've never run into any of 
the problems plpgsql2 is trying to solve. I've hit all those problems 
except for OUT parameters. I'd say the order they're listed in actually 
corresponds to how often I hit the problems.

> Breaking language compatibility is a really big deal.  There has to be
> a lot of benefits to the effort and you have to make translation from
> plpgsql1 to plpgsql2 really simple.  You have made some good points on

I think trying to move the ball forward in a meaningful way without 
breaking compatibility is a lost cause. Some of these issues could be 
addressed by adding more syntax, but even that has limits (do we really 
want another variation of STRICT that allows only 0 or 1 rows?). And 
there's no way to fix your #1 item below without breaking compatibility.

There *are* other ways this could be done, besides creating a different 
PL. One immediate possibility is custom GUCs; there may be other options.

> #1 problem with plpgsql in my point of view is that the language and
> grammar are not supersets of sql.  A lot of PLPGSQL keywords (EXECUTE,
> BEGIN, INTO, END) have incompatible meanings with our SQL
> implementation.  IMNSHO, SQL ought to give the same behavior inside or
> outside of plpgsql.  It doesn't, and this is one of the reasons why
> plpgsql may not be a good candidate for stored procedure
> implementation.

While this doesn't bug me, it's got to be confusing as hell for newbies.

> #2 problem with plpgsql is after function entry it's too late to do
> things like set transaction isolation level and change certain kinds
> of variables (like statement_timeout).  This is very obnoxious, I
> can't wrap the database in an API 100%; the application has to manage
> things that really should be controlled in SQL.

+1

> #3 problem with plpgsql is complete lack of inlining.  inlining
> function calls in postgres is a black art even for very trivial cases.
> This makes it hard for us to write quick things and in the worst case
> causes endless duplications of simple expressions.

Instead of banging our heads against the fmgr API to try and solve this, 
I suspect it would be much simpler (and easier to understand) if we had 
the equivalent to a #define for queries. The fmgr API just isn't 
amenable to trying to inline stuff. This would allow you to define 
things like views that accept arguments, so you can shove the argument 
way down in the guts of the query without getting tripped up by fences.

Here's some other plpgsql pain-points (though, not all of these require 
an API break):

#4: it's impossible to operate on a Datum in-place. Though, maybe the 
work Tom did with ExpandedObjects eliminates some of this problem, but 
if it does it's hidden behind the existing syntax and you have no way to 
know it (and AFAICT the only thing using that infrastructure right now 
is arrays). Aside from the performance aspects, it'd be damn nice to be 
able to do things like ++, +=, etc.

#5: handling of parameter name collisions still sucks. One way to 
improve this would be to put parameters inside the outer-most statement 
block, so you could use a block name that was different from the 
function name. Something else that might help is the ability to assign a 
namespace for query identifiers, so you don't have to alias every 
individual relation in a query.

#6: The variations of syntax between the FOR variants is annoying 
(specifically, FOREACH necessitating the ARRAY keyword).

#7: = vs := vs INTO. = can do everything the others can do except for 
STRICT, and when it comes to STRICT I actually wish we had language 
support for whether 0 or >1 rows are allowed. I've wanted that in the 
past for views, and if we had that then you'd be able to use it in SQL 
functions as well. If that's not possible then we should fid some other 
way to handle this in plpgsql, because STRICT is often too broad.

#8: EVERYTHING command option should accept a variable. In particular, 
RAISE should accept a variable for level, but there's other cases of 
this I've run into. I'd also be nice if you could plop variables into 
SQL commands where you'd have an identifier, though presumably that 
would require some kind of explicit variable identifier.

#9: You should be able to modify an exception before re-raising it.

#10: Exception information should be passed around as a composite.

#11: Composite support is very lacking. There's no easy way to get a 
list of what fields exist in a composite, let alone do something generic 
to some set of them. There are ways to work around this, but they're 
very tedious and ugly.

#12: It'd be nice if any was allowed, as there are operations that can 
apply to more than one class of data type.

#13: cstring support would allow a lot more people to experiment with 
things like custom types. Yes, plpgsql might be slow as hell for this, 
but sometimes that doesn't matter. Even if it does, it can be a lot 
easier to prototype in something other than C. (Granted, I think there's 
some non-plpgsql stuff that would need to happen to allow this.)

> In short I guess the issue is that we don't have stored procedures and
> I don't see an easy path to getting there with the current language.
> There are a lot of other little annoyances but most of them can be
> solved without a compatibility break.

Well, actual stored *procedures* is an entirely different problem, which 
(again) fmgr is absolutely not designed to handle. All the PL handlers 
that I've looked at have completely in-grained the notion that they're 
running inside a transaction, so it would be a lot of work to try and 
change that. While there may be some plpgsql-specific problems with it 
supporting stored procs, there are much bigger questions to answer 
before worrying about that.
-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)

Re: [HACKERS] merging some features from plpgsql2 project

From

Pavel Stehule

Date:

28 December 2016, 16:16:19

2016-12-28 5:09 GMT+01:00 Jim Nasby <Jim.Nasby@bluetreble.com>:

On 12/27/16 4:56 PM, Merlin Moncure wrote:
On Tue, Dec 27, 2016 at 1:54 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
First I describe my initial position. I am strongly against introduction
"new" language - plpgsql2 or new plpgsql, or any else. The trust of
developers to us is important and introduction of any not compatible or
different feature has to have really big reason. PostgreSQL is conservative
environment, and PLpgSQL should not be a exception. More - I have not any

Which is why this is an external fork of plpgsql.

ok. Just I would not to repeat Perl6 or Python3 story - it is big adventure, but big fail too

** The real problem is that we have no mechanism for allowing a PL's language/syntax/API to move forward without massive backwards compatibility problems. **

We have not, but there are few possibilities:

1. enhance #option command

2. we can introduce PRAGMA command https://en.wikipedia.org/wiki/Ada_(programming_language)#Pragmas

This is NOT unique to plpgsql. plpython (for one) definitely has some stupidity that will require an API break to fix.

A secondary issue is the lack of a blessed collection of extensions. If we had that we could maintain some of this stuff outside of the core release schedule, as well as provide more room for people to run experimental versions of extensions if they desired. If we had this then perhaps plpgsql_check would become a viable answer to some of this (though IMHO plpgsql_check is just a work-around for our lack of dealing with API compatibility).

plpgsql_check can do some test, that are impossible in plpgsql - from performance view, from features. But some "blessed collections of extension" can be nice. More if will be joined with some automatic test and build tools. Although lot of extensions are really mature, the knowleadge about these extensions are minimal - and building extensions on windows is hard work still (for Linux developer).

information from my customers, colleagues about missing features in this
language. If there is some gaps, then it is in outer environment - IDE,
deployment, testing,

I'm honestly surprised (even shocked) that you've never run into any of the problems plpgsql2 is trying to solve. I've hit all those problems except for OUT parameters. I'd say the order they're listed in actually corresponds to how often I hit the problems.

I hit lot of older harder (now solved) issues - now, with more experience I am able to see these issues. And I wrote plpgsql_check, partially for self too. Years ago I prefer safe expressions.

Breaking language compatibility is a really big deal. There has to be
a lot of benefits to the effort and you have to make translation from
plpgsql1 to plpgsql2 really simple. You have made some good points on

I think trying to move the ball forward in a meaningful way without breaking compatibility is a lost cause. Some of these issues could be addressed by adding more syntax, but even that has limits (do we really want another variation of STRICT that allows only 0 or 1 rows?). And there's no way to fix your #1 item below without breaking compatibility.

I think so there is way with extra check, or with persistent plpgsql options - just use it, please. Some checks are clear, some other not.

There *are* other ways this could be done, besides creating a different PL. One immediate possibility is custom GUCs; there may be other options.

#1 problem with plpgsql in my point of view is that the language and
grammar are not supersets of sql. A lot of PLPGSQL keywords (EXECUTE,
BEGIN, INTO, END) have incompatible meanings with our SQL
implementation. IMNSHO, SQL ought to give the same behavior inside or
outside of plpgsql. It doesn't, and this is one of the reasons why
plpgsql may not be a good candidate for stored procedure
implementation.

While this doesn't bug me, it's got to be confusing as hell for newbies.

If you know ALGOL family languages, then it is not problem. What is a harder problem for people is different implementation of mix SQL and PL - different than Oracle, or MSSQL. Our model is better, simpler but different. It is difficult for people without knowleadge of differences between functions and procedures. Partially we badly speaking so our void functions are procedures.

#2 problem with plpgsql is after function entry it's too late to do
things like set transaction isolation level and change certain kinds
of variables (like statement_timeout). This is very obnoxious, I
can't wrap the database in an API 100%; the application has to manage
things that really should be controlled in SQL.

+1

#3 problem with plpgsql is complete lack of inlining. inlining
function calls in postgres is a black art even for very trivial cases.
This makes it hard for us to write quick things and in the worst case
causes endless duplications of simple expressions.

Instead of banging our heads against the fmgr API to try and solve this, I suspect it would be much simpler (and easier to understand) if we had the equivalent to a #define for queries. The fmgr API just isn't amenable to trying to inline stuff. This would allow you to define things like views that accept arguments, so you can shove the argument way down in the guts of the query without getting tripped up by fences.

Here's some other plpgsql pain-points (though, not all of these require an API break):

#4: it's impossible to operate on a Datum in-place. Though, maybe the work Tom did with ExpandedObjects eliminates some of this problem, but if it does it's hidden behind the existing syntax and you have no way to know it (and AFAICT the only thing using that infrastructure right now is arrays). Aside from the performance aspects, it'd be damn nice to be able to do things like ++, +=, etc.

#5: handling of parameter name collisions still sucks. One way to improve this would be to put parameters inside the outer-most statement block, so you could use a block name that was different from the function name. Something else that might help is the ability to assign a namespace for query identifiers, so you don't have to alias every individual relation in a query.

#6: The variations of syntax between the FOR variants is annoying (specifically, FOREACH necessitating the ARRAY keyword).

this is design - FOR is old PL/SQL syntax. FOREACH is prepared for extending

#7: = vs := vs INTO. = can do everything the others can do except for STRICT, and when it comes to STRICT I actually wish we had language support for whether 0 or >1 rows are allowed. I've wanted that in the past for views, and if we had that then you'd be able to use it in SQL functions as well. If that's not possible then we should fid some other way to handle this in plpgsql, because STRICT is often too broad.

#8: EVERYTHING command option should accept a variable. In particular, RAISE should accept a variable for level, but there's other cases of this I've run into. I'd also be nice if you could plop variables into SQL commands where you'd have an identifier, though presumably that would require some kind of explicit variable identifier.

It is hiding dynamic SQL - I am strongly against it - minimally due performance issues. Important functionality should not be hidden.

#9: You should be able to modify an exception before re-raising it.

#10: Exception information should be passed around as a composite

I have not any problem with last two points

.

#11: Composite support is very lacking. There's no easy way to get a list of what fields exist in a composite, let alone do something generic to some set of them. There are ways to work around this, but they're very tedious and ugly.

#12: It'd be nice if any was allowed, as there are operations that can apply to more than one class of data type.

#13: cstring support would allow a lot more people to experiment with things like custom types. Yes, plpgsql might be slow as hell for this, but sometimes that doesn't matter. Even if it does, it can be a lot easier to prototype in something other than C. (Granted, I think there's some non-plpgsql stuff that would need to happen to allow this.)

Not sure about it (I have really realy wrong experience with some developers about performance) - but PLPython, PLPerl can do it well, and I miss some possibility - We can use transformations more time - SQL/MM is based on new datatypes and transformations.

In short I guess the issue is that we don't have stored procedures and
I don't see an easy path to getting there with the current language.
There are a lot of other little annoyances but most of them can be
solved without a compatibility break.

Well, actual stored *procedures* is an entirely different problem, which (again) fmgr is absolutely not designed to handle. All the PL handlers that I've looked at have completely in-grained the notion that they're running inside a transaction, so it would be a lot of work to try and change that. While there may be some plpgsql-specific problems with it supporting stored procs, there are much bigger questions to answer before worrying about that.

yes. The design of transaction controlling inside stored procedures is hard work not related to any PL. Some can be partially solved by functions executed in autonomous transactions. With background workers we can implement asynchronous autonomous transactions - what can coverage lot of use cases where transaction controlling should be used in other databases.

Regards

Pavel

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)

Re: [HACKERS] merging some features from plpgsql2 project

From

Jim Nasby

Date:

28 December 2016, 20:54:19

On 12/28/16 7:16 AM, Pavel Stehule wrote:
>     ** The real problem is that we have no mechanism for allowing a PL's
>     language/syntax/API to move forward without massive backwards
>     compatibility problems. **
>
>
> We have not, but there are few possibilities:
>
> 1. enhance #option command
> 2. we can introduce PRAGMA command
> https://en.wikipedia.org/wiki/Ada_(programming_language)#Pragmas
> <https://en.wikipedia.org/wiki/Ada_%28programming_language%29#Pragmas>

I wanted to break this out separately, because IMO it's the real heart 
of the matter.

I think it would be silly not to allow a global setting of 
compatibility. You certainly don't want to force people to stick magic 
keywords in their code forevermore.

To that end, would GUCs be a workable answer here? That should give you 
the ability to control incompatibilities at a function, user, database 
and global level. It would also allow you to chose between raising a 
WARNING vs a FATAL.

I realize we've had some bad experiences with compatibility GUCs in the 
past, but I'd argue we've also had some good experiences. I see that 
add_missing_from is now completely gone, for example, presumably with no 
complaints. There's probably several other compatibility GUCs we could 
remove now.
-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)

Re: [HACKERS] merging some features from plpgsql2 project

From

Jim Nasby

Date:

28 December 2016, 20:54:23

On 12/28/16 7:16 AM, Pavel Stehule wrote:
>
>
> 2016-12-28 5:09 GMT+01:00 Jim Nasby <Jim.Nasby@bluetreble.com
> <mailto:Jim.Nasby@bluetreble.com>>:
>
>     On 12/27/16 4:56 PM, Merlin Moncure wrote:
>
>         On Tue, Dec 27, 2016 at 1:54 AM, Pavel Stehule
>         <pavel.stehule@gmail.com <mailto:pavel.stehule@gmail.com>> wrote:
>     Which is why this is an external fork of plpgsql.
>
>
> ok. Just I would not to repeat Perl6 or Python3 story - it is big
> adventure, but big fail too

Yeah, creating an entirely "new" PL to deal with compatibility doesn't 
seem like a good idea to me.

>     ** The real problem is that we have no mechanism for allowing a PL's
>     language/syntax/API to move forward without massive backwards
>     compatibility problems. **
>
>
> We have not, but there are few possibilities:
>
> 1. enhance #option command
> 2. we can introduce PRAGMA command
> https://en.wikipedia.org/wiki/Ada_(programming_language)#Pragmas

See separate reply.

<discussion about "blessed extensions">

>     I'm honestly surprised (even shocked) that you've never run into any
>     of the problems plpgsql2 is trying to solve. I've hit all those
>     problems except for OUT parameters. I'd say the order they're listed
>     in actually corresponds to how often I hit the problems.
>
>
> I hit lot of older harder (now solved) issues - now, with more
> experience I am able to see these issues. And I wrote plpgsql_check,
> partially for self too. Years ago I prefer safe expressions.

Recognizing a problem ahead of time (or having plpgsql_check do it for 
you) still means you have to find a way to work around it. In some cases 
(ie: STRICT), that workaround can be a serious PITA. Better to just 
eliminate the problem itself.

>     I think trying to move the ball forward in a meaningful way without
>     breaking compatibility is a lost cause. Some of these issues could
>     be addressed by adding more syntax, but even that has limits (do we
>     really want another variation of STRICT that allows only 0 or 1
>     rows?). And there's no way to fix your #1 item below without
>     breaking compatibility.
>
>
> I think so there is way with extra check, or with persistent plpgsql
> options - just use it, please. Some checks are clear, some other not.

I will assert that there will ALWAYS be problems that you can't plaster 
over with some kind of extra checking (like plpgsql_check). At some 
point, in order to fix those, you have to somehow break compatibility.

Look at libpq as an example. There's a reason we're on protocol V3.

> If you know ALGOL family languages, then it is not problem. What is a

Lets be realistic... what % of our users have even heard of ALGOL, let 
alone used it? :)

> harder problem for people is different implementation of mix SQL and PL
> - different than Oracle, or MSSQL. Our model is better, simpler but
> different. It is difficult for people without knowleadge of differences
> between functions and procedures. Partially we badly speaking so our
> void functions are procedures.

I suspect that's only confusing for people coming from Oracle (which of 
course is a non-trivial number of people).

>     #6: The variations of syntax between the FOR variants is annoying
>     (specifically, FOREACH necessitating the ARRAY keyword).
>
>
> this is design - FOR is old PL/SQL syntax. FOREACH is prepared for
> extending

Understood. It still sucks though. :)

>     #8: EVERYTHING command option should accept a variable. In
>     particular, RAISE should accept a variable for level, but there's
>     other cases of this I've run into. I'd also be nice if you could
>     plop variables into SQL commands where you'd have an identifier,
>     though presumably that would require some kind of explicit variable
>     identifier.
>
>
> It is hiding dynamic SQL - I am strongly against it - minimally due
> performance issues. Important functionality should not be hidden.

There's definitely ways around the performance issue. I do agree that it 
needs to be clear when you're doing something dynamic so it's not 
accidental. One way to do that would be to add support for variable 
decorators and mandate the use of decorators when using a variable for 
an identifier.

That said, *every* option to RAISE can be a variable except the level. 
That's just plain silly and should be fixed.

>     #13: cstring support would allow a lot more people to experiment
>     with things like custom types. Yes, plpgsql might be slow as hell
>     for this, but sometimes that doesn't matter. Even if it does, it can
>     be a lot easier to prototype in something other than C. (Granted, I
>     think there's some non-plpgsql stuff that would need to happen to
>     allow this.)
>
>
> Not sure about it (I have really realy wrong experience with some
> developers about performance) - but PLPython, PLPerl can do it well, and
> I miss some possibility - We can use transformations more time - SQL/MM
> is based on new datatypes and transformations.

Well, there's probably some other things that could be done to make 
plpgsql perform better in this regard. One thing I've wondered about is 
allowing array-like access to a plain string (or in this case, cstring). 
That would allow you to write code that would translate much better into 
fast C code. IE: if you needed to scan through an entire string you 
could do something like for (i=0; i<strlen(); i++).

> yes. The design of transaction controlling inside stored procedures is
> hard work not related to any PL. Some can be partially solved by
> functions executed in autonomous transactions. With background workers
> we can implement asynchronous autonomous transactions - what can
> coverage lot of use cases where transaction controlling should be used
> in other databases.

Well, those are all really hacks around a fundamental problem of 
allowing user-defined, pre-stored code to execute outside of a 
transaction. I don't think solving that is necessarily *that* hard 
(though certainly non-trivial), but the fmgr interface is certainly NOT 
the right way to go about that. In any case, that's a completely 
different discussion.
-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)

Re: [HACKERS] merging some features from plpgsql2 project

From

Pavel Stehule

Date:

28 December 2016, 21:15:00

2016-12-28 18:54 GMT+01:00 Jim Nasby <Jim.Nasby@bluetreble.com>:

On 12/28/16 7:16 AM, Pavel Stehule wrote:
** The real problem is that we have no mechanism for allowing a PL's
language/syntax/API to move forward without massive backwards
compatibility problems. **

We have not, but there are few possibilities:

1. enhance #option command
2. we can introduce PRAGMA command
https://en.wikipedia.org/wiki/Ada_(programming_language)#Pragmas
<https://en.wikipedia.org/wiki/Ada_%28programming_language%29#Pragmas>

I wanted to break this out separately, because IMO it's the real heart of the matter.

I think it would be silly not to allow a global setting of compatibility. You certainly don't want to force people to stick magic keywords in their code forevermore.

To that end, would GUCs be a workable answer here? That should give you the ability to control incompatibilities at a function, user, database and global level. It would also allow you to chose between raising a WARNING vs a FATAL.

GUC are fragile - the source code and settings can be separated.

Our #option is more robust, because source code holds all flags required for execution. So I would to see a mechanism, that will be strongly joined with code.

Using function assigned GUC is similar, but it is looking less robust - and some editors can forgot this information.

Lot of issues we can solved by plpgsq.extra_error, extra_warnings - but probably not all - for example issue of FOUND variable or introducing new auto variable ROW_COUNT. PLpgSQL - PL/SQL is safe - it propose the statement GET DIAGNOSTICS, but I understand so isn't funny to write more and more GET DIAGNOSTICS rc = ROW_COUNT; So some shortcuts can be nice, but there is risk, so this shortcut breaks existing code, and the costs/benefits are individual. There cannot be 100% agreement ever. So some customisation should be good.

I realize we've had some bad experiences with compatibility GUCs in the past, but I'd argue we've also had some good experiences. I see that add_missing_from is now completely gone, for example, presumably with no complaints. There's probably several other compatibility GUCs we could remove now.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)

Re: [HACKERS] merging some features from plpgsql2 project

From

Jim Nasby

Date:

28 December 2016, 21:23:39

On 12/28/16 12:15 PM, Pavel Stehule wrote:
> GUC are fragile - the source code and settings can be separated.

*Can* be, but they don't *have* to be. That's a huge feature, not a bug.

> Our #option is more robust, because source code holds all flags required
> for execution. So I would to see a mechanism, that will be strongly
> joined with code.

That means you must ALWAYS specify, which is an enormous pain. It 
basically guarantees that users will NEVER switch to the new syntax.

> Using function assigned GUC is similar, but it is looking less robust -
> and some editors can forgot this information.

If you forget then you get an error. Then you remember.

> Lot of issues we can solved by plpgsq.extra_error, extra_warnings - but
> probably not all - for example issue of FOUND variable or introducing
> new auto variable ROW_COUNT. PLpgSQL - PL/SQL is safe - it propose the
> statement GET DIAGNOSTICS, but I understand so isn't funny to write more
> and more GET DIAGNOSTICS rc = ROW_COUNT; So some shortcuts can be nice,
> but there is risk, so this shortcut breaks existing code, and the
> costs/benefits are individual. There cannot be 100% agreement ever. So
> some customisation should be good.

That's the whole point of having settings to deal with 
incompatibilities: so we can actually fix these warts without breaking 
everyone's code, yet also make it clear to users that they should stop 
using the warts and instead use the new and improved syntax.
-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)

Re: [HACKERS] merging some features from plpgsql2 project

From

Pavel Stehule

Date:

28 December 2016, 21:51:31

2016-12-28 19:23 GMT+01:00 Jim Nasby <Jim.Nasby@bluetreble.com>:

On 12/28/16 12:15 PM, Pavel Stehule wrote:
GUC are fragile - the source code and settings can be separated.

*Can* be, but they don't *have* to be. That's a huge feature, not a bug.

Our #option is more robust, because source code holds all flags required
for execution. So I would to see a mechanism, that will be strongly
joined with code.

That means you must ALWAYS specify, which is an enormous pain. It basically guarantees that users will NEVER switch to the new syntax.

Using function assigned GUC is similar, but it is looking less robust -
and some editors can forgot this information.

If you forget then you get an error. Then you remember.

Lot of issues we can solved by plpgsq.extra_error, extra_warnings - but
probably not all - for example issue of FOUND variable or introducing
new auto variable ROW_COUNT. PLpgSQL - PL/SQL is safe - it propose the
statement GET DIAGNOSTICS, but I understand so isn't funny to write more
and more GET DIAGNOSTICS rc = ROW_COUNT; So some shortcuts can be nice,
but there is risk, so this shortcut breaks existing code, and the
costs/benefits are individual. There cannot be 100% agreement ever. So
some customisation should be good.

That's the whole point of having settings to deal with incompatibilities: so we can actually fix these warts without breaking everyone's code, yet also make it clear to users that they should stop using the warts and instead use the new and improved syntax.

Now, the incompatibility can be hard issue - it is big question if we lock some users on old versions because some users can save to lines of code. Introduction of ROW_COUNT is lowly incompatibility - it can be simply detected - but for example change of behave of FOUND variable is terrible, because the code will be quietly calculate differently. sometimes we can break code - probably people will not be happy, but sometimes we can change the results - it can be big fail. So on one side is big costs. On second side is few lines less code.

Regards

Pavel

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)

Re: [HACKERS] merging some features from plpgsql2 project

From

Pavel Stehule

Date:

28 December 2016, 22:02:59

2016-12-28 18:54 GMT+01:00 Jim Nasby <Jim.Nasby@bluetreble.com>:

On 12/28/16 7:16 AM, Pavel Stehule wrote:

2016-12-28 5:09 GMT+01:00 Jim Nasby <Jim.Nasby@bluetreble.com
<mailto:Jim.Nasby@bluetreble.com>>:

On 12/27/16 4:56 PM, Merlin Moncure wrote:

On Tue, Dec 27, 2016 at 1:54 AM, Pavel Stehule
<pavel.stehule@gmail.com <mailto:pavel.stehule@gmail.com>> wrote:
Which is why this is an external fork of plpgsql.

ok. Just I would not to repeat Perl6 or Python3 story - it is big
adventure, but big fail too

Yeah, creating an entirely "new" PL to deal with compatibility doesn't seem like a good idea to me.

** The real problem is that we have no mechanism for allowing a PL's
language/syntax/API to move forward without massive backwards
compatibility problems. **

We have not, but there are few possibilities:

1. enhance #option command
2. we can introduce PRAGMA command
https://en.wikipedia.org/wiki/Ada_(programming_language)#Pragmas

See separate reply.

<discussion about "blessed extensions">

I'm honestly surprised (even shocked) that you've never run into any
of the problems plpgsql2 is trying to solve. I've hit all those
problems except for OUT parameters. I'd say the order they're listed
in actually corresponds to how often I hit the problems.

I hit lot of older harder (now solved) issues - now, with more
experience I am able to see these issues. And I wrote plpgsql_check,
partially for self too. Years ago I prefer safe expressions.

Recognizing a problem ahead of time (or having plpgsql_check do it for you) still means you have to find a way to work around it. In some cases (ie: STRICT), that workaround can be a serious PITA. Better to just eliminate the problem itself.

I think trying to move the ball forward in a meaningful way without
breaking compatibility is a lost cause. Some of these issues could
be addressed by adding more syntax, but even that has limits (do we
really want another variation of STRICT that allows only 0 or 1
rows?). And there's no way to fix your #1 item below without
breaking compatibility.

I think so there is way with extra check, or with persistent plpgsql
options - just use it, please. Some checks are clear, some other not.

I will assert that there will ALWAYS be problems that you can't plaster over with some kind of extra checking (like plpgsql_check). At some point, in order to fix those, you have to somehow break compatibility.

Look at libpq as an example. There's a reason we're on protocol V3.

If you know ALGOL family languages, then it is not problem. What is a

Lets be realistic... what % of our users have even heard of ALGOL, let alone used it? :)

not too much - but the problem is not in BEGIN, END. I wrote PL/PSM where BEGIN END doesn't exists. The functionality was same as PLpgSQL - and there was not anybody who use it.

harder problem for people is different implementation of mix SQL and PL
- different than Oracle, or MSSQL. Our model is better, simpler but
different. It is difficult for people without knowleadge of differences
between functions and procedures. Partially we badly speaking so our
void functions are procedures.

I suspect that's only confusing for people coming from Oracle (which of course is a non-trivial number of people).

#6: The variations of syntax between the FOR variants is annoying
(specifically, FOREACH necessitating the ARRAY keyword).

this is design - FOR is old PL/SQL syntax. FOREACH is prepared for
extending

Understood. It still sucks though. :)

#8: EVERYTHING command option should accept a variable. In
particular, RAISE should accept a variable for level, but there's
other cases of this I've run into. I'd also be nice if you could
plop variables into SQL commands where you'd have an identifier,
though presumably that would require some kind of explicit variable
identifier.

It is hiding dynamic SQL - I am strongly against it - minimally due
performance issues. Important functionality should not be hidden.

There's definitely ways around the performance issue. I do agree that it needs to be clear when you're doing something dynamic so it's not accidental. One way to do that would be to add support for variable decorators and mandate the use of decorators when using a variable for an identifier.

That said, *every* option to RAISE can be a variable except the level. That's just plain silly and should be fixed.

I am sorry - I read it wrong - If there is not a parser issue, then it can be fixed simply.

#13: cstring support would allow a lot more people to experiment
with things like custom types. Yes, plpgsql might be slow as hell
for this, but sometimes that doesn't matter. Even if it does, it can
be a lot easier to prototype in something other than C. (Granted, I
think there's some non-plpgsql stuff that would need to happen to
allow this.)

Not sure about it (I have really realy wrong experience with some
developers about performance) - but PLPython, PLPerl can do it well, and
I miss some possibility - We can use transformations more time - SQL/MM
is based on new datatypes and transformations.

Well, there's probably some other things that could be done to make plpgsql perform better in this regard. One thing I've wondered about is allowing array-like access to a plain string (or in this case, cstring). That would allow you to write code that would translate much better into fast C code. IE: if you needed to scan through an entire string you could do something like for (i=0; i<strlen(); i++).

It is fixable - and faster string operations can be nice feature - Just I am not volunteer for this task :)

yes. The design of transaction controlling inside stored procedures is
hard work not related to any PL. Some can be partially solved by
functions executed in autonomous transactions. With background workers
we can implement asynchronous autonomous transactions - what can
coverage lot of use cases where transaction controlling should be used
in other databases.

Well, those are all really hacks around a fundamental problem of allowing user-defined, pre-stored code to execute outside of a transaction. I don't think solving that is necessarily *that* hard (though certainly non-trivial), but the fmgr interface is certainly NOT the right way to go about that. In any case, that's a completely different discussion.

This points needs real use cases. Our current design is limited, but it isn't bad - It is much better than Oracle or MSSQL design - we can talk what use cases we can solve. I am sure, so we can find good mapping for PLpgSQL for any design.

Regards

Pavel

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)

Re: [HACKERS] merging some features from plpgsql2 project

From

Jim Nasby

Date:

28 December 2016, 22:25:33

On 12/28/16 12:51 PM, Pavel Stehule wrote:
> Now, the incompatibility can be hard issue - it is big question if we
> lock some users on old versions because some users can save to lines of
> code. Introduction of ROW_COUNT is lowly incompatibility - it can be
> simply detected - but for example change of behave of FOUND variable is
> terrible, because the code will be quietly calculate differently.
> sometimes we can break code - probably people will not be happy, but
> sometimes we can change the results - it can be big fail. So on one side
> is big costs. On second side is few lines less code.

That's my whole point of why this needs to be settable at a global 
level: so that people with a lot of legacy code can set the OLD behavior 
at a global level, and deal with the old code over time.

If there's no global setting then there are only two choices: we default 
to new behavior and force everyone to add a bunch of stuff to *every* 
function they have (loads of complaints), or we default to old behavior 
and no one bothers to even adopt the new usage because they have to add 
extra stuff to every function. Either way is a failure. This is why I 
think there MUST be some way to control this at a higher level than per 
function.

Certainly GUCs aren't the only option, we could invent something else. 
One feature I could see being useful is being able to set a default on a 
schema level, which isn't currently possible with a GUC. But I can 
certainly see database and global settings being useful, and perhaps 
per-user as well. GUCs already have those.
-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)

Re: [HACKERS] merging some features from plpgsql2 project

From

Pavel Stehule

Date:

28 December 2016, 22:45:41

2016-12-28 20:25 GMT+01:00 Jim Nasby <Jim.Nasby@bluetreble.com>:

On 12/28/16 12:51 PM, Pavel Stehule wrote:
Now, the incompatibility can be hard issue - it is big question if we
lock some users on old versions because some users can save to lines of
code. Introduction of ROW_COUNT is lowly incompatibility - it can be
simply detected - but for example change of behave of FOUND variable is
terrible, because the code will be quietly calculate differently.
sometimes we can break code - probably people will not be happy, but
sometimes we can change the results - it can be big fail. So on one side
is big costs. On second side is few lines less code.

That's my whole point of why this needs to be settable at a global level: so that people with a lot of legacy code can set the OLD behavior at a global level, and deal with the old code over time.

If there's no global setting then there are only two choices: we default to new behavior and force everyone to add a bunch of stuff to *every* function they have (loads of complaints), or we default to old behavior and no one bothers to even adopt the new usage because they have to add extra stuff to every function. Either way is a failure. This is why I think there MUST be some way to control this at a higher level than per function.

we can have both - plpgsql.variable_conflict can be precedent.

Certainly GUCs aren't the only option, we could invent something else. One feature I could see being useful is being able to set a default on a schema level, which isn't currently possible with a GUC. But I can certainly see database and global settings being useful, and perhaps per-user as well. GUCs already have those.

yes, without GUC you cannot set the behave of plpgsql globally.

Regards

Pavel

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)

Re: [HACKERS] merging some features from plpgsql2 project

From

Pavel Stehule

Date:

01 January 2017, 21:17:56

I wrote some initial patch

Do you think so has sense to continue in this topic?

Regards

Pavel

Attachment

plpgsql_extra_runtime_checks.patch

Re: [HACKERS] merging some features from plpgsql2 project

From

Jim Nasby

Date:

02 January 2017, 20:36:12

On 1/1/17 12:17 PM, Pavel Stehule wrote:
> I wrote some initial patch
>
> Do you think so has sense to continue in this topic?

Perhaps I'm not understanding what plpgsql_extra_errors does, but I 
don't think either of these should depend on that being true. IMO these 
two checks should be default to throwing an exception.

I think instead of tying these to extra_*, each GUC should accept a LOG 
level.
-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)

Re: [HACKERS] merging some features from plpgsql2 project

From

Pavel Stehule

Date:

02 January 2017, 21:06:25

2017-01-02 18:36 GMT+01:00 Jim Nasby <Jim.Nasby@bluetreble.com>:

On 1/1/17 12:17 PM, Pavel Stehule wrote:
I wrote some initial patch

Do you think so has sense to continue in this topic?

Perhaps I'm not understanding what plpgsql_extra_errors does, but I don't think either of these should depend on that being true. IMO these two checks should be default to throwing an exception.

There are use cases where these patters should be used and has sense like

SELECT (polymorphiccomposite).* INTO c1, c2; -- take first two columns

SELECT xx FROM tab ORDER BY yy INTO target -- more rows not a issue

I understand plpgsql_extra_errors as feature that can be enabled on developer, test, or preprod environments and can help to identify some strange places.

I think instead of tying these to extra_*, each GUC should accept a LOG level.

Why? Why the none, warning, error are not enough? Why are you think so separate GUC can be better than plpgsql_extra_* ?

The fast setting plpgsql.extra_errors = 'all' can switch to some "safe" configuration.

The fast setting plpgsql.extra_warnings = 'all' can helps with identification, but doesn't break production (or doesn't breaks other tests)

Regards

Pavel

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)

Re: [HACKERS] merging some features from plpgsql2 project

From

Jim Nasby

Date:

02 January 2017, 22:16:18

On 1/2/17 12:06 PM, Pavel Stehule wrote:
> SELECT (polymorphiccomposite).* INTO c1, c2; -- take first two columns
>
> SELECT xx FROM tab ORDER BY yy INTO target -- more rows not a issue
>
> I understand plpgsql_extra_errors as feature that can be enabled on
> developer, test, or preprod environments and can help to identify some
> strange places.

Yes, but the two cases you mentioned above are the "strange" cases, and 
you should have to do something extra to allow those, not the other way 
around.

>     I think instead of tying these to extra_*, each GUC should accept a
>     LOG level.
>
>
> Why? Why the none, warning, error are not enough? Why are you think so
> separate GUC can be better than plpgsql_extra_* ?
>
> The fast setting plpgsql.extra_errors = 'all' can switch to some "safe"
> configuration.
> The fast setting plpgsql.extra_warnings = 'all' can helps with
> identification, but doesn't break production (or doesn't breaks other tests)

I see two problems with those settings:

1) Neither is enabled by default, so 90% of users have no idea they 
exist. Obviously that's an easy enough fix, but...
2) There's no way to incrementally change those values for a single 
function. If you've set extra_errors = 'all' globally, a single function 
can't say "turn off the too many rows setting for this function".

BTW, while I can see value in being able to change these settings for an 
entire function, I think the recommended use should be to only change 
them for a specific statement.
-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)

Re: [HACKERS] merging some features from plpgsql2 project

From

Pavel Stehule

Date:

02 January 2017, 22:51:05

2017-01-02 20:16 GMT+01:00 Jim Nasby <Jim.Nasby@bluetreble.com>:

On 1/2/17 12:06 PM, Pavel Stehule wrote:
SELECT (polymorphiccomposite).* INTO c1, c2; -- take first two columns

SELECT xx FROM tab ORDER BY yy INTO target -- more rows not a issue

I understand plpgsql_extra_errors as feature that can be enabled on
developer, test, or preprod environments and can help to identify some
strange places.

Yes, but the two cases you mentioned above are the "strange" cases, and you should have to do something extra to allow those, not the other way around.

The second example is really strange. But the first example is used in composite types conversion - when you convert from base to extend type. This routine is used in plpgsql when you use a assignment statement

composite_var := another_composite_var

I think instead of tying these to extra_*, each GUC should accept a
LOG level.

Why? Why the none, warning, error are not enough? Why are you think so
separate GUC can be better than plpgsql_extra_* ?

The fast setting plpgsql.extra_errors = 'all' can switch to some "safe"
configuration.
The fast setting plpgsql.extra_warnings = 'all' can helps with
identification, but doesn't break production (or doesn't breaks other tests)

I see two problems with those settings:

1) Neither is enabled by default, so 90% of users have no idea they exist. Obviously that's an easy enough fix, but...

We can strongly talk about it - there can be a chapter in plpgsql doc. Now, the patterns and antipatterns are not officially documented.

2) There's no way to incrementally change those values for a single function. If you've set extra_errors = 'all' globally, a single function can't say "turn off the too many rows setting for this function".

We can enhance the GUC syntax like "all -too_many_rows,-xxx"

BTW, while I can see value in being able to change these settings for an entire function, I think the recommended use should be to only change them for a specific statement.

What you can do in plain assign statement

target := expression ?

My border is any compatibility break - and I would not to across it. First issue is probably harder

related to typo "select 1 x into c1,c2" and it can be detected by plpgsql_check.

Second issue is not a performance issue today (we read only 2 rows everytime) and it is hard how often it returns wrong result. This issue cannot be detected by plpgsql_check now.

Regards

Pavel

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)

Re: [HACKERS] merging some features from plpgsql2 project

From

Merlin Moncure

Date:

03 January 2017, 18:23:48

On Tue, Dec 27, 2016 at 10:09 PM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
> On 12/27/16 4:56 PM, Merlin Moncure wrote:
>>
>> On Tue, Dec 27, 2016 at 1:54 AM, Pavel Stehule <pavel.stehule@gmail.com>
>> wrote:
>>>
>>> First I describe my initial position. I am strongly against introduction
>>> "new" language - plpgsql2 or new plpgsql, or any else. The trust of
>>> developers to us is important and introduction of any not compatible or
>>> different feature has to have really big reason. PostgreSQL is
>>> conservative
>>> environment, and PLpgSQL should not be a exception. More - I have not any
>
>
> Which is why this is an external fork of plpgsql.
>
> ** The real problem is that we have no mechanism for allowing a PL's
> language/syntax/API to move forward without massive backwards compatibility
> problems. **

Just got back from break :-).  Have some thoughts on this.  Backwards
compatibility is really a fundamental problem.  There's really no
solution to it other than to try and avoid using syntax to solve
problems.  It should be obvious to everyone that plgsql cannot
withstand a compatibility break.  Another language could be offered as
an alternative in core (say, pl/psm or pl/v8), but pl/pgsql has to
support old code.   Some really out there features could maybe be
redacted (in particular, using = for assignment), but not not much.
But I guess we're stuck with the status quo.

I think we ought to avoid language features that influence the
behavior (performance is ok) of the code (and that includes throwing
errors).  That's a freight train headed towards javscript's 'strict'
mode, which is thinly disguised language fork.  #option and pragma
type syntaxes are trying to cheat the language -- hardly anyone uses
them and it's a tricky way to try and make the language into something
other than it is.

C does it right -- dubious code is raised as warnings and it's up to
the end user to determine which warnings are interesting and likely to
be an error.  So, rather than hacking the language to control throwing
and errors and such there should be some ability validate the function
heavily and verify suspicious use of INTO or other dubious things
(unused variables, masked assignments, etc).  The validation output
could even be a set returning function.

So -1 to strict mode, unless we can make a case why this can't be done
as part of checking/validation.

Other random points:
*) Another major pain point is swapping in the input variables for
debugging purposes.  Something that emits a script based on a set of
arguments would be wonderful.

*) Would also like to have a FINALLY block

*) A mechanic to manually print out a stack trace for debugging
purposes would be helpful.

*) COPY not being able to accept arguments as variables (in particular
the filename) is a major headache

*) Upon error, we ought to print a few lines of context on either side
of the error.  Not very fun to try and figure out where the errors is
when you are working off of 'syntax error near "FROM"' etc.  This is a
major problem throughout the platform IMO.

*) Some user visible mechanic other than forcing SQL through EXECUTE
to be able to control plan caching would be useful.

merlin

Re: [HACKERS] merging some features from plpgsql2 project

From

Pavel Stehule

Date:

03 January 2017, 18:58:25

2017-01-03 16:23 GMT+01:00 Merlin Moncure <mmoncure@gmail.com>:

On Tue, Dec 27, 2016 at 10:09 PM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
> On 12/27/16 4:56 PM, Merlin Moncure wrote:
>>
>> On Tue, Dec 27, 2016 at 1:54 AM, Pavel Stehule <pavel.stehule@gmail.com>
>> wrote:
>>>
>>> First I describe my initial position. I am strongly against introduction
>>> "new" language - plpgsql2 or new plpgsql, or any else. The trust of
>>> developers to us is important and introduction of any not compatible or
>>> different feature has to have really big reason. PostgreSQL is
>>> conservative
>>> environment, and PLpgSQL should not be a exception. More - I have not any
>
>
> Which is why this is an external fork of plpgsql.
>
> ** The real problem is that we have no mechanism for allowing a PL's
> language/syntax/API to move forward without massive backwards compatibility
> problems. **

Just got back from break :-). Have some thoughts on this. Backwards
compatibility is really a fundamental problem. There's really no
solution to it other than to try and avoid using syntax to solve
problems. It should be obvious to everyone that plgsql cannot
withstand a compatibility break. Another language could be offered as
an alternative in core (say, pl/psm or pl/v8), but pl/pgsql has to
support old code. Some really out there features could maybe be
redacted (in particular, using = for assignment), but not not much.
But I guess we're stuck with the status quo.

I think we ought to avoid language features that influence the
behavior (performance is ok) of the code (and that includes throwing
errors). That's a freight train headed towards javscript's 'strict'
mode, which is thinly disguised language fork. #option and pragma
type syntaxes are trying to cheat the language -- hardly anyone uses
them and it's a tricky way to try and make the language into something
other than it is.

C does it right -- dubious code is raised as warnings and it's up to
the end user to determine which warnings are interesting and likely to
be an error. So, rather than hacking the language to control throwing
and errors and such there should be some ability validate the function
heavily and verify suspicious use of INTO or other dubious things
(unused variables, masked assignments, etc). The validation output
could even be a set returning function.

So -1 to strict mode, unless we can make a case why this can't be done
as part of checking/validation.

Can be plpgsq.extra_errors and plpgsql.extra_warnings solution?

I am thinking so there is a space for improvement (in extra_* usage)

Do you know plpgsql_check https://github.com/okbob/plpgsql_check ?

Other random points:
*) Another major pain point is swapping in the input variables for
debugging purposes. Something that emits a script based on a set of
arguments would be wonderful.

???

*) Would also like to have a FINALLY block

What you can do there?

*) A mechanic to manually print out a stack trace for debugging
purposes would be helpful.

I had plan to develop a extension for this purpose - easy printing stack, function parameters, and local variables. But I had a motivation to start it. It can be usable for any PL

*) COPY not being able to accept arguments as variables (in particular
the filename) is a major headache

There is a patch "COPY as function"

*) Upon error, we ought to print a few lines of context on either side
of the error. Not very fun to try and figure out where the errors is
when you are working off of 'syntax error near "FROM"' etc. This is a
major problem throughout the platform IMO.

Have not idea how to fix it

*) Some user visible mechanic other than forcing SQL through EXECUTE
to be able to control plan caching would be useful.

fully agree.

Have you some ideas?

What about plpgsql option (function scope) -- WITHOUT-PLAN-CACHE - any non trivial plans will not be cached - and evaluated as parametrized query only.

Regards

Pavel

merlin

Re: [HACKERS] merging some features from plpgsql2 project

From

Jim Nasby

Date:

03 January 2017, 19:57:30

On 1/2/17 1:51 PM, Pavel Stehule wrote:
>     1) Neither is enabled by default, so 90% of users have no idea they
>     exist. Obviously that's an easy enough fix, but...
>
> We can strongly talk about it - there can be a chapter in plpgsql doc.
> Now, the patterns and antipatterns are not officially documented.

Or just fix the issue, provide the backwards compatability GUCs and move on.

>     2) There's no way to incrementally change those values for a single
>     function. If you've set extra_errors = 'all' globally, a single
>     function can't say "turn off the too many rows setting for this
>     function".
>
>
> We can enhance the GUC syntax like "all -too_many_rows,-xxx"

Why create all that framework when we could just have multiple 
plpgsql.blah GUCs? plpgsql.multirow_assign_level=FATAL solves that 
problem. We just need a plpgsql GUC for each backwards compatibility break.

>     BTW, while I can see value in being able to change these settings
>     for an entire function, I think the recommended use should be to
>     only change them for a specific statement.
>
>
> What you can do in plain assign statement
>
> target := expression ?

The point I was trying to make there is if you do have some cases where 
you need to silently ignore extra rows (for example) it's probably only 
one statement and not an entire function. That said, if we just make 
these options GUCs then you can just do SET and RESET.

> My border is any compatibility break - and I would not to across it.
> First issue is probably harder

If we never broke compatibility we'd still be allowing SELECT without 
FROM, NULL = NULL being TRUE, and a whole bunch of other problems. We'd 
also be stuck on protocol v1 (and of course not talking about what we 
want in v4).

We've successfully made incompatible changes that were *far worse* than 
this (ie: renaming pg_stat_activity.procpid). Obviously we shouldn't be 
breaking things willy-nilly, but these are long-standing warts (dare I 
say BUGS?) that should be fixed. They're ugly enough that someone took 
the time to break plpgsql out of the core code and fork it.
-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)

Re: [HACKERS] merging some features from plpgsql2 project

From

Pavel Stehule

Date:

03 January 2017, 20:19:08

2017-01-03 17:57 GMT+01:00 Jim Nasby <Jim.Nasby@bluetreble.com>:

On 1/2/17 1:51 PM, Pavel Stehule wrote:
1) Neither is enabled by default, so 90% of users have no idea they
exist. Obviously that's an easy enough fix, but...

We can strongly talk about it - there can be a chapter in plpgsql doc.
Now, the patterns and antipatterns are not officially documented.

Or just fix the issue, provide the backwards compatability GUCs and move on.

It is still compatibility break.

2) There's no way to incrementally change those values for a single
function. If you've set extra_errors = 'all' globally, a single
function can't say "turn off the too many rows setting for this
function".

We can enhance the GUC syntax like "all -too_many_rows,-xxx"

Why create all that framework when we could just have multiple plpgsql.blah GUCs? plpgsql.multirow_assign_level=FATAL solves that problem. We just need a plpgsql GUC for each backwards compatibility break.

We have this framework already, so why don't use it.

BTW, while I can see value in being able to change these settings
for an entire function, I think the recommended use should be to
only change them for a specific statement.

What you can do in plain assign statement

target := expression ?

The point I was trying to make there is if you do have some cases where you need to silently ignore extra rows (for example) it's probably only one statement and not an entire function. That said, if we just make these options GUCs then you can just do SET and RESET.

My border is any compatibility break - and I would not to across it.
First issue is probably harder

If we never broke compatibility we'd still be allowing SELECT without FROM, NULL = NULL being TRUE, and a whole bunch of other problems. We'd also be stuck on protocol v1 (and of course not talking about what we want in v4).

This was in dark age - how much users of plpgsql was in 2000? Hard to speak about Postgres as mature software in this era.

We've successfully made incompatible changes that were *far worse* than this (ie: renaming pg_stat_activity.procpid). Obviously we shouldn't be breaking things willy-nilly, but these are long-standing warts (dare I say BUGS?) that should be fixed. They're ugly enough that someone took the time to break plpgsql out of the core code and fork it.

We are not talk about features that can be simply marked as bugs, so there is not too much what we should to fix it. We should to help to users to identify some possible risk places.

Regards

Pavel

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)

Re: [HACKERS] merging some features from plpgsql2 project

From

Jim Nasby

Date:

03 January 2017, 20:32:35

On 1/3/17 9:58 AM, Pavel Stehule wrote:
>     > ** The real problem is that we have no mechanism for allowing a PL's
>     > language/syntax/API to move forward without massive backwards compatibility
>     > problems. **
>
>     Just got back from break :-).  Have some thoughts on this.  Backwards
>     compatibility is really a fundamental problem.  There's really no
>     solution to it other than to try and avoid using syntax to solve
>     problems.  It should be obvious to everyone that plgsql cannot
>     withstand a compatibility break.  Another language could be offered as

I don't think that's obvious at all. We've introduced incompatibility in 
the main grammar without problem. You just need a way for people to get 
the old behavior back if they need it. Eventually people will stop 
relying on the old, broken behavior.

>     an alternative in core (say, pl/psm or pl/v8), but pl/pgsql has to
>     support old code.   Some really out there features could maybe be
>     redacted (in particular, using = for assignment), but not not much.
>     But I guess we're stuck with the status quo.
>
>     I think we ought to avoid language features that influence the
>     behavior (performance is ok) of the code (and that includes throwing
>     errors).  That's a freight train headed towards javscript's 'strict'
>     mode, which is thinly disguised language fork.  #option and pragma
>     type syntaxes are trying to cheat the language -- hardly anyone uses
>     them and it's a tricky way to try and make the language into something
>     other than it is.

Yeah, trying to bulk all these changes into one "magic setting" is not a 
way to move forward. I think we're actually really well off in that 
regard, because unlike most languages we have a very robust settings 
system that allows controlling this behavior even at the statement level.

>     C does it right -- dubious code is raised as warnings and it's up to
>     the end user to determine which warnings are interesting and likely to
>     be an error.  So, rather than hacking the language to control throwing
>     and errors and such there should be some ability validate the function
>     heavily and verify suspicious use of INTO or other dubious things
>     (unused variables, masked assignments, etc).  The validation output
>     could even be a set returning function.

While static analysis can do some good (and I think we should actually 
be enabling more of that by default), it won't realistically solve 
everything. Multi-row assignment is a good example: NO ONE is going to 
be OK with tons of warnings for every little := or SELECT INTO (without 
strict), but the reality is that most code actually won't work correctly 
if you have multiple rows coming back, so there's nothing technically 
wrong with `var = field FROM table WHERE table_id = plpgsql_variable` if 
table_id is the PK: you'll always get 0 or 1 rows back.

>     So -1 to strict mode, unless we can make a case why this can't be done
>     as part of checking/validation.
>
>
> Can be plpgsq.extra_errors and plpgsql.extra_warnings solution?
>
> I am thinking so there is a space for improvement (in extra_* usage)
>
> Do you know plpgsql_check https://github.com/okbob/plpgsql_check ?

I think we should look at what parts of that we should pull into core 
(as well as enabling more by default). Stuff that can be done at 
compile/load time is certainly better than runtime checks.

>     Other random points:
>     *) Another major pain point is swapping in the input variables for
>     debugging purposes.  Something that emits a script based on a set of
>     arguments would be wonderful.
>
> ???

Yeah, could you elaborate here?

>     *) Would also like to have a FINALLY block
>
> What you can do there?

It's a block that ALWAYS executes, even if an exception occurs. Python 
has this[1]. That (along with an ELSE clause for if there is no 
exception) would mean you could catch an exception for a single command 
instead of a bunch of commands.

Somewhat related to that, I wish you could make GUC changes that were 
local only to a specific BEGIN block. AFAIK the GUC infrastructure fully 
supports that, it would just need to be exposed in plpgsql.

>     *) A mechanic to manually print out a stack trace for debugging
>     purposes would be helpful.
>
>
> I had plan to develop a extension for this purpose - easy printing
> stack, function parameters, and local variables. But I had a motivation
> to start it. It can be usable for any PL

I assume you're thinking an SRF that spits out PG_CONTEXT? It'd be 
really nice if you could also get things like function names and line 
numbers broken out separately. I've thought of building this myself.

BTW, the biggest case I can think of using this for is a userspace 
method of doing "private" functions, where the function throws an 
exception unless it was called directly by a set of allowed functions 
(or views).

>     *) COPY not being able to accept arguments as variables (in particular
>     the filename) is a major headache
>
>
> There is a patch "COPY as function"

Instead of just COPY, I'd like an easier way to pass identifiers into 
SQL commands. format() certainly makes this easier than it was, but %3$s 
gets old pretty quick. (Of course, if we had the concept of a dictionary 
you could actually name the parameters and it wouldn't be quite so bad...)

>     *) Upon error, we ought to print a few lines of context on either side
>     of the error.  Not very fun to try and figure out where the errors is
>     when you are working off of 'syntax error near "FROM"' etc.  This is a
>     major problem throughout the platform IMO.
>
>
> Have not idea how to fix it

To do this I think you'd need to keep an array of preceding line 
positions instead of just one, which I don't think would be that hard. 
The bigger problem would be only spitting out the extra info on the 
initial error, and not for subsequent context calls up the stack. I 
don't think there's currently a way to tell if you're the level that the 
ereport originated at (at least, not in an error).

Improving that would definitely be useful across the board, because 
right now DEBUG becomes completely useless once you have more than 1 or 
2 levels of nested calls.

>     *) Some user visible mechanic other than forcing SQL through EXECUTE
>     to be able to control plan caching would be useful.
>
>
> fully agree.
>
> Have you some ideas?
>
> What about plpgsql option (function scope) -- WITHOUT-PLAN-CACHE - any
> non trivial plans will not be cached - and evaluated as parametrized
> query only.

I'd also like the ability to do a "localized" PREPARE; similar to a SQL 
level PREPARE statement, but ensuring that the statement got deallocated 
when the block ended.

1: https://docs.python.org/3/tutorial/errors.html#defining-clean-up-actions
-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)

Re: [HACKERS] merging some features from plpgsql2 project

From

Jim Nasby

Date:

03 January 2017, 20:41:24

On 1/3/17 11:19 AM, Pavel Stehule wrote:
>             2) There's no way to incrementally change those values for a
>         single
>             function. If you've set extra_errors = 'all' globally, a single
>             function can't say "turn off the too many rows setting for this
>             function".
>
>
>         We can enhance the GUC syntax like "all -too_many_rows,-xxx"
>
>
>     Why create all that framework when we could just have multiple
>     plpgsql.blah GUCs? plpgsql.multirow_assign_level=FATAL solves that
>     problem. We just need a plpgsql GUC for each backwards compatibility
>     break.
>
> We have this framework already, so why don't use it.

We *don't* have a framework that works for this, because you can't 
incrementally modify extra_errors. Maybe extra_errors is an OK API for 
static checking, but it's definitely a BAD API for something you'd need 
to control at a function (or even statement) level.

>     If we never broke compatibility we'd still be allowing SELECT
>     without FROM, NULL = NULL being TRUE, and a whole bunch of other
>     problems. We'd also be stuck on protocol v1 (and of course not
>     talking about what we want in v4).
>
>
> This was in dark age - how much users of plpgsql was in 2000? Hard to
> speak about Postgres as mature software in this era.

I don't know about you' but I've considered Postgres to be mature since 
at least 8.0, if not earlier. Actually, in many ways it was far more 
mature than other databases I was using in 2000 (let alone 2007).

>     We've successfully made incompatible changes that were *far worse*
>     than this (ie: renaming pg_stat_activity.procpid). Obviously we
>     shouldn't be breaking things willy-nilly, but these are
>     long-standing warts (dare I say BUGS?) that should be fixed. They're
>     ugly enough that someone took the time to break plpgsql out of the
>     core code and fork it.
>
> We are not talk about features that can be simply marked as bugs, so
> there is not too much what we should to fix it. We should to help to
> users to identify some possible risk places.

You keep claiming that these aren't serious bugs, yet someone felt so 
strongly that they ARE serious bugs that they forked the entire PL.

If you're not willing to even consider a compatibility break (with a 
means to get the old behavior back) then I don't think there's any point 
in continuing this thread, because some of these issues can NOT be 
reasonably solved by a checker.
-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)

Re: [HACKERS] merging some features from plpgsql2 project

From

Pavel Stehule

Date:

03 January 2017, 20:51:55

2017-01-03 18:41 GMT+01:00 Jim Nasby <Jim.Nasby@bluetreble.com>:

On 1/3/17 11:19 AM, Pavel Stehule wrote:
2) There's no way to incrementally change those values for a
single
function. If you've set extra_errors = 'all' globally, a single
function can't say "turn off the too many rows setting for this
function".

We can enhance the GUC syntax like "all -too_many_rows,-xxx"

Why create all that framework when we could just have multiple
plpgsql.blah GUCs? plpgsql.multirow_assign_level=FATAL solves that
problem. We just need a plpgsql GUC for each backwards compatibility
break.

We have this framework already, so why don't use it.

We *don't* have a framework that works for this, because you can't incrementally modify extra_errors. Maybe extra_errors is an OK API for static checking, but it's definitely a BAD API for something you'd need to control at a function (or even statement) level.

I have different opinion then you - sure - it should not to change behave, it should to help with identification. And it is enough for this purpose.

If we never broke compatibility we'd still be allowing SELECT
without FROM, NULL = NULL being TRUE, and a whole bunch of other
problems. We'd also be stuck on protocol v1 (and of course not
talking about what we want in v4).

This was in dark age - how much users of plpgsql was in 2000? Hard to
speak about Postgres as mature software in this era.

I don't know about you' but I've considered Postgres to be mature since at least 8.0, if not earlier. Actually, in many ways it was far more mature than other databases I was using in 2000 (let alone 2007).

We've successfully made incompatible changes that were *far worse*
than this (ie: renaming pg_stat_activity.procpid). Obviously we
shouldn't be breaking things willy-nilly, but these are
long-standing warts (dare I say BUGS?) that should be fixed. They're
ugly enough that someone took the time to break plpgsql out of the
core code and fork it.

We are not talk about features that can be simply marked as bugs, so
there is not too much what we should to fix it. We should to help to
users to identify some possible risk places.

You keep claiming that these aren't serious bugs, yet someone felt so strongly that they ARE serious bugs that they forked the entire PL.

Sorry, but it it is subjective - and there can be different opinions - some body would to prefer more rigidity, some other less rigidity.

If you're not willing to even consider a compatibility break (with a means to get the old behavior back) then I don't think there's any point in continuing this thread, because some of these issues can NOT be reasonably solved by a checker.

yes, I don't would to consider about a compatibility break. I accept so you have different opinion.

I'll send this patch + doc to next commitfest - and depends on commiters if the patch will be rejected or not. I know so it should not be fully fixed, but it is step forward from my perspective.

Thank you for discussion

Regards

Pavel

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)

Re: [HACKERS] merging some features from plpgsql2 project

From

Merlin Moncure

Date:

03 January 2017, 22:54:03

On Tue, Jan 3, 2017 at 9:58 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
> 2017-01-03 16:23 GMT+01:00 Merlin Moncure <mmoncure@gmail.com>:
>> So -1 to strict mode, unless we can make a case why this can't be done
>> as part of checking/validation.
>
> Can be plpgsq.extra_errors and plpgsql.extra_warnings solution?
>
> I am thinking so there is a space for improvement (in extra_* usage)

extra_warnings seems ok at the GUC level.  However it's bad to have a
body of code fail to compile based on GUC.  check_function_bodies for
example is a complete hack and should be avoided if at all possible
IMO.  There is very good informal rule that GUC should not impact
behavior (minus some special cases like timeouts).   Good examples of
failure to follow this rule are mysql and php.

Maybe settings at level of extension could be ok, but I'm skeptical.
Good languages are clear without needing extra context.

> Do you know plpgsql_check https://github.com/okbob/plpgsql_check ?

Yes.  This is good design and should be model for core-work (if any).In my ideal world, this could would be part of
pgxnand to have pgxn

client be installed in core.   For plpgsql to enter modern era we need
standardized packaging and deployment like cran, npm, etc.

>> Other random points:
>> *) Another major pain point is swapping in the input variables for
>> debugging purposes.  Something that emits a script based on a set of
>> arguments would be wonderful.
>
> ???

Often for debugging of complicated cases I'm starting from errors in
database log with function name and argument values.  Sometimes I find
myself pasting pl/pgsql function into text editor and replacing input
variables with known values.

>>
>> *) Would also like to have a FINALLY block
>
> What you can do there?

This is syntax sugar so you don't need second begin/end/exception
block or duplicated code.  It separates error handling from cleanup.

BEGIN PERFORM dblink_connect(... <risky_stuff>
EXCEPTION WHEN OTHERS THEN <log/handle error>
FINALLY PERFORM dblink_disconnect(...
END;

>> *) Some user visible mechanic other than forcing SQL through EXECUTE
>> to be able to control plan caching would be useful.
>
> fully agree.
>
> Have you some ideas?
>
> What about plpgsql option (function scope) -- WITHOUT-PLAN-CACHE - any non
> trivial plans will not be cached - and evaluated as parametrized query only.

I have slight preference for syntax marker for each query, similar to
INTO.  Maybe 'UNCACHED'?

On Tue, Jan 3, 2017 at 10:57 AM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
> Or just fix the issue, provide the backwards compatability GUCs and move on.

I really don't think this will fly.  I'm not buying your argument (at
all) that compatibility breaks have have been cleanly done in the
past, at least not in the modern era.  In any event, marginal language
improvements are not a good justification to do it.   And yes, the
continual monkey around with column names in pg_stat_activity are a
major hassle.  For heaven's sake, can we just add new columns and/or
create a new view?

merlin

Re: [HACKERS] merging some features from plpgsql2 project

From

Pavel Stehule

Date:

03 January 2017, 23:15:45

2017-01-03 20:54 GMT+01:00 Merlin Moncure <mmoncure@gmail.com>:

On Tue, Jan 3, 2017 at 9:58 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
> 2017-01-03 16:23 GMT+01:00 Merlin Moncure <mmoncure@gmail.com>:
>> So -1 to strict mode, unless we can make a case why this can't be done
>> as part of checking/validation.
>
> Can be plpgsq.extra_errors and plpgsql.extra_warnings solution?
>
> I am thinking so there is a space for improvement (in extra_* usage)

extra_warnings seems ok at the GUC level. However it's bad to have a
body of code fail to compile based on GUC. check_function_bodies for
example is a complete hack and should be avoided if at all possible
IMO. There is very good informal rule that GUC should not impact
behavior (minus some special cases like timeouts). Good examples of
failure to follow this rule are mysql and php.

Maybe settings at level of extension could be ok, but I'm skeptical.
Good languages are clear without needing extra context.

> Do you know plpgsql_check https://github.com/okbob/plpgsql_check ?

Yes. This is good design and should be model for core-work (if any).
In my ideal world, this could would be part of pgxn and to have pgxn
client be installed in core. For plpgsql to enter modern era we need
standardized packaging and deployment like cran, npm, etc.

>> Other random points:
>> *) Another major pain point is swapping in the input variables for
>> debugging purposes. Something that emits a script based on a set of
>> arguments would be wonderful.
>
> ???

Often for debugging of complicated cases I'm starting from errors in
database log with function name and argument values. Sometimes I find
myself pasting pl/pgsql function into text editor and replacing input
variables with known values.

is it related to plpgsql debugger? Have not idea how it can be better on language level.

>>
>> *) Would also like to have a FINALLY block
>
> What you can do there?

This is syntax sugar so you don't need second begin/end/exception
block or duplicated code. It separates error handling from cleanup.

BEGIN
PERFORM dblink_connect(...
<risky_stuff>
EXCEPTION WHEN OTHERS THEN
<log/handle error>
FINALLY
PERFORM dblink_disconnect(...
END;

Does know somebody this pattern from Ada or PL/SQL?

>> *) Some user visible mechanic other than forcing SQL through EXECUTE
>> to be able to control plan caching would be useful.
>
> fully agree.
>
> Have you some ideas?
>
> What about plpgsql option (function scope) -- WITHOUT-PLAN-CACHE - any non
> trivial plans will not be cached - and evaluated as parametrized query only.

I have slight preference for syntax marker for each query, similar to
INTO. Maybe 'UNCACHED'?

I am not clean opinion - the statement level is nice, but what readability?

SELECT UNCACHED t.a, t.b FROM INTO a,b;

Regards

Pavel

On Tue, Jan 3, 2017 at 10:57 AM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
> Or just fix the issue, provide the backwards compatability GUCs and move on.

I really don't think this will fly. I'm not buying your argument (at
all) that compatibility breaks have have been cleanly done in the
past, at least not in the modern era. In any event, marginal language
improvements are not a good justification to do it. And yes, the
continual monkey around with column names in pg_stat_activity are a
major hassle. For heaven's sake, can we just add new columns and/or
create a new view?

merlin

Re: [HACKERS] merging some features from plpgsql2 project

From

Merlin Moncure

Date:

04 January 2017, 18:49:23

On Tue, Jan 3, 2017 at 2:15 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>
>
> 2017-01-03 20:54 GMT+01:00 Merlin Moncure <mmoncure@gmail.com>:
>>
>> On Tue, Jan 3, 2017 at 9:58 AM, Pavel Stehule <pavel.stehule@gmail.com>
>> wrote:
>> > 2017-01-03 16:23 GMT+01:00 Merlin Moncure <mmoncure@gmail.com>:
>> >> *) Would also like to have a FINALLY block
>> >
>> > What you can do there?
>>
>> This is syntax sugar so you don't need second begin/end/exception
>> block or duplicated code.  It separates error handling from cleanup.
>>
>> BEGIN
>>   PERFORM dblink_connect(...
>>   <risky_stuff>
>> EXCEPTION WHEN OTHERS THEN
>>   <log/handle error>
>> FINALLY
>>   PERFORM dblink_disconnect(...
>> END;
>
>
> Does know somebody this pattern from Ada or PL/SQL?

I guess probably not.  It's a standard pattern in modern EH languages
(for example, https://msdn.microsoft.com/en-us/library/dszsf989.aspx).

>>
>> >> *) Some user visible mechanic other than forcing SQL through EXECUTE
>> >> to be able to control plan caching would be useful.
>> >
>> > fully agree.
>> >
>> > Have you some ideas?
>> >
>> > What about plpgsql option (function scope) -- WITHOUT-PLAN-CACHE - any
>> > non
>> > trivial plans will not be cached - and evaluated as parametrized query
>> > only.
>>
>> I have slight preference for syntax marker for each query, similar to
>> INTO.  Maybe 'UNCACHED'?
>
>
> I am not clean opinion - the statement level is nice, but what readability?
>
> SELECT UNCACHED t.a, t.b FROM INTO a,b;

Yeah -- this is pretty ugly admittedly.  Maybe control directive is
ok, as long as you can set it mid function?

merlin

Re: [HACKERS] merging some features from plpgsql2 project

From

Pavel Stehule

Date:

04 January 2017, 19:07:06

2017-01-04 16:49 GMT+01:00 Merlin Moncure <mmoncure@gmail.com>:

On Tue, Jan 3, 2017 at 2:15 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>
>
> 2017-01-03 20:54 GMT+01:00 Merlin Moncure <mmoncure@gmail.com>:
>>
>> On Tue, Jan 3, 2017 at 9:58 AM, Pavel Stehule <pavel.stehule@gmail.com>
>> wrote:
>> > 2017-01-03 16:23 GMT+01:00 Merlin Moncure <mmoncure@gmail.com>:
>> >> *) Would also like to have a FINALLY block
>> >
>> > What you can do there?
>>
>> This is syntax sugar so you don't need second begin/end/exception
>> block or duplicated code. It separates error handling from cleanup.
>>
>> BEGIN
>> PERFORM dblink_connect(...
>> <risky_stuff>
>> EXCEPTION WHEN OTHERS THEN
>> <log/handle error>
>> FINALLY
>> PERFORM dblink_disconnect(...
>> END;
>
>
> Does know somebody this pattern from Ada or PL/SQL?

I guess probably not. It's a standard pattern in modern EH languages
(for example, https://msdn.microsoft.com/en-us/library/dszsf989.aspx).

>>
>> >> *) Some user visible mechanic other than forcing SQL through EXECUTE
>> >> to be able to control plan caching would be useful.
>> >
>> > fully agree.
>> >
>> > Have you some ideas?
>> >
>> > What about plpgsql option (function scope) -- WITHOUT-PLAN-CACHE - any
>> > non
>> > trivial plans will not be cached - and evaluated as parametrized query
>> > only.
>>
>> I have slight preference for syntax marker for each query, similar to
>> INTO. Maybe 'UNCACHED'?
>
>
> I am not clean opinion - the statement level is nice, but what readability?
>
> SELECT UNCACHED t.a, t.b FROM INTO a,b;

Yeah -- this is pretty ugly admittedly. Maybe control directive is
ok, as long as you can set it mid function?

ADA uses for this purpose PRAGMA keyword - it is used for everything in ADA - cycle iteration optimization, ...the scope can be statement, block, procedure.

so something like

BEGIN

PRAGMA uncached_plans;

SELECT ...

END;

But it should be verified by some PL/SQL or Ada experts

Regards

merlin

Re: [HACKERS] merging some features from plpgsql2 project

From

Pavel Stehule

Date:

04 January 2017, 19:09:52

> SELECT UNCACHED t.a, t.b FROM INTO a,b;

Yeah -- this is pretty ugly admittedly. Maybe control directive is
ok, as long as you can set it mid function?

ADA uses for this purpose PRAGMA keyword - it is used for everything in ADA - cycle iteration optimization, ...the scope can be statement, block, procedure.

Pragma is used for changing (enforcing) behave. There are pragmas ada_05, ada_2012, ..

so something like

BEGIN
PRAGMA uncached_plans;
SELECT ...
..
END;

But it should be verified by some PL/SQL or Ada experts

Regards

merlin

Re: [HACKERS] merging some features from plpgsql2 project

From

Pavel Stehule

Date:

04 January 2017, 22:00:24

>>
>> >> *) Some user visible mechanic other than forcing SQL through EXECUTE
>> >> to be able to control plan caching would be useful.
>> >
>> > fully agree.
>> >
>> > Have you some ideas?
>> >
>> > What about plpgsql option (function scope) -- WITHOUT-PLAN-CACHE - any
>> > non
>> > trivial plans will not be cached - and evaluated as parametrized query
>> > only.
>>
>> I have slight preference for syntax marker for each query, similar to
>> INTO. Maybe 'UNCACHED'?
>
>
> I am not clean opinion - the statement level is nice, but what readability?
>
> SELECT UNCACHED t.a, t.b FROM INTO a,b;

Yeah -- this is pretty ugly admittedly. Maybe control directive is
ok, as long as you can set it mid function?

ADA uses for this purpose PRAGMA keyword - it is used for everything in ADA - cycle iteration optimization, ...the scope can be statement, block, procedure.

so something like

BEGIN
PRAGMA uncached_plans;
SELECT ...
..
END;

But it should be verified by some PL/SQL or Ada experts

Little bit better - if PRAGMA is used in DECLARE part, then it has block scope

so some possible design can be:

DECLARE

PRAGMA UNCACHED_PLANS;

BEGIN

SELECT ... INTO ;

END;

This respects Ada and PL/SQL style - probably easy implementation

Regards

Pavel

Regards

merlin

Re: [HACKERS] merging some features from plpgsql2 project

From

Pavel Stehule

Date:

04 January 2017, 22:49:47

so some possible design can be:

DECLARE
PRAGMA UNCACHED_PLANS;
BEGIN
SELECT ... INTO ;
SELECT ... INTO ;
END;

This respects Ada and PL/SQL style - probably easy implementation

Regards

Pavel

some examples based on Ada doc

FUNCTION xxx RETURN int AS

PRAGMA yyy -- pragma has function scope

BEGIN

FUNCTION xxx RETURN int AS

BEGIN

DECLARE

PRAGMA yyy -- pragma has block scope

Regards

Pavel

Regards

merlin

Re: [HACKERS] merging some features from plpgsql2 project

From

Merlin Moncure

Date:

05 January 2017, 00:07:09

On Wed, Jan 4, 2017 at 1:49 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>
>>
>> so some possible design can be:
>>
>> DECLARE
>>   PRAGMA UNCACHED_PLANS;
>> BEGIN
>>   SELECT ... INTO ;
>>   SELECT ... INTO ;
>> END;
>>
>> This respects Ada and PL/SQL style - probably easy implementation
>>
>> Regards
>>
>> Pavel
>
>
> some examples based on Ada doc
>
> FUNCTION xxx RETURN int AS
>   PRAGMA yyy -- pragma has function scope
> BEGIN
>
> FUNCTION xxx RETURN int AS
> BEGIN
>   DECLARE
>     PRAGMA yyy -- pragma has block scope

ok, sub-block makes sense over statement level IMO.

merlin

Re: [HACKERS] merging some features from plpgsql2 project

From

Robert Haas

Date:

05 January 2017, 20:03:06

On Wed, Dec 28, 2016 at 2:25 PM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
> That's my whole point of why this needs to be settable at a global level: so
> that people with a lot of legacy code can set the OLD behavior at a global
> level, and deal with the old code over time.

This has the same problem being discussed nearby on the case-folding
thread, though: any extension or third-party tool has to either work
with every possible value, or else it has to require one particular
value and therefore not be usable if you need another value for some
other reason.

Now, that's not to say we should never break backward compatibility.
Sometimes we should.  I think the problem with PL/pgsql is that many
of the compatibility breaks that people want are likely to lead to
subtle misbehavior rather than outright failure, or are not easy to
spot via a cursory look ("hmm, could that SELECT query ever return
more than one row?").  Also, while everybody agrees that a bunch of
things should be changed and improved, not everybody agrees about
which ones, and sometimes person A desperately wants X changed while
person B desperately wants it changed in the other direction or left
alone.  If there were a set of changes that we could make all at once,
call the result plpgsql2 or nplpgsql or whatever, and make everybody
happy, that'd be fabulous, but we don't.  So we're left with doing
nothing, or having 2^n language variants controlled by GUCs or
pragmas, neither of which is appealing.

I think it would be a good idea to lock all the people who really care
about PL/pgsql in a room until they agree on what changes should be
made for the next version of the language.  If they don't agree
quickly enough, we can resort to the techniques described in
https://en.wikipedia.org/wiki/Papal_election,_1268%E2%80%9371

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: [HACKERS] merging some features from plpgsql2 project

From

Merlin Moncure

Date:

05 January 2017, 20:36:21

On Thu, Jan 5, 2017 at 11:03 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> Now, that's not to say we should never break backward compatibility.
> Sometimes we should.  I think the problem with PL/pgsql is that many
> of the compatibility breaks that people want are likely to lead to
> subtle misbehavior rather than outright failure, or are not easy to
> spot via a cursory look ("hmm, could that SELECT query ever return
> more than one row?").

The core issue is that developers tend to be very poor at estimating
the impacts of changes; they look at things the the lens of the "new".
Professional software development is quite expensive and framework-
(I'll lump the database and it's various built-in features under that
term) level changes are essentially throwing out some portion of our
user's investments.  Even fairly innocent compatibility breaks can
have major downstream impacts on our users and it's always much worse
than expected.  For example, nobody thought that changing the bytea
text encoding format to hex would have corrupted our user's data, but
it did.

TBH, the discussion should shift away from specific issues on
compatibility and towards a specific set of standards and policies
around how to do it and what kinds of technical justifications need to
be made in advance.  Security problems for example could be argued as
a valid reason to break user code, or poor adherence to the the SQL
standard which are in turn blocking other content.  Minus those kinds
of considerations it's really just not worth doing, and there's no
tricky strategy like playing with version numbers that can game that
rule.  A formal deprecation policy might be a good start.

The C language really should be considered the gold standard here.
Changes did have to be made, like getting rid of the notoriously
broken and insecure gets(), but they were made very, very slowly and
unobtrusively.

(I do think lpad should except "any" FWIW) :-D

merlin

Re: [HACKERS] merging some features from plpgsql2 project

From

Pavel Stehule

Date:

06 January 2017, 09:28:22

2017-01-05 18:36 GMT+01:00 Merlin Moncure <mmoncure@gmail.com>:

On Thu, Jan 5, 2017 at 11:03 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> Now, that's not to say we should never break backward compatibility.
> Sometimes we should. I think the problem with PL/pgsql is that many
> of the compatibility breaks that people want are likely to lead to
> subtle misbehavior rather than outright failure, or are not easy to
> spot via a cursory look ("hmm, could that SELECT query ever return
> more than one row?").

The core issue is that developers tend to be very poor at estimating
the impacts of changes; they look at things the the lens of the "new".
Professional software development is quite expensive and framework-
(I'll lump the database and it's various built-in features under that
term) level changes are essentially throwing out some portion of our
user's investments. Even fairly innocent compatibility breaks can
have major downstream impacts on our users and it's always much worse
than expected. For example, nobody thought that changing the bytea
text encoding format to hex would have corrupted our user's data, but
it did.

TBH, the discussion should shift away from specific issues on
compatibility and towards a specific set of standards and policies
around how to do it and what kinds of technical justifications need to
be made in advance. Security problems for example could be argued as
a valid reason to break user code, or poor adherence to the the SQL
standard which are in turn blocking other content. Minus those kinds
of considerations it's really just not worth doing, and there's no
tricky strategy like playing with version numbers that can game that
rule. A formal deprecation policy might be a good start.

The C language really should be considered the gold standard here.
Changes did have to be made, like getting rid of the notoriously
broken and insecure gets(), but they were made very, very slowly and
unobtrusively.

(I do think lpad should except "any" FWIW) :-D

I fully agree - sometimes there is fuzzy border in understanding what is bug and what unhappy designed feature - probably lot of useful features can be taken and used wrong.

Regards

Pavel

merlin

Re: [HACKERS] merging some features from plpgsql2 project

From

Pavel Stehule

Date:

06 January 2017, 21:36:54

>
> some examples based on Ada doc
>
> FUNCTION xxx RETURN int AS
> PRAGMA yyy -- pragma has function scope
> BEGIN
>
> FUNCTION xxx RETURN int AS
> BEGIN
> DECLARE
> PRAGMA yyy -- pragma has block scope

ok, sub-block makes sense over statement level IMO.

I am sending proof concept (parser only implementation) - it allows to control query plan usage on function and on block level

Examples

CREATE OR REPLACE FUNCTION fx()

RETURNS int AS $$

PRAGMA use_query_plan_cache(off); -- disable query plan cache on function level

DECLARE r record;

BEGIN

FOR r IN SELECT ... -- some complex query, where we prefer on one shot plan

LOOP

DECLARE

PRAGMA use_query_plan_cache(on); -- enable query plan cache for block

BEGIN

... statements inside cycle reuses query plan

END;

END LOOP;

END;

$$ LANGUAGE plpgsql;

BEGIN

...

DECLARE

PRAGMA use_query_plan_cache(off);

BEGIN

-- these queries has fresh plan only

SELECT ...

END; -- end of PRAGMA scope

...

-- usual behave

END;

The behave is static - controlled on compile time only - the controlled feature can be enabled/disabled. The impact on runtime is zero

* the syntax is verbose - readable - I prefer strong clean signal for readers so something internals is different

* consistent with Ada, PL/SQL

* remove one reason for dynamic SQL

* allows to mix queries with without query plan cache - interesting for patter FOR IN slow query LOOP fast query; END LOOP;

* there is small risk of compatibility break - if somebody use variables named PRAGMA, because new reserved keyword is necessary - fails on syntax error - so it is easy identified.

* this syntax can be reused - autonomous_transaction like PL/SQL. I read a manual of Gnu Ada - and this is used often for implementation legacy (obsolete) behave, functionality.

Notes, comments?

Regards

Pavel

Attachment

pragma_query_plan_cache.patch

Re: [HACKERS] merging some features from plpgsql2 project

From

Joel Jacobson

Date:

07 January 2017, 02:42:21

On Thu, Jan 5, 2017 at 7:03 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>
> I think it would be a good idea to lock all the people who really care
> about PL/pgsql in a room until they agree on what changes should be
> made for the next version of the language.  If they don't agree
> quickly enough, we can resort to the techniques described in
> https://en.wikipedia.org/wiki/Papal_election,_1268%E2%80%9371

I think that's a very good idea, and I'm happy to be locked into such a room.

I think such a discussion will be very fruitful,
given the others in the room have also
already decided they want a new language
and are there to discuss "the next version of the language",
instead of debating why they don't think we need a new language.

It would also be good if those people could bring laptops
with all their plpgsql code bases, to check if any of
the proposed possibly non-backwards compatible
syntax proposals would break nothing, just a few functions,
or a lot of functions in their code bases.

Re: [HACKERS] merging some features from plpgsql2 project

From

Pavel Stehule

Date:

07 January 2017, 11:06:55

2017-01-03 17:57 GMT+01:00 Jim Nasby <Jim.Nasby@bluetreble.com>:

On 1/2/17 1:51 PM, Pavel Stehule wrote:
1) Neither is enabled by default, so 90% of users have no idea they
exist. Obviously that's an easy enough fix, but...

We can strongly talk about it - there can be a chapter in plpgsql doc.
Now, the patterns and antipatterns are not officially documented.

Or just fix the issue, provide the backwards compatability GUCs and move on.

2) There's no way to incrementally change those values for a single
function. If you've set extra_errors = 'all' globally, a single
function can't say "turn off the too many rows setting for this
function".

We can enhance the GUC syntax like "all -too_many_rows,-xxx"

Why create all that framework when we could just have multiple plpgsql.blah GUCs? plpgsql.multirow_assign_level=FATAL solves that problem. We just need a plpgsql GUC for each backwards compatibility break.

BTW, while I can see value in being able to change these settings
for an entire function, I think the recommended use should be to
only change them for a specific statement.

What you can do in plain assign statement

target := expression ?

The point I was trying to make there is if you do have some cases where you need to silently ignore extra rows (for example) it's probably only one statement and not an entire function. That said, if we just make these options GUCs then you can just do SET and RESET.

My border is any compatibility break - and I would not to across it.
First issue is probably harder

If we never broke compatibility we'd still be allowing SELECT without FROM, NULL = NULL being TRUE, and a whole bunch of other problems. We'd also be stuck on protocol v1 (and of course not talking about what we want in v4).

We've successfully made incompatible changes that were *far worse* than this (ie: renaming pg_stat_activity.procpid). Obviously we shouldn't be breaking things willy-nilly, but these are long-standing warts (dare I say BUGS?) that should be fixed. They're ugly enough that someone took the time to break plpgsql out of the core code and fork it.

The discussion about changing behave of current features has not a solution. I don't believe so there is possible to find a win/win solution - it is not possible. I respect the opinion all people here, but somebody more afraid of language fragmentation, somebody else more from some possible production issues. All arguments are valid, all arguments has same value, all arguments are sent by people with lot of experience (but with different experience - anybody works with different domains, uses different patters, hits different kind of errors).

This discussion was +/- about behave of INTO clause - and if STRICT clause should be default and if STRICT clause should be more strict.

if we introduce new pattern (new syntax), that can be strict enough then we can go forward. New syntax will not have a impact on current customers code base. If somebody prefer very strict behave, then he can use new syntax quickly. Issues in old code can be detected by other tools - plpgsql_check, and extra_warnings, extra_errors.

Current state

==========

INTO

-------

* is not strict in any directions - columns or rows

* not equal target and source column number cannot be detected in plpgsql (plpgsql_check does it)

* missing rows - uses FOUND (PL/SQL), too much rows - uses GET DIAGNOSTICS x = ROW_COUNT (ANSI/SQL pattern)

* doesn't need outer subtransaction (handled exception) for handling specific situations

select a into t from test where a = i;

get diagnostics r = row_count;

if r > 1 then raise exception 'too much rows'; end if;

INTO STRICT

------------------

* is strict in rows - require exactly one row result

* not equal target and source column number cannot be detected in plpgsql (plpgsql_check does it)

* raises a exceptions no_data_found, too_much_rows

* require outer subtransaction (handled exception) for handling no_data_found (10% slowdown in worst case)

begin

select a into t from test where a = i;

exception

when no_data_found then

t = null;

end;

There is safe pattern (workaround) for missing columns in source - using RECORD type. Access to missing field raises runtime error. Another solution - plpgsql_check.

subselect assignment

-----------------------------

* is column strict - doesn't allow more columns now

* don't allow too_much_rows

* the FOUND is always true - has not change to check it

* although it is ANSI/SQL pattern it is undocumented in PostgreSQL (only INTO clause is documented)

x := (select a from test where a = i)

officially unsupported implementation side effect assignment FROM

------------------------------------------------------------------------------------------

* is undocumented

* allows only one column

* behave is similar like INTO

x := a from test where a = i

Possible risk

============

* typo SELECT a b INTO a,b or SELECT a,b INTO b,a - currently I use a RECORD type to be safe

* undetermined result in more rows

* unknown result in no_data_found

The questions - what we expect from new behave?

=======================================

* safety against typo - 100% yes

* safety against undetermined result - 100% yes

* what behave should be default in no_data_found case ? Exception or NULL? - both has advantages for who require designed behave and disadvantage for require second behave (a) handling the exception, b) IF NOT FOUND test)

Possible solution (without compatibility break)

===================================

introduction ANSI/SQL assingnment, multiassignment SET

------------------------------------------------------------------------------

SET varname = (SELECT ...)

SET (t1, t2, t3, ...) = (SELECT c1, c2, c3 ..)

- possible identifier collision with GUC (not in multiassignment variant)

- is not too robust against typo SET (t1, t2,..) = (SELECT t2,t1, ...)

- it is ANSI/SQL design/behave

introduction new syntax proposed by plpgsql2 project

-----------------------------------------------------------------------

SELECT t1 := c1, t2 := c2, ...

- it can be PostgreSQL specific syntax - full control over design

- maximally robust against typo

- long syntax, but for short syntax can be used SELECT c1,c2,c3, .. INTO STRICT recvar

- what should be no_data_found behave?

I have nothing about a cost of "new syntax" implementation - but for me - it looks like good solution for us - it can be win/win solution. It breaks nothing - it introduce nice to have typo robust syntax.

Comments, notes?

Regards

Pavel

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)

Re: [HACKERS] merging some features from plpgsql2 project

From

Pavel Stehule

Date:

07 January 2017, 14:39:39

* EXECUTE and FOUND - this is incompatible change, extra check can be used (test on unset variable). I see solution in leaving FOUND variable and introduction of some new without this issue - ROW_COUNT maybe (this is another possible incompatible change, but with higher benefit - maybe we can introduce some aliasing, PRAGMA clause, default PRAGMAs, ..).

I checked current implementation of FOUND variable. If we introduce new auto variable ROW_COUNT - exactly like FOUND, then it doesn't introduce any compatibility break.

ROW_COUNT .. shortcut for GET DIAGNOSTICS row_count = ROW_COUNT.

Comments, notes?

Regards

Pavel

Re: [HACKERS] merging some features from plpgsql2 project

From

Jim Nasby

Date:

08 January 2017, 05:27:15

On 1/5/17 11:36 AM, Merlin Moncure wrote:
> The C language really should be considered the gold standard here.
> Changes did have to be made, like getting rid of the notoriously
> broken and insecure gets(), but they were made very, very slowly and
> unobtrusively.

For those not familiar... how did they accomplish that?

I'm certainly fine with changes being made very slowly. We carried the 
missing_From GUC around for like a decade before ditching it. We have 
other GUCs that have defaulted to not allowing silly behavior for a long 
time as well. We might well need to leave the default for compatibility 
GUCs set to current behavior for several more releases, to allow people 
to start specifying which behavior they want.

I agree with Robert that there needs to be consensus that a change needs 
to be made, but frankly I think 50% of this thread was people 
disagreeing with *ANY* change that would be incompatible. IMHO that's a 
ridiculous position that does not match expectations outside of plpgsql. 
That kind of expectation means we have absolutely no way of fixing past 
mistakes.

Certainly, there also needs to be agreement on what the new behavior 
should be, but again, what I observed was an adamant insistence that 
absolutely no break would be permitted.

As for using GUCs for these changes and that impact on extensions, I 
don't see why that won't work for what we're discussing here. In a 
worst-case scenario, extension authors would need to specify what 
behavior they wanted in their extensions instead of blindly accepting 
the default, by making sure those options were set for each function 
they defined. While it would certainly be nice to avoid that extra work, 
all the necessary infrastructure to handle that is already in place. And 
if we wanted to avoid that hassle, we could allow custom GUC settings on 
extensions, like we currently do for roles and databases.
-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)

Re: [HACKERS] merging some features from plpgsql2 project

From

Jim Nasby

Date:

08 January 2017, 05:31:33

On 1/7/17 5:39 AM, Pavel Stehule wrote:
>
> I checked current implementation of FOUND variable. If we introduce new
> auto variable ROW_COUNT - exactly like FOUND, then it doesn't introduce
> any compatibility break.

Except it would break every piece of code that had a row_count variable, 
though I guess you could see which scoping level the variable had been 
defined in.

I think the right solution in this case is to replace GET DIAGNOSTICs 
with something easier to use, but I'm not sure what that would be.

I think this is another example of where not using some kind of 
character to distinguish variables screws us. :/
-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)

Re: [HACKERS] merging some features from plpgsql2 project

From

Jim Nasby

Date:

08 January 2017, 05:39:24

On 1/7/17 2:06 AM, Pavel Stehule wrote:
>
>     SELECT t1 := c1, t2 := c2, ...
>
> - it can be PostgreSQL specific syntax - full control over design
> - maximally robust against typo
> - long syntax, but for short syntax can be used SELECT c1,c2,c3, .. INTO
> STRICT recvar

I don't think overloading a SQL command is a good idea. We'd be in 
trouble if ANSI ever introduced :=. I think that could also conflict 
with existing operators.

> - what should be no_data_found behave?

Considering where we're at today, I don't think there should be a 
default behavior; make the user specify somehow whether missing data is 
allowed or not.

> I have nothing about a cost of "new syntax" implementation - but for  me
> - it looks like good solution for us - it can be win/win solution. It
> breaks nothing - it introduce nice to have typo robust syntax.

Related to that, I suspect we could add better support to existing 
commands for at least some of these things. For example, SELECT ... INTO 
NOMULTI (instead of STRICT) to indicate that multiple rows are an error 
but missing data is OK.
-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)

Re: [HACKERS] merging some features from plpgsql2 project

From

Tom Lane

Date:

08 January 2017, 05:53:57

Jim Nasby <Jim.Nasby@BlueTreble.com> writes:
> On 1/7/17 5:39 AM, Pavel Stehule wrote:
>> I checked current implementation of FOUND variable. If we introduce new
>> auto variable ROW_COUNT - exactly like FOUND, then it doesn't introduce
>> any compatibility break.

> Except it would break every piece of code that had a row_count variable, 
> though I guess you could see which scoping level the variable had been 
> defined in.

If FOUND were declared at an outer scoping level such that any
user-created declaration overrode the name, then we could do likewise
for other auto variables and not fear compatibility breaks.

Currently, though, we don't seem to be quite there: it looks like
FOUND is an outer variable with respect to DECLARE blocks, but it's
more closely nested than parameter names.  Compare:

regression=# create function foo1(bool) returns bool as
'declare found bool := $1; begin return found; end' language plpgsql;
CREATE FUNCTION
regression=# select foo1(true);foo1 
------t
(1 row)

regression=# create function foo2(found bool) returns bool as
regression-# 'begin return found; end' language plpgsql;
CREATE FUNCTION
regression=# select foo2(true);foo2 
------f
(1 row)

Not sure if changing this would be a good thing or not --- was
there reasoning behind this behavior, or was it just accidental?
        regards, tom lane

Re: [HACKERS] merging some features from plpgsql2 project

From

Jim Nasby

Date:

08 January 2017, 06:11:54

On 1/7/17 8:53 PM, Tom Lane wrote:
> If FOUND were declared at an outer scoping level such that any
> user-created declaration overrode the name, then we could do likewise
> for other auto variables and not fear compatibility breaks.
>
> Currently, though, we don't seem to be quite there: it looks like
> FOUND is an outer variable with respect to DECLARE blocks, but it's
> more closely nested than parameter names.

Sorry, I'm not following... you can override a parameter name the same 
way and get the same behavior, no?

BTW, I do wish you could change the label of the scope that arguments 
went into, so that you could use that label to refer to function 
parameters. If we allowed that it'd perhaps be the best of both worlds: 
you'd be guaranteed access to all auto variables and parameters, and 
that access wouldn't need to be tied to the function name (which can be 
both painful and error prone).
-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)

Re: [HACKERS] merging some features from plpgsql2 project

From

Pavel Stehule

Date:

08 January 2017, 08:19:50

2017-01-08 3:53 GMT+01:00 Tom Lane <tgl@sss.pgh.pa.us>:

Jim Nasby <Jim.Nasby@BlueTreble.com> writes:
> On 1/7/17 5:39 AM, Pavel Stehule wrote:
>> I checked current implementation of FOUND variable. If we introduce new
>> auto variable ROW_COUNT - exactly like FOUND, then it doesn't introduce
>> any compatibility break.

> Except it would break every piece of code that had a row_count variable,
> though I guess you could see which scoping level the variable had been
> defined in.

If FOUND were declared at an outer scoping level such that any
user-created declaration overrode the name, then we could do likewise
for other auto variables and not fear compatibility breaks.

Currently, though, we don't seem to be quite there: it looks like
FOUND is an outer variable with respect to DECLARE blocks, but it's
more closely nested than parameter names. Compare:

regression=# create function foo1(bool) returns bool as
'declare found bool := $1; begin return found; end' language plpgsql;
CREATE FUNCTION
regression=# select foo1(true);
foo1
------
t
(1 row)

regression=# create function foo2(found bool) returns bool as
regression-# 'begin return found; end' language plpgsql;
CREATE FUNCTION
regression=# select foo2(true);
foo2
------
f
(1 row)

Not sure if changing this would be a good thing or not --- was
there reasoning behind this behavior, or was it just accidental?

There are two related features in plpgsql2 project:

1. dynamic SQL sets FOUND variable

2. direct access to processed rows info via variable ROW_COUNT

@1 is incompatible change, @2 is good enough - so we should not to change FOUND, but we can propagate ROW_COUNT instead.

Regards

Pavel

regards, tom lane

Re: [HACKERS] merging some features from plpgsql2 project

From

Pavel Stehule

Date:

08 January 2017, 08:26:01

2017-01-08 4:11 GMT+01:00 Jim Nasby <Jim.Nasby@bluetreble.com>:

On 1/7/17 8:53 PM, Tom Lane wrote:
If FOUND were declared at an outer scoping level such that any
user-created declaration overrode the name, then we could do likewise
for other auto variables and not fear compatibility breaks.

Currently, though, we don't seem to be quite there: it looks like
FOUND is an outer variable with respect to DECLARE blocks, but it's
more closely nested than parameter names.

Sorry, I'm not following... you can override a parameter name the same way and get the same behavior, no?

It is declared before any custom identifier. If you override it, then you are working with own variable - not with auto variable.

BTW, I do wish you could change the label of the scope that arguments went into, so that you could use that label to refer to function parameters. If we allowed that it'd perhaps be the best of both worlds: you'd be guaranteed access to all auto variables and parameters, and that access wouldn't need to be tied to the function name (which can be both painful and error prone).

We can talk about compiler directive.

PRAGMA auto_variables_label(xxxx) -- require function scope only

BEGIN

IF xxxx.FOUND THEN

Regards

Pavel

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)

Re: [HACKERS] merging some features from plpgsql2 project

From

Pavel Stehule

Date:

08 January 2017, 08:44:42

2017-01-08 3:39 GMT+01:00 Jim Nasby <Jim.Nasby@bluetreble.com>:

On 1/7/17 2:06 AM, Pavel Stehule wrote:

SELECT t1 := c1, t2 := c2, ...

- it can be PostgreSQL specific syntax - full control over design
- maximally robust against typo
- long syntax, but for short syntax can be used SELECT c1,c2,c3, .. INTO
STRICT recvar

I don't think overloading a SQL command is a good idea. We'd be in trouble if ANSI ever introduced :=. I think that could also conflict with existing operators.

The ":=" operator is used ANSI/SQL already for named arguments. Isn't probable so ANSI uses it in different context.

This is not overloading of SQL command - it is like annotations. It is smart idea, so I was not surprised if ANSI/SQL reuses it. There is not any possible construct, that is robust against typo - because assignment is very verbose and natural.

ANSI - SQL/PSM uses two methods

1. multiassignment

SET (a,b,c) = (SELECT a, b, c ...)

2. auto variables in dedicated new scope

FOR scope_label IN SELECT a, b, c

-- you can use variables a, b, c

-- you can use qualified variables scope_label.a, scope_label.b, ..

END FOR

This method is not possible in PL/pgSQL - but a work with record type is similar

- what should be no_data_found behave?

Considering where we're at today, I don't think there should be a default behavior; make the user specify somehow whether missing data is allowed or not.

I have nothing about a cost of "new syntax" implementation - but for me
- it looks like good solution for us - it can be win/win solution. It
breaks nothing - it introduce nice to have typo robust syntax.

Related to that, I suspect we could add better support to existing commands for at least some of these things. For example, SELECT ... INTO NOMULTI (instead of STRICT) to indicate that multiple rows are an error but missing data is

Another flag into NOMULTI can be solution too.

The new syntax ":=" has some advantages:

1. it robust against type - it is harder to do unwanted swap of variables, and this mistake is very clear

2. the syntax ensure equality of target variables and source expressions.

I see valuable benefit of this syntax

Regards

Pavel

OK.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)

Re: [HACKERS] merging some features from plpgsql2 project

From

Pavel Stehule

Date:

08 January 2017, 08:45:46

2017-01-08 3:31 GMT+01:00 Jim Nasby <Jim.Nasby@bluetreble.com>:

On 1/7/17 5:39 AM, Pavel Stehule wrote:

I checked current implementation of FOUND variable. If we introduce new
auto variable ROW_COUNT - exactly like FOUND, then it doesn't introduce
any compatibility break.

Except it would break every piece of code that had a row_count variable, though I guess you could see which scoping level the variable had been defined in.

I think the right solution in this case is to replace GET DIAGNOSTICs with something easier to use, but I'm not sure what that would be.

I invite any ideas?

Regards

Pavel

I think this is another example of where not using some kind of character to distinguish variables screws us. :/

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)

Re: [HACKERS] merging some features from plpgsql2 project

From

Pavel Stehule

Date:

08 January 2017, 09:03:51

BTW, I do wish you could change the label of the scope that arguments went into, so that you could use that label to refer to function parameters. If we allowed that it'd perhaps be the best of both worlds: you'd be guaranteed access to all auto variables and parameters, and that access wouldn't need to be tied to the function name (which can be both painful and error prone).

We can talk about compiler directive.

PRAGMA auto_variables_label(xxxx) -- require function scope only

If we know a list of all auto variables, then it can be on function or block level - it can create aliases.

Regards

Pavel

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)

Re: [HACKERS] merging some features from plpgsql2 project

From

Pavel Stehule

Date:

08 January 2017, 09:56:17

2017-01-08 3:27 GMT+01:00 Jim Nasby <Jim.Nasby@bluetreble.com>:

On 1/5/17 11:36 AM, Merlin Moncure wrote:
The C language really should be considered the gold standard here.
Changes did have to be made, like getting rid of the notoriously
broken and insecure gets(), but they were made very, very slowly and
unobtrusively.

For those not familiar... how did they accomplish that?

I'm certainly fine with changes being made very slowly. We carried the missing_From GUC around for like a decade before ditching it. We have other GUCs that have defaulted to not allowing silly behavior for a long time as well. We might well need to leave the default for compatibility GUCs set to current behavior for several more releases, to allow people to start specifying which behavior they want.

I agree with Robert that there needs to be consensus that a change needs to be made, but frankly I think 50% of this thread was people disagreeing with *ANY* change that would be incompatible. IMHO that's a ridiculous position that does not match expectations outside of plpgsql. That kind of expectation means we have absolutely no way of fixing past mistakes.

Certainly, there also needs to be agreement on what the new behavior should be, but again, what I observed was an adamant insistence that absolutely no break would be permitted.

As for using GUCs for these changes and that impact on extensions, I don't see why that won't work for what we're discussing here. In a worst-case scenario, extension authors would need to specify what behavior they wanted in their extensions instead of blindly accepting the default, by making sure those options were set for each function they defined. While it would certainly be nice to avoid that extra work, all the necessary infrastructure to handle that is already in place. And if we wanted to avoid that hassle, we could allow custom GUC settings on extensions, like we currently do for roles and databases.

The discussion related to plpgsql2, future development of plpgsql has more levels, topics

1. incompatible changes - INTO, INTO STRICT, FOUND - there is a agreement so current behave is not ideal for all cases, but there is not a agreement so it broken and should be "fixed" - GUC doesn't helps here.

2. new features - the question was "how much we would to move PL/pgSQL from verbose Ada language to different place - convention against configuration principle", and what (if) conventions should be default. GUC can partially helps.

I still hope so there is some space for finding a partial agreement - and we can do some evolution steps forward.

I would not to use GUC like "We cannot to find a agreement, so we use GUC and somebody will use this feature, some one not" - it is not way how to do things better long time.

Jim, Marko, Joel - is there a place, features where we can find a partial agreement? If it is, then we can move our view there.

Regards

Pavel

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)

Re: [HACKERS] merging some features from plpgsql2 project

From

Pavel Stehule

Date:

08 January 2017, 09:57:16

Related to that, I suspect we could add better support to existing commands for at least some of these things. For example, SELECT ... INTO NOMULTI (instead of STRICT) to indicate that multiple rows are an error but missing data is

Another flag into NOMULTI can be solution too.

The new syntax ":=" has some advantages:

1. it robust against type - it is harder to do unwanted swap of variables, and this mistake is very clear

should be "against typo", sorry

Pavel

2. the syntax ensure equality of target variables and source expressions.

Re: [HACKERS] merging some features from plpgsql2 project

From

Joel Jacobson

Date:

08 January 2017, 11:52:09

On Sat, Jan 7, 2017 at 8:56 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>
> Jim, Marko, Joel - is there a place, features where we can find a partial agreement? If it is, then we can move our
viewthere.

I have decided I definitively want a new language, and I'm willing to
pay for it.

Hopefully the community will join forces and contribute with ideas and
code, but with or without you or the rest of the community, plpgsql2
is going to happen.
Call it pltrustly or plpgsql2, I don't care. I just care about ending
my suffering from being forced writing plpgsql every day. It sucks,
and I'm going to end it.

I'm just too fed up with the annoyances of plpgsql. I cannot care less
about _hypothetical_ incompatibility problems,
I think your arguments "this is like Perl6 or Python3" are delusional.
You can easily intermix plpgsql and plpgsql2 in the same
"application", something you cannot do with Perl6 or Python3. So
please stop using that as an argument.

If anyone has an application where the hypothetical incompatibility
problems would be a problem, then just continue to use plpgsql.

And please kill all these GUCs ideas. The best thing with PostgreSQL
is the natural expected behaviour of the default configuration.
Contrary to MySQL where you have to enable lots and lots of
configuration options just to get a behaviour you expect as a novice
user.

It's much better to just come together and agree on whatever we have
learned during the last 15 years of PL/pgSQL1, and sample all ideas
during a year maybe, and decide what to put into PL/pgSQL2. To make it
useful, we should aim to not break compatibility for _most_ code, but
accept some necessary rewrites of functions with deprecated
anti-patterns.

I'm even willing to suggest it might be a good idea to first try out
PL/pgSQL2 at Trustly, and after a year of usage, report back to the
community of our findings on how well it worked out for us, to allow
all others to learn from our mistakes during our first year of using
the new language. That way less people and companies will have to
suffer when we discover what we got wrong in what we thought would
work out well for us.

During the same trial period maybe your company Pavel and others can
try out their ideas of a PL/pgSQL2 and implement it, see how it works
out for you, and then report back to the community on your findings
from production environments.

That way we can avoid all these hypothetical discussions on what will
be good or bad without having any empirical evidence at hand.

Re: [HACKERS] merging some features from plpgsql2 project

From

Jim Nasby

Date:

09 January 2017, 02:37:15

On 1/8/17 2:52 AM, Joel Jacobson wrote:
> And please kill all these GUCs ideas. The best thing with PostgreSQL
> is the natural expected behaviour of the default configuration.
> Contrary to MySQL where you have to enable lots and lots of
> configuration options just to get a behaviour you expect as a novice
> user.

The only reason to use GUCs or some other kind of backwards 
compatibility setting would be to allow the current plpgsql itself to 
move forwards. If you think that's a dead end (which I can certainly 
understand) then they make no sense at all.

> It's much better to just come together and agree on whatever we have
> learned during the last 15 years of PL/pgSQL1, and sample all ideas
> during a year maybe, and decide what to put into PL/pgSQL2. To make it
> useful, we should aim to not break compatibility for _most_ code, but
> accept some necessary rewrites of functions with deprecated
> anti-patterns.

If we're going to create a brand new language then I think it would be 
extremely foolish to keep *any* of the current pain points around. Off 
the top of my head:

- variables must have an identifier (what $ in most languages does). The 
steps you have to go through to avoid simple naming collisions are insane.

- Support for composite types needs to be stronger. Off the top of my 
head, you need to be able to reference an element name via a variable. 
OR, maybe it'd be better to just provide a plpgsql equivalent to a dict.

- GET DIAGNOSTICS and their ilk need to die. There needs to be an easier 
way to get that kind of info back (perhaps via an automatic 
composite/record/dict).

- There needs to be real support for dealing with exceptions. IE: get a 
composite of all exception deatils, modify parts of it, then re-raise 
with the new info.

- Real support for using variables as identifiers / nothing restricted 
to only accepting a Const.

- Support for the notion of a variable being unset (which is NOT the 
same thing as NULL).


That said, I'll bet we still get some of that wrong, so there better be 
some way of fixing those issues down the road...
-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)

Re: [HACKERS] merging some features from plpgsql2 project

From

Jim Nasby

Date:

09 January 2017, 02:39:42

On 1/7/17 11:44 PM, Pavel Stehule wrote:
> This is not overloading of SQL command - it is like annotations. It is
> smart idea, so I was not surprised if ANSI/SQL reuses it.

SHas ANSI declared that they will NEVER support := in a SELECT that's 
not running in a stored function? Because if they haven't done that, 
there's nothing preventing them from doing just that. If that happens 
we're going to have some very difficult choices to make.
-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)

Re: [HACKERS] merging some features from plpgsql2 project

From

Jim Nasby

Date:

09 January 2017, 03:10:19

On 1/8/17 12:03 AM, Pavel Stehule wrote:
>         BTW, I do wish you could change the label of the scope that
>         arguments went into, so that you could use that label to refer
>         to function parameters. If we allowed that it'd perhaps be the
>         best of both worlds: you'd be guaranteed access to all auto
>         variables and parameters, and that access wouldn't need to be
>         tied to the function name (which can be both painful and error
>         prone).
>
>
>     We can talk about compiler directive.
>
>     PRAGMA auto_variables_label(xxxx) -- require function scope only
>
>
> If we know a list of all auto variables, then it can be on function or
> block level - it can create aliases.

Oh, the problem is that if you have an argument with the same name as an 
auto variable you're in trouble.

Probably the easiest thing is to have a scope that sits above the scope 
containing the arguments, and then allow the user to rename both scopes 
if so desired. So in effect you'd end up with

<<plpgsql>> -- new scope
DECLARE
FOUND;
etc
BEGIN  <<function_name>>  DECLARE  argument_1;  argument_2;  BEGIN    -- User supplied block goes here, with optional
label END;

END;

Alternatively, we could do...

<<function_name>>
DECLARE
FOUND;
etc
BEGIN  DECLARE-- User's DECLARE  argument_1;  argomuent_2;  -- User supplied declare code  BEGIN -- User's BEGIN  ....
END

That removes one level of nesting. It's probably better to go with the 
first option though, since it's simpler.

In both cases, I'd really like the ability to rename those blocks. 
#pragma would be fine for that.
-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)

Re: [HACKERS] merging some features from plpgsql2 project

From

Pavel Stehule

Date:

09 January 2017, 11:42:52

2017-01-09 0:37 GMT+01:00 Jim Nasby <Jim.Nasby@bluetreble.com>:

On 1/8/17 2:52 AM, Joel Jacobson wrote:
And please kill all these GUCs ideas. The best thing with PostgreSQL
is the natural expected behaviour of the default configuration.
Contrary to MySQL where you have to enable lots and lots of
configuration options just to get a behaviour you expect as a novice
user.

The only reason to use GUCs or some other kind of backwards compatibility setting would be to allow the current plpgsql itself to move forwards. If you think that's a dead end (which I can certainly understand) then they make no sense at all.

It's much better to just come together and agree on whatever we have
learned during the last 15 years of PL/pgSQL1, and sample all ideas
during a year maybe, and decide what to put into PL/pgSQL2. To make it
useful, we should aim to not break compatibility for _most_ code, but
accept some necessary rewrites of functions with deprecated
anti-patterns.

If we're going to create a brand new language then I think it would be extremely foolish to keep *any* of the current pain points around. Off the top of my head:

- variables must have an identifier (what $ in most languages does). The steps you have to go through to avoid simple naming collisions are insane.

just note - from 9.0 the collisions are not a issue

- Support for composite types needs to be stronger. Off the top of my head, you need to be able to reference an element name via a variable. OR, maybe it'd be better to just provide a plpgsql equivalent to a dict.

This point self needs significant code refactoring - maybe total rewriting PL executor - it allows to change expression result data type in cycle. It doesn't mean so I fully disagree with this point, but it is not easy to implement it in type strict environment - C, C++, Pascal, Ada - hasn't any similar - maybe it is possible with some libraries.

- GET DIAGNOSTICS and their ilk need to die. There needs to be an easier way to get that kind of info back (perhaps via an automatic composite/record/dict).

It is about performance - probably you wouldn't to fill all dict fields after any statement.

- There needs to be real support for dealing with exceptions. IE: get a composite of all exception deatils, modify parts of it, then re-raise with the new info.

- Real support for using variables as identifiers / nothing restricted to only accepting a Const.

second point that enforces new PL environment - writing from scratch - and it hides the cost of dynamic SQL

- Support for the notion of a variable being unset (which is NOT the same thing as NULL).

That said, I'll bet we still get some of that wrong, so there better be some way of fixing those issues down the road...

With these requests you have to specify first, how much close will be your ideal language with PostgreSQL. Currently PL/pgSQL is pretty close - with some impacts. Your mentioned features can requires more independent environment from Postgres.

What is really weak in plpgsql is a left side of assign statement and missing some global/module/extensions variables.

Maybe if we integrate more PLLua or PLPython, PLPerl it can be better for these requests.

I am not sure about benefit new only PostgreSQL specific language. What do you think about Lua - it is light, pretty fast, dynamic, fast dictionary API that can mask lot of internals.

Regards

Pavel

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)

Re: [HACKERS] merging some features from plpgsql2 project

From

Pavel Stehule

Date:

09 January 2017, 12:01:15

2017-01-09 1:10 GMT+01:00 Jim Nasby <Jim.Nasby@bluetreble.com>:

On 1/8/17 12:03 AM, Pavel Stehule wrote:
BTW, I do wish you could change the label of the scope that
arguments went into, so that you could use that label to refer
to function parameters. If we allowed that it'd perhaps be the
best of both worlds: you'd be guaranteed access to all auto
variables and parameters, and that access wouldn't need to be
tied to the function name (which can be both painful and error
prone).

We can talk about compiler directive.

PRAGMA auto_variables_label(xxxx) -- require function scope only

If we know a list of all auto variables, then it can be on function or
block level - it can create aliases.

Oh, the problem is that if you have an argument with the same name as an auto variable you're in trouble.

I didn't well explained my idea

It is similar to your plpgsql scope. You are introducing the convention. I proposed a explicit specification. The result is similar.

Probably the easiest thing is to have a scope that sits above the scope containing the arguments, and then allow the user to rename both scopes if so desired. So in effect you'd end up with

<<plpgsql>> -- new scope
DECLARE
FOUND;
etc
BEGIN
<<function_name>>
DECLARE
argument_1;
argument_2;
BEGIN
-- User supplied block goes here, with optional label
END;
END;

It is similar to

PRAGMA auto_variables_namespace(plpgsql);

BEGIN

...

END;

Using PRAGMA is more verbose - it is useful for code audit, review - it is speaking "I will overwrite some auto variables here, and I need special namespace"

plpgsql_check, maybe plpgsql self can raise warning if these variables are shadowed and some option/pragma is not used. Maybe current extra check does it already.

Alternatively, we could do...

<<function_name>>
DECLARE
FOUND;
etc
BEGIN
DECLARE-- User's DECLARE
argument_1;
argomuent_2;
-- User supplied declare code
BEGIN -- User's BEGIN
....
END

That removes one level of nesting. It's probably better to go with the first option though, since it's simpler.

You are forgot on function paramaters - somebody can use a function argument like FOUND, .. So auto variables should to be declared in most top namespace.

Usually it is invisible for users - one, two more namespaces has zero cost for compilation and absolute zero impact for evaluation.

In both cases, I'd really like the ability to rename those blocks. #pragma would be fine for that.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)

Re: [HACKERS] merging some features from plpgsql2 project

From

Pavel Stehule

Date:

09 January 2017, 12:05:49

2017-01-09 0:39 GMT+01:00 Jim Nasby <Jim.Nasby@bluetreble.com>:

On 1/7/17 11:44 PM, Pavel Stehule wrote:
This is not overloading of SQL command - it is like annotations. It is
smart idea, so I was not surprised if ANSI/SQL reuses it.

SHas ANSI declared that they will NEVER support := in a SELECT that's not running in a stored function? Because if they haven't done that, there's nothing preventing them from doing just that. If that happens we're going to have some very difficult choices to make.

No, there is nothing declared in ANSI. But currently in ANSI is not using one operator for two different thing.

Regards

Pavel

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)

Re: [HACKERS] merging some features from plpgsql2 project

From

Pavel Stehule

Date:

09 January 2017, 17:14:14

Real support for using variables as identifiers / nothing restricted to only accepting a Const.

This point is problematic not only from performance perspective.

if you don't use some special syntax and you allow variables as identifier, then you will got a ambiguous situation quickly - although variables can have special symbol prefix

SELECT * FROM tab WHERE $var1 = $var3

What is $var1, what is $var2? identifier or value?

Regards

Pavel

Re: [HACKERS] merging some features from plpgsql2 project

From

Merlin Moncure

Date:

10 January 2017, 02:12:54

On Sun, Jan 8, 2017 at 2:52 AM, Joel Jacobson <joel@trustly.com> wrote:
> On Sat, Jan 7, 2017 at 8:56 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>>
>> Jim, Marko, Joel - is there a place, features where we can find a partial agreement? If it is, then we can move our
viewthere.

>
> I have decided I definitively want a new language, and I'm willing to
> pay for it.

well, as they say, "money talks" :-D.

> Hopefully the community will join forces and contribute with ideas and
> code, but with or without you or the rest of the community, plpgsql2
> is going to happen.
> Call it pltrustly or plpgsql2, I don't care. I just care about ending
> my suffering from being forced writing plpgsql every day. It sucks,
> and I'm going to end it.

Curious, are you mainly troubled by the 'INTO STRICT' family of
problems? Or something else?  Pavel has scored some points with PRAGMA
syntax and ISTM that does not require compatibility break.

> And please kill all these GUCs ideas. The best thing with PostgreSQL
> is the natural expected behaviour of the default configuration.
> Contrary to MySQL where you have to enable lots and lots of
> configuration options just to get a behaviour you expect as a novice
> user.

I think there is a lot of support for this point of view.  Jim is
notable outlier here, but for the most part we don't do language
behavior changes with GUC.

> It's much better to just come together and agree on whatever we have
> learned during the last 15 years of PL/pgSQL1, and sample all ideas
> during a year maybe, and decide what to put into PL/pgSQL2. To make it
> useful, we should aim to not break compatibility for _most_ code, but
> accept some necessary rewrites of functions with deprecated
> anti-patterns.

Agreed: If you want to break compatibility, pushing a new language is
the better way than GUC.   If you got consensus on this, having both
languages side by side supported for a while (maybe 4-5 releases) is
they way to go, and finally the only language is frozen and moved to
extension.  But this is a lot of work and aggravation, are you *sure*
you can only get what you want with a full compatibility break?

With respect to your company developers specifically?  I'm genuinely
curious if you've taken a good look at pl/v8 and why you've determined
it's not suitable to move forward with.  It's got a different set of
headaches, but is really fast, and sometimes wonder if with some
alternative preprocessing (like coffeescript but geared towards SQL)
could have some long term promise.

merlin

Re: [HACKERS] merging some features from plpgsql2 project

From

Marko Tiikkaja

Date:

10 January 2017, 02:30:16

On Mon, Jan 9, 2017 at 12:37 AM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:

If we're going to create a brand new language then I think it would be extremely foolish to keep *any* of the current pain points around. Off the top of my head:

- variables must have an identifier (what $ in most languages does). The steps you have to go through to avoid simple naming collisions are insane.

This is exactly what we did not want to do with this project. The idea is to create a language which is really close to PL/PgSQL, but removes some of the brain diarrhoea currently present.

Now, this *is* a problem, and the solution we had (well I, mostly, at this point) in mind is to use the underscore prefix for all input variables and make OUT parameters invisible to queries inside function bodies unless explicitly prefixed with OUT. As far as I can tell this eliminates most if not all collisions while staying almost completely compatible with arguably well-written PL/PgSQL 1.

- Support for the notion of a variable being unset (which is NOT the same thing as NULL).

My idea was that the currently unsupported combination of NOT NULL and no DEFAULT would mean "has to be assigned to a non-NULL value before it can be read from, or an exception is thrown". Solves the most common use case and is backwards compatible.

Re: [HACKERS] merging some features from plpgsql2 project

From

Andrew Dunstan

Date:

10 January 2017, 02:33:39

On 01/09/2017 06:12 PM, Merlin Moncure wrote:
> With respect to your company developers specifically?  I'm genuinely
> curious if you've taken a good look at pl/v8 and why you've determined
> it's not suitable to move forward with.  It's got a different set of
> headaches, but is really fast, and sometimes wonder if with some
> alternative preprocessing (like coffeescript but geared towards SQL)
> could have some long term promise.
>

Yeah, especially if built against a modern V8, with all or most of the 
ES6 stuff. Without template strings and lexically scoped variables it's 
very unpleasant for large functions, but with them it's usable.  It's 
also something a very large number of people are familiar with. As you 
say it's damn fast.

cheers

andrew

-- 
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: [HACKERS] merging some features from plpgsql2 project

From

Jim Nasby

Date:

10 January 2017, 02:47:43

On 1/9/17 5:30 PM, Marko Tiikkaja wrote:
>
> This is exactly what we did not want to do with this project.  The idea
> is to create a language which is really close to PL/PgSQL, but removes
> some of the brain diarrhoea currently present.

As a general comment, ISTM it'd be much better to do as much as we can 
in the current language then. It's going to take a LOT to get people to 
switch to a different language, so there needs to be a LOT of added value.

> Now, this *is* a problem, and the solution we had (well I, mostly, at
> this point) in mind is to use the underscore prefix for all input
> variables and make OUT parameters invisible to queries inside function
> bodies unless explicitly prefixed with   OUT.  As far as I can tell this
> eliminates most if not all collisions while staying almost completely
> compatible with arguably well-written PL/PgSQL 1.

That might be workable... it's still rather ugly though.

I don't see prefixing everything with _ as being useful though; people 
can already do that if they want to uglify the function's argument names.

I do think there's stuff that could be done along these lines with 
namespaces though. Allowing users to rename the namespace that arguments 
went into would be a huge step forward. I think having a separate 
namespace for all the automatic variables would be a big help too.

Amusingly, that would allow users to set the namespace to '$', which 
would (almost) give you $variable.

>     - Support for the notion of a variable being unset (which is NOT the
>     same thing as NULL).
>
>
> My idea was that the currently unsupported combination of NOT NULL and
> no DEFAULT would mean "has to be assigned to a non-NULL value before it
> can be read from, or an exception is thrown".  Solves the most common
> use case and is backwards compatible.

That won't allow you to use a variable in multiple places though... is 
there a reason we couldn't support something like IS DEFINED and UNSET?
-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)

Re: [HACKERS] merging some features from plpgsql2 project

From

Marko Tiikkaja

Date:

10 January 2017, 02:53:01

On Tue, Jan 10, 2017 at 12:47 AM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:

On 1/9/17 5:30 PM, Marko Tiikkaja wrote:

My idea was that the currently unsupported combination of NOT NULL and
no DEFAULT would mean "has to be assigned to a non-NULL value before it
can be read from, or an exception is thrown". Solves the most common
use case and is backwards compatible.

That won't allow you to use a variable in multiple places though... is there a reason we couldn't support something like IS DEFINED and UNSET?

I don't understand what your use case is. Could you demonstrate that with some code you'd write if these features were in?

Re: [HACKERS] merging some features from plpgsql2 project

From

Jim Nasby

Date:

10 January 2017, 02:53:45

On 1/9/17 5:12 PM, Merlin Moncure wrote:
> Agreed: If you want to break compatibility, pushing a new language is
> the better way than GUC.   If you got consensus on this, having both
> languages side by side supported for a while (maybe 4-5 releases) is
> they way to go, and finally the only language is frozen and moved to
> extension.  But this is a lot of work and aggravation, are you *sure*
> you can only get what you want with a full compatibility break?

FWIW, that work and aggravation part is what I hoped to avoid with GUCs.

I do think that whichever route we go, we're going to be stuck 
supporting the old version for a LONG time. A big part of why 
standard_conforming_strings was so ugly is users didn't have enough time 
to adjust. If we'd had that enabled by default for 4-5 releases it 
wouldn't have been nearly as much of an issue.
-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)

Re: [HACKERS] merging some features from plpgsql2 project

From

Jim Nasby

Date:

10 January 2017, 02:56:48

On 1/9/17 3:01 AM, Pavel Stehule wrote:
> You are forgot on function paramaters  - somebody can use a function
> argument  like FOUND, .. So auto variables should to be declared in most
> top namespace.

Right, that's why I said it was an alternative. I agree it would be 
better to just have 2 explicit namespaces: the top one being auto 
variables and the one below that being function arguments. The namespace 
below that would be the top-most *user* block.

Both of the pre-defined namespaces need the ability to change their 
name; I don't see any issue with using PRAGMA for that.
-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)

Re: [HACKERS] merging some features from plpgsql2 project

From

Jim Nasby

Date:

10 January 2017, 03:03:18

On 1/9/17 5:53 PM, Marko Tiikkaja wrote:
>         My idea was that the currently unsupported combination of NOT
>         NULL and
>         no DEFAULT would mean "has to be assigned to a non-NULL value
>         before it
>         can be read from, or an exception is thrown".  Solves the most
>         common
>         use case and is backwards compatible.
>
>
>     That won't allow you to use a variable in multiple places though...
>     is there a reason we couldn't support something like IS DEFINED and
>     UNSET?
>
>
> I don't understand what your use case is.  Could you demonstrate that
> with some code you'd write if these features were in?

One use case is NEW and OLD in triggers. Checking to see if one or the 
other is set is easier than checking TG_OP. It's also going to be faster 
(probably MUCH faster; IIRC the comparison currently happens via SPI).

Another case is selecting into a record:

EXECUTE ... INTO rec;
IF rec IS DEFINED THEN
ELSE  EXECUTE <something else> INTO rec;  IF rec IS DEFINED THEN
...

Perhaps DEFINED is not the best keyword. Ultimately I want to know if a 
variable has been assigned a value, as well as being able to mark a 
variable as unassigned (though arguably you might not need to be able to 
un-assign...).
-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)

Re: [HACKERS] merging some features from plpgsql2 project

From

Marko Tiikkaja

Date:

10 January 2017, 03:07:26

On Tue, Jan 10, 2017 at 1:03 AM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:

On 1/9/17 5:53 PM, Marko Tiikkaja wrote:
My idea was that the currently unsupported combination of NOT
NULL and
no DEFAULT would mean "has to be assigned to a non-NULL value
before it
can be read from, or an exception is thrown". Solves the most
common
use case and is backwards compatible.

That won't allow you to use a variable in multiple places though...
is there a reason we couldn't support something like IS DEFINED and
UNSET?

I don't understand what your use case is. Could you demonstrate that
with some code you'd write if these features were in?

One use case is NEW and OLD in triggers. Checking to see if one or the other is set is easier than checking TG_OP. It's also going to be faster (probably MUCH faster; IIRC the comparison currently happens via SPI).

This sounds useless.

Another case is selecting into a record:

EXECUTE ... INTO rec;
IF rec IS DEFINED THEN
ELSE
EXECUTE <something else> INTO rec;
IF rec IS DEFINED THEN

And this a workaround for non-functional FOUND.

I can't get excited about this idea based on these examples.

Re: [HACKERS] merging some features from plpgsql2 project

From

Jim Nasby

Date:

10 January 2017, 04:02:48

On 1/9/17 6:07 PM, Marko Tiikkaja wrote:
>     One use case is NEW and OLD in triggers. Checking to see if one or
>     the other is set is easier than checking TG_OP. It's also going to
>     be faster (probably MUCH faster; IIRC the comparison currently
>     happens via SPI).
>
>
> This sounds useless.

I guess you've not written much non-trivial trigger code then... the 
amount of code duplication you end up with is quite ridiculous. It's 
also a good example of why treating this as an exception and trapping 
isn't a good solution either: you can already do that with triggers today.

Being able to check the existence of a variable is a very common idiom 
in other languages, so I'm don't see why plpgsql shouldn't have it.
-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)

Re: [HACKERS] merging some features from plpgsql2 project

From

Pavel Stehule

Date:

10 January 2017, 07:57:56

2017-01-10 2:02 GMT+01:00 Jim Nasby <Jim.Nasby@bluetreble.com>:

On 1/9/17 6:07 PM, Marko Tiikkaja wrote:
One use case is NEW and OLD in triggers. Checking to see if one or
the other is set is easier than checking TG_OP. It's also going to
be faster (probably MUCH faster; IIRC the comparison currently
happens via SPI).

This sounds useless.

I guess you've not written much non-trivial trigger code then... the amount of code duplication you end up with is quite ridiculous. It's also a good example of why treating this as an exception and trapping isn't a good solution either: you can already do that with triggers today.

Being able to check the existence of a variable is a very common idiom in other languages, so I'm don't see why plpgsql shouldn't have it.

In strongly typed language like PLpgSQL is DEFINE little bit strange. On second hand there are some elements of dynamic languages - record types and polymorphics parameters.

Some languages has reflection API - some other like Oberon some special statements - variable guards that allows safe casting and safe usage - it is not far what we do with Node API.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)

Re: [HACKERS] merging some features from plpgsql2 project

From

Peter Eisentraut

Date:

10 January 2017, 07:59:20

On 1/7/17 6:39 AM, Pavel Stehule wrote:
> ROW_COUNT .. shortcut for GET DIAGNOSTICS row_count = ROW_COUNT. 

I don't see the point.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: [HACKERS] merging some features from plpgsql2 project

From

Pavel Stehule

Date:

10 January 2017, 08:06:16

2017-01-10 5:59 GMT+01:00 Peter Eisentraut <peter.eisentraut@2ndquadrant.com>:

On 1/7/17 6:39 AM, Pavel Stehule wrote:
> ROW_COUNT .. shortcut for GET DIAGNOSTICS row_count = ROW_COUNT.

I don't see the point.

A check how much rows was impacted by query is relative often task. So we can do this task more user friendly.

Second motivation - ROW_COUNT is working for static and for dynamic SQL - it can be partial replace of FOUND variable.

But now, when I am thinking about it - it can be strange for some users too. Pretty often we use implicit LIMIT for query execution. So ROW_COUNT can be probably different than users expecting.

Regards

Pavel

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: [HACKERS] merging some features from plpgsql2 project

From

Peter Eisentraut

Date:

10 January 2017, 16:26:11

On 1/10/17 12:06 AM, Pavel Stehule wrote:
> A check how much rows was impacted by query is relative often task. So
> we can do this task more user friendly.
> 
> Second motivation - ROW_COUNT is working for static and for dynamic SQL
> - it can be partial replace of FOUND variable.

What is stopping anyone from claiming that their favorite diagnostic
item is also a relatively often task and request it to become an
automatic variable?  Where does it stop?

It's not like PL/pgSQL is the king of brevity.  Creating inconsistent
and arbitrary warts to save a few characters does not appear appealing.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: [HACKERS] merging some features from plpgsql2 project

From

Pavel Stehule

Date:

10 January 2017, 16:41:39

2017-01-10 14:26 GMT+01:00 Peter Eisentraut <peter.eisentraut@2ndquadrant.com>:

On 1/10/17 12:06 AM, Pavel Stehule wrote:
> A check how much rows was impacted by query is relative often task. So
> we can do this task more user friendly.
>
> Second motivation - ROW_COUNT is working for static and for dynamic SQL
> - it can be partial replace of FOUND variable.

What is stopping anyone from claiming that their favorite diagnostic
item is also a relatively often task and request it to become an
automatic variable? Where does it stop?

There is only two possible fields - ROW_COUNT and RESULT_OID. Result Oid is not almost unused today. So stop is ROW_COUNT

It's not like PL/pgSQL is the king of brevity. Creating inconsistent
and arbitrary warts to save a few characters does not appear appealing.

yes

Regards

Pavel

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: [HACKERS] merging some features from plpgsql2 project

From

Marko Tiikkaja

Date:

10 January 2017, 16:44:22

On Tue, Jan 10, 2017 at 2:26 PM, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote:

It's not like PL/pgSQL is the king of brevity.

This is essentially saying "PL/PgSQL isn't perfect, so we shouldn't try and make it better". I hear this argument a lot, and as long as people keep rejecting improvements for this reason they can keep saying it. It's a self-fulfilling prophecy.

Re: [HACKERS] merging some features from plpgsql2 project

From

Merlin Moncure

Date:

11 January 2017, 17:37:13

On Tue, Jan 10, 2017 at 7:44 AM, Marko Tiikkaja <marko@joh.to> wrote:
> On Tue, Jan 10, 2017 at 2:26 PM, Peter Eisentraut
> <peter.eisentraut@2ndquadrant.com> wrote:
>>
>> It's not like PL/pgSQL is the king of brevity.
>
>
> This is essentially saying "PL/PgSQL isn't perfect, so we shouldn't try and
> make it better".  I hear this argument a lot, and as long as people keep
> rejecting improvements for this reason they can keep saying it.  It's a
> self-fulfilling prophecy.

Agreed.  But adding language features, especially syntactical ones,
demands prudence; there is good reason to limit keywords like that.
What about:
pgsql.rows
pgsql.found
pgsql.sqlerrm
etc
as automatic variables (I think this was suggested upthread).
Conflicts with existing structures is of course an issue but I bet it
could be worked out.

I also kinda disagree on the brevity point, or at least would like to
add some color.  SQL is verbose in the sense of "let's make everything
an english language sentence" but incredibly terse relative to other
language implementations of the same task.   Embedded SQL tends to be
uniformly clumsy due to all of the extra handling of errrors,
parameterization, etc.  This is why we write plpgsql naturally.

merlin

Re: [HACKERS] merging some features from plpgsql2 project

From

Pavel Stehule

Date:

11 January 2017, 19:45:56

2017-01-11 15:37 GMT+01:00 Merlin Moncure <mmoncure@gmail.com>:

On Tue, Jan 10, 2017 at 7:44 AM, Marko Tiikkaja <marko@joh.to> wrote:
> On Tue, Jan 10, 2017 at 2:26 PM, Peter Eisentraut
> <peter.eisentraut@2ndquadrant.com> wrote:
>>
>> It's not like PL/pgSQL is the king of brevity.
>
>
> This is essentially saying "PL/PgSQL isn't perfect, so we shouldn't try and
> make it better". I hear this argument a lot, and as long as people keep
> rejecting improvements for this reason they can keep saying it. It's a
> self-fulfilling prophecy.

Agreed. But adding language features, especially syntactical ones,
demands prudence; there is good reason to limit keywords like that.
What about:

pgsql.rows
pgsql.found
pgsql.sqlerrm
etc
as automatic variables (I think this was suggested upthread).
Conflicts with existing structures is of course an issue but I bet it
could be worked out.

Any implicit namespace can be problem. But we can continue in default unlabeled namespace for auto variables with possibility to specify this namespace explicitly.

Regards

Pavel

Re: [HACKERS] merging some features from plpgsql2 project

From

Peter Eisentraut

Date:

11 January 2017, 20:11:30

On 1/10/17 8:44 AM, Marko Tiikkaja wrote:
> On Tue, Jan 10, 2017 at 2:26 PM, Peter Eisentraut
> <peter.eisentraut@2ndquadrant.com
> <mailto:peter.eisentraut@2ndquadrant.com>> wrote:
> 
>     It's not like PL/pgSQL is the king of brevity.  
> 
> 
> This is essentially saying "PL/PgSQL isn't perfect, so we shouldn't try
> and make it better".  I hear this argument a lot, and as long as people
> keep rejecting improvements for this reason they can keep saying it. 
> It's a self-fulfilling prophecy.

I'm not making that argument.  But if the plan here is that PL/pgSQL is
too verbose, let's make it less verbose, then maybe, but let's see a
more complete plan for that.

The current syntax was chosen because it is SQL-compatible.  Adding
redundant syntax to save a few characters without any new functionality
(performance, resource usage, safety, etc.) is a weak argument in the
overall scheme of things.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: [HACKERS] merging some features from plpgsql2 project

From

Merlin Moncure

Date:

11 January 2017, 22:53:07

On Wed, Jan 11, 2017 at 11:11 AM, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:
> The current syntax was chosen because it is SQL-compatible.  Adding
> redundant syntax to save a few characters without any new functionality
> (performance, resource usage, safety, etc.) is a weak argument in the
> overall scheme of things.

Yeah -- exactly.  The few minor things that are not 100% SQL
compatible I find to be major headaches.  Incompatible usage of INTO
for example.

This thread has been going on for quite some time now and is starting
to become somewhat circular.   Perhaps we ought to organize the
various ideas and pain points presented in a wiki along with
conclusions, and in some cases if there is no solution that is
compatible with the current syntax.

merlin

Re: [HACKERS] merging some features from plpgsql2 project

From

Pavel Stehule

Date:

11 January 2017, 23:07:06

2017-01-11 20:53 GMT+01:00 Merlin Moncure <mmoncure@gmail.com>:

On Wed, Jan 11, 2017 at 11:11 AM, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:
> The current syntax was chosen because it is SQL-compatible. Adding
> redundant syntax to save a few characters without any new functionality
> (performance, resource usage, safety, etc.) is a weak argument in the
> overall scheme of things.

Yeah -- exactly. The few minor things that are not 100% SQL
compatible I find to be major headaches. Incompatible usage of INTO
for example.

We not designed INTO usage in plpgsql - it is PL/SQL heritage.

PL/SQL = ADA + Oracle SQL; -- but sometimes the result is not perfect - Ada was not designed be integrated with SQL

This thread has been going on for quite some time now and is starting
to become somewhat circular. Perhaps we ought to organize the
various ideas and pain points presented in a wiki along with
conclusions, and in some cases if there is no solution that is
compatible with the current syntax.

There is a language that is much better integrated with SQL - SQL/PSM

http://postgres.cz/wiki/SQL/PSM_Manual

It is less verbose, but still verbose language. It is static typed language - so it can be bad for some people.

But due design based on SQL integration from base, there is less conflicts between SQL and PL.

Regards

Pavel

merlin

Re: [HACKERS] merging some features from plpgsql2 project

From

Robert Haas

Date:

11 January 2017, 23:57:25

On Mon, Jan 9, 2017 at 6:53 PM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
> I do think that whichever route we go, we're going to be stuck supporting
> the old version for a LONG time. A big part of why
> standard_conforming_strings was so ugly is users didn't have enough time to
> adjust. If we'd had that enabled by default for 4-5 releases it wouldn't
> have been nearly as much of an issue.

/me boggles.  I think you are confused about the history here.
standard_conforming_strings had a generously long phase-in period.

- The E'' syntax and the standard_conforming_strings GUC were added in
PostgreSQL 8.0.  The only legal value of standard_conforming_strings
was "false".

- In PostgreSQL 8.1, it became possible to set
standard_conforming_strings to "true", but the default was still
"false".

- In PostgreSQL 9.1, the default was changed to "true".

So there 6 major release from the time the GUC was added and 5 from
the time it became mutable before the default was flipped.   We've now
had 5 more since the default was changed to "true".  (No, it's not
time to remove the GUC yet.  At least not in my opinion.)

One thing that made changing standard_conforming_strings particularly
painful was that it had knock-on effects on many language-specific
drivers not maintained by the core project (or just plain not
maintained).  I don't think the language changes being proposed here
for PL/pgsql would have the same kind of impact, but some of them
would make it significantly harder to migrate to PostgreSQL from
Oracle, which some people might see as an anti-goal (as per other
nearby threads on making that easier).

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: [HACKERS] merging some features from plpgsql2 project

From

Merlin Moncure

Date:

12 January 2017, 06:21:50

On Wed, Jan 11, 2017 at 2:57 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> - The E'' syntax and the standard_conforming_strings GUC were added in
> PostgreSQL 8.0.  The only legal value of standard_conforming_strings
> was "false".
>
> - In PostgreSQL 8.1, it became possible to set
> standard_conforming_strings to "true", but the default was still
> "false".
>
> - In PostgreSQL 9.1, the default was changed to "true".
>
> So there 6 major release from the time the GUC was added and 5 from
> the time it became mutable before the default was flipped.   We've now
> had 5 more since the default was changed to "true".  (No, it's not
> time to remove the GUC yet.  At least not in my opinion.)
>
> One thing that made changing standard_conforming_strings particularly
> painful was that it had knock-on effects on many language-specific
> drivers not maintained by the core project (or just plain not
> maintained).  I don't think the language changes being proposed here
> for PL/pgsql would have the same kind of impact, but some of them
> would make it significantly harder to migrate to PostgreSQL from
> Oracle, which some people might see as an anti-goal (as per other
> nearby threads on making that easier).

I don't think it's a simple matter of waiting N or N+M releases
(although I certainly did appreciate that we did it regardless).  It
comes down to this: there's just no way to release changes that break
a lot of code without breaking a lot of code.  It's all about
acknowledging that and judging it acceptable against the benefits you
get.   For posterity, with respect to conforming strings, SQL
injection is an absolute scourge of the computing world so on balance
we did the right thing.  Having said that, It's always good to do the
math and the calculation is primarily an economic one, I think,

merlin

Re: [HACKERS] merging some features from plpgsql2 project

From

Jim Nasby

Date:

12 January 2017, 21:42:25

On 1/11/17 12:07 PM, Pavel Stehule wrote:
> PL/SQL = ADA + Oracle SQL; -- but sometimes the result is not perfect -
> Ada was not designed be integrated with SQL
...
> There is a language that is much better integrated with SQL - SQL/PSM

I think it is worth considering ways to increase compatibility with 
plsql, as well as pulling PSM into core. The former would be to help 
migrating from Oracle; the latter would be to provide everyone a cleaner 
built-in PL. (IMHO a PLSQL equivalent could certainly be an external 
extension).
-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)