Thread: plpgsql defensive mode

plpgsql defensive mode

From
Pavel Stehule
Date:
Hi

There was a long discussion about future of PLpgSQL.

I accept so Joel, Marko has good ideas based on probably strong experience from their domain. I can't accept their implementation and proposals as default for PLpgSQL now and in future. They try to mix wine and vodka concepts, and it has no good ends.

I understand to their proposal as restrictive subset of PLpgSQL. "restrictive subset" is not good name. We can implement some features without impact on syntax as block on function level. Marko likes defensive programming, so we can use a name "defensive_mode"

In this mode .. all DML statements should to return EXACTLY ONE row with exception CURSORs and FOR LOOP cycle where more rows is expected. But in this case we can raise a exception NODATA if there is no row.

In this mode late IO casting will be disabled. We can disallow implicit casting too.

We can talk what plpgsql warnings in this mode will be critical.

This mode can be enabled for function by option

#option defensive

or for all plpgsql functions by GUC

SET plpgsql.defensive = on

In this moment I don't see a necessity to change or enhance syntax.

I have no plan to use it as default, but anybody can do it simply by change one GUC in Postgres configuration. Defensive mode (strict defensive mode) is good strategy, but it is not for all.

Regards

Pavel

Re: plpgsql defensive mode

From
Joel Jacobson
Date:
On Sat, Sep 6, 2014 at 7:51 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
> Hi
>
> There was a long discussion about future of PLpgSQL.
>
> I accept so Joel, Marko has good ideas based on probably strong experience
> from their domain. I can't accept their implementation and proposals as
> default for PLpgSQL now and in future. They try to mix wine and vodka
> concepts, and it has no good ends.
>
> I understand to their proposal as restrictive subset of PLpgSQL.
> "restrictive subset" is not good name. We can implement some features
> without impact on syntax as block on function level. Marko likes defensive
> programming, so we can use a name "defensive_mode"
>
> In this mode .. all DML statements should to return EXACTLY ONE row with
> exception CURSORs and FOR LOOP cycle where more rows is expected. But in
> this case we can raise a exception NODATA if there is no row.
>
> In this mode late IO casting will be disabled. We can disallow implicit
> casting too.
>
> We can talk what plpgsql warnings in this mode will be critical.
>
> This mode can be enabled for function by option
>
> #option defensive
>
> or for all plpgsql functions by GUC
>
> SET plpgsql.defensive = on
>
> In this moment I don't see a necessity to change or enhance syntax.
>
> I have no plan to use it as default, but anybody can do it simply by change
> one GUC in Postgres configuration. Defensive mode (strict defensive mode) is
> good strategy, but it is not for all.

+1 -- this would mean my original proposal would be possible, i.e. no
syntax change at all. But to solve this the proper way, and avoid a
long list of options/settings, it would be really nice being able to
define a new language, like "pltrustly", which sets the mix of
settings which are relevant for us, where this would be a setting
which is apparently not desirable for everybody.

I also hope all the other things listed in the wiki* are possible to
fix in PL/pgSQL 2 (or even better in PL/pgSQL, if possible).
Pavel, do you have any input on the other items on the wiki? Most of
them are problems which really ought to raise errors.

*) https://wiki.postgresql.org/wiki/Improving_PL/PgSQL_(September_2014)



Re: plpgsql defensive mode

From
Marko Tiikkaja
Date:
On 2014-09-06 7:51 AM, Pavel Stehule wrote:
> In this mode .. all DML statements should to return EXACTLY ONE row with
> exception CURSORs and FOR LOOP cycle where more rows is expected. But in
> this case we can raise a exception NODATA if there is no row.
>
> In this mode late IO casting will be disabled. We can disallow implicit
> casting too.
>
> We can talk what plpgsql warnings in this mode will be critical.
>
> This mode can be enabled for function by option
>
> #option defensive
>
> or for all plpgsql functions by GUC
>
> SET plpgsql.defensive = on
>
> In this moment I don't see a necessity to change or enhance syntax.

How do you run queries which affect more than one row in this mode? 
Because that's crucial as well.  We want something we can run 100% of 
our code on, but with a slightly more convenient syntax than PL/PgSQL 
provides right when coding defensively in the cases where exactly one 
row should be affected.


.marko



Re: plpgsql defensive mode

From
Joel Jacobson
Date:
> On 6 sep 2014, at 16:32, Marko Tiikkaja <marko@joh.to> wrote:
>
> How do you run queries which affect more than one row in this mode? Becau=
se that's crucial as well.  We want something we can run 100% of our code o=
n, but with a slightly more convenient syntax than PL/PgSQL provides right =
when coding defensively in the cases where exactly one row should be affect=
ed.

If we would have ORDER BY also for UPDATE/DELETE then one could just
order by something arbitrary to express multiple or zero rows are OK,
even if not being interested in the order.

>
>
> .marko

Re: plpgsql defensive mode

From
Marko Tiikkaja
Date:
On 2014-09-06 4:41 PM, Joel Jacobson wrote:
>> On 6 sep 2014, at 16:32, Marko Tiikkaja <marko@joh.to> wrote:
>>
>> How do you run queries which affect more than one row in this mode? Because that's crucial as well.  We want
somethingwe can run 100% of our code on, but with a slightly more convenient syntax than PL/PgSQL provides right when
codingdefensively in the cases where exactly one row should be affected.
 
>
> If we would have ORDER BY also for UPDATE/DELETE then one could just
> order by something arbitrary to express multiple or zero rows are OK,
> even if not being interested in the order.

Ugh.  That can't possibly end well.  Overriding the meaning of something 
that already works as standalone SQL statement sounds like a really bad 
idea.


.marko



Re: plpgsql defensive mode

From
Pavel Stehule
Date:



2014-09-06 16:31 GMT+02:00 Marko Tiikkaja <marko@joh.to>:
On 2014-09-06 7:51 AM, Pavel Stehule wrote:
In this mode .. all DML statements should to return EXACTLY ONE row with
exception CURSORs and FOR LOOP cycle where more rows is expected. But in
this case we can raise a exception NODATA if there is no row.

In this mode late IO casting will be disabled. We can disallow implicit
casting too.

We can talk what plpgsql warnings in this mode will be critical.

This mode can be enabled for function by option

#option defensive

or for all plpgsql functions by GUC

SET plpgsql.defensive = on

In this moment I don't see a necessity to change or enhance syntax.

How do you run queries which affect more than one row in this mode? Because that's crucial as well.  We want something we can run 100% of our code on, but with a slightly more convenient syntax than PL/PgSQL provides right when coding defensively in the cases where exactly one row should be affected.

you use a normal function. I don't expect, so it can be too often in your case.

Pavel
 


.marko

Re: plpgsql defensive mode

From
Marko Tiikkaja
Date:
On 2014-09-06 7:50 PM, Pavel Stehule wrote:
> 2014-09-06 16:31 GMT+02:00 Marko Tiikkaja <marko@joh.to>:
>> How do you run queries which affect more than one row in this mode?
>> Because that's crucial as well.  We want something we can run 100% of our
>> code on, but with a slightly more convenient syntax than PL/PgSQL provides
>> right when coding defensively in the cases where exactly one row should be
>> affected.
>>
>
> you use a normal function. I don't expect, so it can be too often in your
> case.

Then that doesn't really solve our problem.  Switching between two 
languages on a per-function basis, when both look exactly the same but 
have very different semantics would be a nightmare.


.marko



Re: plpgsql defensive mode

From
Pavel Stehule
Date:



2014-09-06 19:54 GMT+02:00 Marko Tiikkaja <marko@joh.to>:
On 2014-09-06 7:50 PM, Pavel Stehule wrote:
2014-09-06 16:31 GMT+02:00 Marko Tiikkaja <marko@joh.to>:
How do you run queries which affect more than one row in this mode?
Because that's crucial as well.  We want something we can run 100% of our
code on, but with a slightly more convenient syntax than PL/PgSQL provides
right when coding defensively in the cases where exactly one row should be
affected.


you use a normal function. I don't expect, so it can be too often in your
case.

Then that doesn't really solve our problem.  Switching between two languages on a per-function basis, when both look exactly the same but have very different semantics would be a nightmare.

It is maximum what is possible

use a different language instead

Pavel

 


.marko

Improving PL/PgSQL (was: Re: plpgsql defensive mode)

From
Marko Tiikkaja
Date:
On 2014-09-06 7:56 PM, Pavel Stehule wrote:
> 2014-09-06 19:54 GMT+02:00 Marko Tiikkaja <marko@joh.to>:
>> Then that doesn't really solve our problem.  Switching between two
>> languages on a per-function basis, when both look exactly the same but have
>> very different semantics would be a nightmare.
>>
>
> It is maximum what is possible
>
> use a different language instead

Sigh.

OK, let's try and forget the cardinality assertions we've been talking 
about in the other thread(s).  I seem to recall there being a generally 
welcoming atmosphere in the discussion about adding a set of pragmas (or 
options/whatever) to make some of PL/PgSQL's flaws go away, in a 
non-backwards compatible way.  From the list here: 
https://wiki.postgresql.org/wiki/Improving_PL/PgSQL_(September_2014) do 
you think at least some of those would be reasonable candidates for 
these pragmas?  Do you see others ones that are missing from this list?

Please also keep discussion about ASSERT in the thread for that, and the 
suggestion under "Single-row operations" out of this.


.marko



Re: plpgsql defensive mode

From
Joel Jacobson
Date:
> On 6 sep 2014, at 16:32, Marko Tiikkaja <marko@joh.to> wrote:
>
> How do you run queries which affect more than one row in this mode? Because that's crucial as well.  We want
somethingwe can run 100% of our code on, but with a slightly more convenient syntax than PL/PgSQL provides right when
codingdefensively in the cases where exactly one row should be affected. 

If we would have ORDER BY also for UPDATE/DELETE then one could just
order by something arbitrary to express multiple or zero rows are OK,
even if not being interested in the order.

>
>
> .marko



Re: Improving PL/PgSQL

From
Jan Wieck
Date:
On 09/06/2014 02:08 PM, Marko Tiikkaja wrote:
> On 2014-09-06 7:56 PM, Pavel Stehule wrote:
>> 2014-09-06 19:54 GMT+02:00 Marko Tiikkaja <marko@joh.to>:
>>> Then that doesn't really solve our problem.  Switching between two
>>> languages on a per-function basis, when both look exactly the same but have
>>> very different semantics would be a nightmare.
>>>
>>
>> It is maximum what is possible
>>
>> use a different language instead
>
> Sigh.
>
> OK, let's try and forget the cardinality assertions we've been talking
> about in the other thread(s).  I seem to recall there being a generally
> welcoming atmosphere in the discussion about adding a set of pragmas (or
> options/whatever) to make some of PL/PgSQL's flaws go away, in a
> non-backwards compatible way.  From the list here:
> https://wiki.postgresql.org/wiki/Improving_PL/PgSQL_(September_2014) do
> you think at least some of those would be reasonable candidates for
> these pragmas?  Do you see others ones that are missing from this list?
>
> Please also keep discussion about ASSERT in the thread for that, and the
> suggestion under "Single-row operations" out of this.

+1 for SELECT INTO throwing TOO_MANY_ROWS if there are more than one. 
Zero rows should be dealt with an IF NOT FOUND ... construct.

+1 for the number of result columns should match the expression list of 
SELECT INTO.

-1 on removal of implicit type casting. This needs to go into a #pragma 
or GUC. Too drastic of a backwards compatibility break.

-1 on the single row operations. This belongs into the main SQL engine 
as COMMAND CONSTRAINTS.

+1 on EXECUTE and FOUND, where applicable (DML statements only).

I do not recall why we decided to implement GET DIAGNOSTICS instead of 
an automatically set global ROW_COUNT variable. But there was a reason I 
believe and we should check the list archives for it.

+1 on the OUT alias.

-1 on the ASSERT as proposed. It would be too easy for application 
developers to abuse them to govern business logic and a DBA later 
turning off assertions for performance reasons.


Regards,
Jan

-- 
Jan Wieck
Senior Software Engineer
http://slony.info



Re: Improving PL/PgSQL (was: Re: plpgsql defensive mode)

From
Pavel Stehule
Date:



2014-09-06 20:08 GMT+02:00 Marko Tiikkaja <marko@joh.to>:
On 2014-09-06 7:56 PM, Pavel Stehule wrote:
2014-09-06 19:54 GMT+02:00 Marko Tiikkaja <marko@joh.to>:
Then that doesn't really solve our problem.  Switching between two
languages on a per-function basis, when both look exactly the same but have
very different semantics would be a nightmare.


It is maximum what is possible

use a different language instead

Sigh.

OK, let's try and forget the cardinality assertions we've been talking about in the other thread(s).  I seem to recall there being a generally welcoming atmosphere in the discussion about adding a set of pragmas (or options/whatever) to make some of PL/PgSQL's flaws go away, in a non-backwards compatible way.  From the list here: https://wiki.postgresql.org/wiki/Improving_PL/PgSQL_(September_2014) do you think at least some of those would be reasonable candidates for these pragmas?  Do you see others ones that are missing from this list?

Please also keep discussion about ASSERT in the thread for that, and the suggestion under "Single-row operations" out of this.

SELECT .. INTO vs. TOO_MANY_ROWS

+1 .. possible every where

Variable assignments

+1 .. only in "defensive mode"

Single-row operations

+1 .. only in "defensive mode" without special syntax

EXECUTE and FOUND

-1 .. it is emulation of PL/SQL behave.. so introduction can do too high unhappy surprise if somebody will migrate to Oracle -- the syntax is too similar 

OUT parameters

-1 .. to proposal .. It is in contradiction with current feature. Next it is nonsense. INTO clause should to contains only plpgsql variables - in 9.x Postgres there is not possible issue.
postgres=# create table x(a int, b int);
CREATE TABLE
postgres=# insert into x values(10,20);
INSERT 0 1
postgres=# create or replace function foo(out a int, out b int)
postgres-# returns record as $$
postgres$# begin
postgres$#   select x.a, x.b from x into a, b;
postgres$#   return;
postgres$# end;
postgres$# $$ language plpgsql;
CREATE FUNCTION
postgres=# select * from foo();
 a  | b 
----+----
 10 | 20
(1 row)


you can see, there is not any collision

Assertions

-1 to proposed syntax - I wrote about my reasons in other thread.

Regards

Pavel




.marko

Re: Improving PL/PgSQL (was: Re: plpgsql defensive mode)

From
Pavel Stehule
Date:



2014-09-06 21:47 GMT+02:00 Pavel Stehule <pavel.stehule@gmail.com>:



2014-09-06 20:08 GMT+02:00 Marko Tiikkaja <marko@joh.to>:
On 2014-09-06 7:56 PM, Pavel Stehule wrote:
2014-09-06 19:54 GMT+02:00 Marko Tiikkaja <marko@joh.to>:
Then that doesn't really solve our problem.  Switching between two
languages on a per-function basis, when both look exactly the same but have
very different semantics would be a nightmare.


It is maximum what is possible

use a different language instead

Sigh.

OK, let's try and forget the cardinality assertions we've been talking about in the other thread(s).  I seem to recall there being a generally welcoming atmosphere in the discussion about adding a set of pragmas (or options/whatever) to make some of PL/PgSQL's flaws go away, in a non-backwards compatible way.  From the list here: https://wiki.postgresql.org/wiki/Improving_PL/PgSQL_(September_2014) do you think at least some of those would be reasonable candidates for these pragmas?  Do you see others ones that are missing from this list?

Please also keep discussion about ASSERT in the thread for that, and the suggestion under "Single-row operations" out of this.

SELECT .. INTO vs. TOO_MANY_ROWS

+1 .. possible every where

 correction +1 .. in defensive mode due compatibility issues.

Variable assignments

+1 .. only in "defensive mode"

Single-row operations

+1 .. only in "defensive mode" without special syntax

EXECUTE and FOUND

-1 .. it is emulation of PL/SQL behave.. so introduction can do too high unhappy surprise if somebody will migrate to Oracle -- the syntax is too similar 

OUT parameters

-1 .. to proposal .. It is in contradiction with current feature. Next it is nonsense. INTO clause should to contains only plpgsql variables - in 9.x Postgres there is not possible issue.
postgres=# create table x(a int, b int);
CREATE TABLE
postgres=# insert into x values(10,20);
INSERT 0 1
postgres=# create or replace function foo(out a int, out b int)
postgres-# returns record as $$
postgres$# begin
postgres$#   select x.a, x.b from x into a, b;
postgres$#   return;
postgres$# end;
postgres$# $$ language plpgsql;
CREATE FUNCTION
postgres=# select * from foo();
 a  | b 
----+----
 10 | 20
(1 row)


you can see, there is not any collision

Assertions

-1 to proposed syntax - I wrote about my reasons in other thread.

Regards

Pavel




.marko


Re: Improving PL/PgSQL

From
Marko Tiikkaja
Date:
On 2014-09-06 9:47 PM, Pavel Stehule wrote:
> -1 .. to proposal .. It is in contradiction with current feature.

Which feature would that be?

> Next it
> is nonsense. INTO clause should to contains only plpgsql variables - in 9.x
> Postgres there is not possible issue.
> postgres=# create table x(a int, b int);
> CREATE TABLE
> postgres=# insert into x values(10,20);
> INSERT 0 1
> postgres=# create or replace function foo(out a int, out b int)
> postgres-# returns record as $$
> postgres$# begin
> postgres$#   select x.a, x.b from x into a, b;
> postgres$#   return;
> postgres$# end;
> postgres$# $$ language plpgsql;
> CREATE FUNCTION
> postgres=# select * from foo();
>   a  | b
> ----+----
>   10 | 20
> (1 row)
>
>
> you can see, there is not any collision

No, not if you do assignments only.  But if you also use them as 
parameters in plans at some point in the function, there will be collisions.

But I consider that secondary.  I think the bigger improvement is that 
it's clear what assigning to  OUT.foo  does when reading the code.


.marko