Re: Missing semicolumn in anonymous plpgsql block does not raise syntax error - Mailing list pgsql-bugs

From Pavel Stehule
Subject Re: Missing semicolumn in anonymous plpgsql block does not raise syntax error
Date
Msg-id CAFj8pRCEenYWkRb0toQOOhMVmVni3bz-wmVMFWsbHa__YLXRHg@mail.gmail.com
Whole thread Raw
In response to Re: Missing semicolumn in anonymous plpgsql block does not raise syntax error  (Pavel Stehule <pavel.stehule@gmail.com>)
List pgsql-bugs


po 10. 6. 2024 v 13:56 odesílatel Pavel Stehule <pavel.stehule@gmail.com> napsal:
Hi

út 4. 6. 2024 v 8:39 odesílatel Mor Lehr <mor.lehr@deel.com> napsal:
How about inventing an opt-in strict mode
 
That can be useful at the session level, because we use anonymous blocks quite often.
I assume if such a setting existed - we would have used it in the original scenario.

Thanks again,
-Mor

On Mon, Jun 3, 2024 at 9:12 PM Pavel Stehule <pavel.stehule@gmail.com> wrote:


po 3. 6. 2024 v 18:46 odesílatel Erik Wienhold <ewie@ewie.name> napsal:
On 2024-06-03 00:18 +0200, Tom Lane wrote:
> "David G. Johnston" <david.g.johnston@gmail.com> writes:
> > I think you just wrote the equivalent of:
> > l_cnt := (select 1 as delete from foo3 where id=1);
> > Which is a valid query.
>
> Still another example of the folly of letting AS be optional.
> I don't suppose we can ever undo that though.

How about inventing an opt-in strict mode (like in Perl or JavaScript)
that prevents certain footguns?  For example, disallowing bare column
labels.

That could be enabled for the current session or transaction:

    SET strict_parsing = { on | off };

Or just for individual routines:

    CREATE PROCEDURE myproc()
        SET strict_parsing = { on | off }
        LANGUAGE plpgsql ...


Probably it is not bad idea - it can be generally useful

But I think it is better to introduce a new entry for plpgsql expressions in gram.y.

Unfortunately it is not a compatible change. Years ago was popular to use a pattern

a := tab.a FROM tab

instead correct

a := (SELECT tab.a FROM tab)

or

SELECT tab.a FROM tab INTO a;

Regards

Pavel

I wrote an experimental patch that enforces strict syntax for PL/pgSQL expressions. This patch is a proof concept and shows impacts of the change (check-world tests passed)

Unfortunately it introduces break compatibility - with this patch the odd syntax (that is undocumented) is not supported. Something like

DECLARE _x int := x FROM foo WHERE id = _arg;

should not be written in strict mode;

This patch very well fix reported issue:

(2024-06-10 12:06:43) postgres=# CREATE TABLE foo3(id serial PRIMARY key, txt text);
CREATE TABLE
(2024-06-10 12:07:13) postgres=# INSERT INTO foo3 (txt) VALUES ('aaa'),('bbb');
INSERT 0 2
(2024-06-10 12:07:33) postgres=# DO $$
DECLARE
    l_cnt int;
BEGIN
    l_cnt := 1
    DELETE FROM foo3 WHERE id=1;
END; $$;
ERROR:  syntax error at or near "DELETE"
LINE 11:     DELETE FROM foo3 WHERE id=1;
             ^

Personally, I think it can be a strong step forward (we can use deeper integration of plpgsql with SQL parser which was not possible before).

Because it introduces a compatibility break I don't propose to use it by default. I see two possible ways, how this check can be used:

1. we can use plpgsql.extra_errors ( https://www.postgresql.org/docs/current/plpgsql-development-tips.html#PLPGSQL-EXTRA-CHECKS ) and introduce a check named (maybe) strict_expr_syntax. Because in this case the warning cannot be raised, then this check cannot be used in plpgsql.extra_warnings. I think it can work nicely.

2. if @1 will not be possible, the minimalist implementation can be based on a new public entry to SQL parser.  In this case, I can do the proposed check at least from plpgsql_check.

Do you see any other possibilities?

This patch is just a proof concept. I didn't implement any mechanism to switch from default mode to strict mode (I don't propose strict mode as default) now.

I think it can increase the robustness of plpgsql, on the other hand it introduces compatibility breaks and I understand related problems.

The change of definition of PLpgSQL_Expr has an impact on OPEN and CASE
PLpgSQL statements that use multicolumn results.

Note - the SQL/PSM standard allow syntax

SET var = (SELECT col FROM tab) 

as shorter variant of 

SELECT col INTO var FROM tab ;

so

var := (SELECT col FROM tab)

is +/- ANSI/SQL syntax. It is not my invention. The subquery is used as a guard against returning multiple rows.

The proprietary Postgre syntax is a little bit faster - 80us x 95 us, doesn't raise an exception when returning more rows (I am not sure if this is any benefit or not - it is possibly dangerous, but it can reduce the necessity of subtransaction in some patterns). Instead of proprietary syntax, SELECT INTO can be used for these cases.

Regards

Pavel



Regards

Pavel
 

 
--
Erik


pgsql-bugs by date:

Previous
From: Pavel Stehule
Date:
Subject: Re: Missing semicolumn in anonymous plpgsql block does not raise syntax error
Next
From: Floris Van Nee
Date:
Subject: RE: error "can only drop stats once" brings down database