Thread: Missing semicolumn in anonymous plpgsql block does not raise syntax error

Hi,

I would like to report a potential bug in postgres 15.4, also reproduced on 15.6.

The exact sequence of steps:
Connect to a postgres 15.4 database and run the following statements:

CREATE TABLE foo3(id serial PRIMARY key, txt text);

INSERT INTO foo3 (txt) VALUES ('aaa'),('bbb');

DO $$

DECLARE

l_cnt int;

BEGIN

l_cnt := 1

DELETE FROM foo3 WHERE id=1;

END; $$;


The output you got:

1. The script passes (no error message) even though there's a missing semicolon (;) after "l_cnt := 1"
2. The script doesn't actually delete the record from foo3


This caused us a production issue where we thought changes were applied (script passed successfully) but changes weren't actually applied.


If I move the line "l_cnt := 1" to after the DELETE statement like so:

DO $$

DECLARE

l_cnt int;

BEGIN

DELETE FROM foo3 WHERE id=1;

l_cnt := 1

END; $$;

I get the error - as expected:

SQL Error [42601]: ERROR: syntax error at end of input
  Position: 89


Furthermore, replacing the DELETE statement with an UPDATE statement in the original code does raise an error:

DO $$

DECLARE

l_cnt int;

BEGIN

l_cnt := 1

UPDATE foo3 SET txt='ccc' WHERE id=1;

END; $$;

SQL Error [42601]: ERROR: syntax error at or near "foo3"
  Position: 62

But adding the semicolon - it works correctly with either UPDATE or DELETE.


I ran the original code using the following clients to make sure it's not a client problem:

1. psql

2. DBeaver using standard JDBC drivers

3. Flyway using JDBC drivers




Versions:

PostgreSQL 15.6 (Homebrew) on x86_64-apple-darwin23.2.0, compiled by Apple clang version 15.0.0 (clang-1500.1.0.2.5), 64-bit l - running locally on my MacBook


PostgreSQL 15.4 on aarch64-unknown-linux-gnu, compiled by aarch64-unknown-linux-gnu-gcc (GCC) 9.5.0, 64-bit - running on AWS RDS Aurora


Installed Extensions (On AWS RDS):

 oid       |extname                  |extowner|extnamespace|extrelocatable|extversion|extconfig   |extcondition|
----------+-------------------------+--------+------------+--------------+----------+------------+------------+
     16463|btree_gist               |      10|        2200|true          |1.7       |NULL        |NULL        |
1463651797|deel_password_check_rules|   16399|        2200|false         |1.0       |NULL        |NULL        |
     16464|fuzzystrmatch            |      10|        2200|true          |1.1       |NULL        |NULL        |
 958297705|pg_repack                |      10|        2200|false         |1.4.8     |NULL        |NULL        |
     16465|pg_stat_statements       |      10|        2200|true          |1.9       |NULL        |NULL        |
1463506085|pg_tle                   |      10|  1463506084|false         |1.1.1     |{1463506117}|{""}        |
     16467|pg_trgm                  |      10|        2200|true          |1.6       |NULL        |NULL        |
     16468|pgcrypto                 |      10|        2200|true          |1.3       |NULL        |NULL        |
     14498|plpgsql                  |      10|          11|false         |1.0       |NULL        |NULL        |
     16469|postgres_fdw             |      10|        2200|true          |1.1       |NULL        |NULL        |
     16470|tablefunc                |      10|        2200|true          |1.0       |NULL        |NULL        |
     16471|unaccent                 |      10|        2200|true          |1.1       |NULL        |NULL        |
     16472|uuid-ossp                |      10|        2200|true          |1.1       |NULL        |NULL        |


Please let me know what other information I can provide.


Thanks,

Mor

On Sunday, June 2, 2024, Mor Lehr <mor.lehr@deel.com> wrote:
Hi,

I would like to report a potential bug in postgres 15.4, also reproduced on 15.6.

The exact sequence of steps:
Connect to a postgres 15.4 database and run the following statements:

CREATE TABLE foo3(id serial PRIMARY key, txt text);

INSERT INTO foo3 (txt) VALUES ('aaa'),('bbb');

DO $$

DECLARE

l_cnt int;

BEGIN

l_cnt := 1

DELETE FROM foo3 WHERE id=1;

END; $$;


The output you got:

1. The script passes (no error message) even though there's a missing semicolon (;) after "l_cnt := 1"
2. The script doesn't actually delete the record from foo3



I think you just wrote the equivalent of:

l_cnt := (select 1 as delete from foo3 where id=1);

Which is a valid query.

David J.


Thanks for the prompt reply.
Can you please refer me to the section in the documentation that describes this behavior?
This (automatically interperting 1 as select 1) is totally an unexpected behavior for me.

Thanks again.
-Mor.


On Sun, Jun 2, 2024, 18:19 David G. Johnston <david.g.johnston@gmail.com> wrote:
On Sunday, June 2, 2024, Mor Lehr <mor.lehr@deel.com> wrote:
Hi,

I would like to report a potential bug in postgres 15.4, also reproduced on 15.6.

The exact sequence of steps:
Connect to a postgres 15.4 database and run the following statements:

CREATE TABLE foo3(id serial PRIMARY key, txt text);

INSERT INTO foo3 (txt) VALUES ('aaa'),('bbb');

DO $$

DECLARE

l_cnt int;

BEGIN

l_cnt := 1

DELETE FROM foo3 WHERE id=1;

END; $$;


The output you got:

1. The script passes (no error message) even though there's a missing semicolon (;) after "l_cnt := 1"
2. The script doesn't actually delete the record from foo3



I think you just wrote the equivalent of:

l_cnt := (select 1 as delete from foo3 where id=1);

Which is a valid query.

David J.


On Sunday, June 2, 2024, Mor Lehr <mor.lehr@deel.com> wrote:
Thanks for the prompt reply.
Can you please refer me to the section in the documentation that describes this behavior?
This (automatically interperting 1 as select 1) is totally an unexpected behavior for me.


“As explained previously, the expression in such a statement is evaluated by means of an SQL SELECT command sent to the main database engine.”

David J.

Thanks for the reference.
We learn new stuff every day.

You can close the case.
Thanks, Mor


On Sun, Jun 2, 2024, 18:31 David G. Johnston <david.g.johnston@gmail.com> wrote:
On Sunday, June 2, 2024, Mor Lehr <mor.lehr@deel.com> wrote:
Thanks for the prompt reply.
Can you please refer me to the section in the documentation that describes this behavior?
This (automatically interperting 1 as select 1) is totally an unexpected behavior for me.


“As explained previously, the expression in such a statement is evaluated by means of an SQL SELECT command sent to the main database engine.”

David J.

"David G. Johnston" <david.g.johnston@gmail.com> writes:
> I think you just wrote the equivalent of:
> l_cnt := (select 1 as delete from foo3 where id=1);
> Which is a valid query.

Still another example of the folly of letting AS be optional.
I don't suppose we can ever undo that though.

            regards, tom lane



On 2024-06-03 00:18 +0200, Tom Lane wrote:
> "David G. Johnston" <david.g.johnston@gmail.com> writes:
> > I think you just wrote the equivalent of:
> > l_cnt := (select 1 as delete from foo3 where id=1);
> > Which is a valid query.
> 
> Still another example of the folly of letting AS be optional.
> I don't suppose we can ever undo that though.

How about inventing an opt-in strict mode (like in Perl or JavaScript)
that prevents certain footguns?  For example, disallowing bare column
labels.

That could be enabled for the current session or transaction:

    SET strict_parsing = { on | off };

Or just for individual routines:

    CREATE PROCEDURE myproc()
        SET strict_parsing = { on | off }
        LANGUAGE plpgsql ...

-- 
Erik





po 3. 6. 2024 v 18:46 odesílatel Erik Wienhold <ewie@ewie.name> napsal:
On 2024-06-03 00:18 +0200, Tom Lane wrote:
> "David G. Johnston" <david.g.johnston@gmail.com> writes:
> > I think you just wrote the equivalent of:
> > l_cnt := (select 1 as delete from foo3 where id=1);
> > Which is a valid query.
>
> Still another example of the folly of letting AS be optional.
> I don't suppose we can ever undo that though.

How about inventing an opt-in strict mode (like in Perl or JavaScript)
that prevents certain footguns?  For example, disallowing bare column
labels.

That could be enabled for the current session or transaction:

    SET strict_parsing = { on | off };

Or just for individual routines:

    CREATE PROCEDURE myproc()
        SET strict_parsing = { on | off }
        LANGUAGE plpgsql ...


Probably it is not bad idea - it can be generally useful

But I think it is better to introduce a new entry for plpgsql expressions in gram.y.

Unfortunately it is not a compatible change. Years ago was popular to use a pattern

a := tab.a FROM tab

instead correct

a := (SELECT tab.a FROM tab)

or

SELECT tab.a FROM tab INTO a;

Regards

Pavel

 
--
Erik


How about inventing an opt-in strict mode
 
That can be useful at the session level, because we use anonymous blocks quite often.
I assume if such a setting existed - we would have used it in the original scenario.

Thanks again,
-Mor

On Mon, Jun 3, 2024 at 9:12 PM Pavel Stehule <pavel.stehule@gmail.com> wrote:


po 3. 6. 2024 v 18:46 odesílatel Erik Wienhold <ewie@ewie.name> napsal:
On 2024-06-03 00:18 +0200, Tom Lane wrote:
> "David G. Johnston" <david.g.johnston@gmail.com> writes:
> > I think you just wrote the equivalent of:
> > l_cnt := (select 1 as delete from foo3 where id=1);
> > Which is a valid query.
>
> Still another example of the folly of letting AS be optional.
> I don't suppose we can ever undo that though.

How about inventing an opt-in strict mode (like in Perl or JavaScript)
that prevents certain footguns?  For example, disallowing bare column
labels.

That could be enabled for the current session or transaction:

    SET strict_parsing = { on | off };

Or just for individual routines:

    CREATE PROCEDURE myproc()
        SET strict_parsing = { on | off }
        LANGUAGE plpgsql ...


Probably it is not bad idea - it can be generally useful

But I think it is better to introduce a new entry for plpgsql expressions in gram.y.

Unfortunately it is not a compatible change. Years ago was popular to use a pattern

a := tab.a FROM tab

instead correct

a := (SELECT tab.a FROM tab)

or

SELECT tab.a FROM tab INTO a;

Regards

Pavel

 
--
Erik


Hi

út 4. 6. 2024 v 8:39 odesílatel Mor Lehr <mor.lehr@deel.com> napsal:
How about inventing an opt-in strict mode
 
That can be useful at the session level, because we use anonymous blocks quite often.
I assume if such a setting existed - we would have used it in the original scenario.

Thanks again,
-Mor

On Mon, Jun 3, 2024 at 9:12 PM Pavel Stehule <pavel.stehule@gmail.com> wrote:


po 3. 6. 2024 v 18:46 odesílatel Erik Wienhold <ewie@ewie.name> napsal:
On 2024-06-03 00:18 +0200, Tom Lane wrote:
> "David G. Johnston" <david.g.johnston@gmail.com> writes:
> > I think you just wrote the equivalent of:
> > l_cnt := (select 1 as delete from foo3 where id=1);
> > Which is a valid query.
>
> Still another example of the folly of letting AS be optional.
> I don't suppose we can ever undo that though.

How about inventing an opt-in strict mode (like in Perl or JavaScript)
that prevents certain footguns?  For example, disallowing bare column
labels.

That could be enabled for the current session or transaction:

    SET strict_parsing = { on | off };

Or just for individual routines:

    CREATE PROCEDURE myproc()
        SET strict_parsing = { on | off }
        LANGUAGE plpgsql ...


Probably it is not bad idea - it can be generally useful

But I think it is better to introduce a new entry for plpgsql expressions in gram.y.

Unfortunately it is not a compatible change. Years ago was popular to use a pattern

a := tab.a FROM tab

instead correct

a := (SELECT tab.a FROM tab)

or

SELECT tab.a FROM tab INTO a;

Regards

Pavel

I wrote an experimental patch that enforces strict syntax for PL/pgSQL expressions. This patch is a proof concept and shows impacts of the change (check-world tests passed)

Unfortunately it introduces break compatibility - with this patch the odd syntax (that is undocumented) is not supported. Something like

DECLARE _x int := x FROM foo WHERE id = _arg;

should not be written in strict mode;

This patch very well fix reported issue:

(2024-06-10 12:06:43) postgres=# CREATE TABLE foo3(id serial PRIMARY key, txt text);
CREATE TABLE
(2024-06-10 12:07:13) postgres=# INSERT INTO foo3 (txt) VALUES ('aaa'),('bbb');
INSERT 0 2
(2024-06-10 12:07:33) postgres=# DO $$
DECLARE
    l_cnt int;
BEGIN
    l_cnt := 1
    DELETE FROM foo3 WHERE id=1;
END; $$;
ERROR:  syntax error at or near "DELETE"
LINE 11:     DELETE FROM foo3 WHERE id=1;
             ^

Personally, I think it can be a strong step forward (we can use deeper integration of plpgsql with SQL parser which was not possible before).

Because it introduces a compatibility break I don't propose to use it by default. I see two possible ways, how this check can be used:

1. we can use plpgsql.extra_errors ( https://www.postgresql.org/docs/current/plpgsql-development-tips.html#PLPGSQL-EXTRA-CHECKS ) and introduce a check named (maybe) strict_expr_syntax. Because in this case the warning cannot be raised, then this check cannot be used in plpgsql.extra_warnings. I think it can work nicely.

2. if @1 will not be possible, the minimalist implementation can be based on a new public entry to SQL parser.  In this case, I can do the proposed check at least from plpgsql_check.

Do you see any other possibilities?

This patch is just a proof concept. I didn't implement any mechanism to switch from default mode to strict mode (I don't propose strict mode as default) now.

I think it can increase the robustness of plpgsql, on the other hand it introduces compatibility breaks and I understand related problems.

The change of definition of PLpgSQL_Expr has an impact on OPEN and CASE
PLpgSQL statements that use multicolumn results.

Regards

Pavel
 

 
--
Erik


Attachment


po 10. 6. 2024 v 13:56 odesílatel Pavel Stehule <pavel.stehule@gmail.com> napsal:
Hi

út 4. 6. 2024 v 8:39 odesílatel Mor Lehr <mor.lehr@deel.com> napsal:
How about inventing an opt-in strict mode
 
That can be useful at the session level, because we use anonymous blocks quite often.
I assume if such a setting existed - we would have used it in the original scenario.

Thanks again,
-Mor

On Mon, Jun 3, 2024 at 9:12 PM Pavel Stehule <pavel.stehule@gmail.com> wrote:


po 3. 6. 2024 v 18:46 odesílatel Erik Wienhold <ewie@ewie.name> napsal:
On 2024-06-03 00:18 +0200, Tom Lane wrote:
> "David G. Johnston" <david.g.johnston@gmail.com> writes:
> > I think you just wrote the equivalent of:
> > l_cnt := (select 1 as delete from foo3 where id=1);
> > Which is a valid query.
>
> Still another example of the folly of letting AS be optional.
> I don't suppose we can ever undo that though.

How about inventing an opt-in strict mode (like in Perl or JavaScript)
that prevents certain footguns?  For example, disallowing bare column
labels.

That could be enabled for the current session or transaction:

    SET strict_parsing = { on | off };

Or just for individual routines:

    CREATE PROCEDURE myproc()
        SET strict_parsing = { on | off }
        LANGUAGE plpgsql ...


Probably it is not bad idea - it can be generally useful

But I think it is better to introduce a new entry for plpgsql expressions in gram.y.

Unfortunately it is not a compatible change. Years ago was popular to use a pattern

a := tab.a FROM tab

instead correct

a := (SELECT tab.a FROM tab)

or

SELECT tab.a FROM tab INTO a;

Regards

Pavel

I wrote an experimental patch that enforces strict syntax for PL/pgSQL expressions. This patch is a proof concept and shows impacts of the change (check-world tests passed)

Unfortunately it introduces break compatibility - with this patch the odd syntax (that is undocumented) is not supported. Something like

DECLARE _x int := x FROM foo WHERE id = _arg;

should not be written in strict mode;

This patch very well fix reported issue:

(2024-06-10 12:06:43) postgres=# CREATE TABLE foo3(id serial PRIMARY key, txt text);
CREATE TABLE
(2024-06-10 12:07:13) postgres=# INSERT INTO foo3 (txt) VALUES ('aaa'),('bbb');
INSERT 0 2
(2024-06-10 12:07:33) postgres=# DO $$
DECLARE
    l_cnt int;
BEGIN
    l_cnt := 1
    DELETE FROM foo3 WHERE id=1;
END; $$;
ERROR:  syntax error at or near "DELETE"
LINE 11:     DELETE FROM foo3 WHERE id=1;
             ^

Personally, I think it can be a strong step forward (we can use deeper integration of plpgsql with SQL parser which was not possible before).

Because it introduces a compatibility break I don't propose to use it by default. I see two possible ways, how this check can be used:

1. we can use plpgsql.extra_errors ( https://www.postgresql.org/docs/current/plpgsql-development-tips.html#PLPGSQL-EXTRA-CHECKS ) and introduce a check named (maybe) strict_expr_syntax. Because in this case the warning cannot be raised, then this check cannot be used in plpgsql.extra_warnings. I think it can work nicely.

2. if @1 will not be possible, the minimalist implementation can be based on a new public entry to SQL parser.  In this case, I can do the proposed check at least from plpgsql_check.

Do you see any other possibilities?

This patch is just a proof concept. I didn't implement any mechanism to switch from default mode to strict mode (I don't propose strict mode as default) now.

I think it can increase the robustness of plpgsql, on the other hand it introduces compatibility breaks and I understand related problems.

The change of definition of PLpgSQL_Expr has an impact on OPEN and CASE
PLpgSQL statements that use multicolumn results.

Note - the SQL/PSM standard allow syntax

SET var = (SELECT col FROM tab) 

as shorter variant of 

SELECT col INTO var FROM tab ;

so

var := (SELECT col FROM tab)

is +/- ANSI/SQL syntax. It is not my invention. The subquery is used as a guard against returning multiple rows.

The proprietary Postgre syntax is a little bit faster - 80us x 95 us, doesn't raise an exception when returning more rows (I am not sure if this is any benefit or not - it is possibly dangerous, but it can reduce the necessity of subtransaction in some patterns). Instead of proprietary syntax, SELECT INTO can be used for these cases.

Regards

Pavel



Regards

Pavel
 

 
--
Erik