Re: [HACKERS] plpgsql - additional extra checks - Mailing list pgsql-hackers

From Pavel Stehule
Subject Re: [HACKERS] plpgsql - additional extra checks
Date
Msg-id CAFj8pRAhxkLA7cS=x+uT8ZUrpXUWjdJuvjoGqkGJSm9xYb_7qw@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] plpgsql - additional extra checks  (Stephen Frost <sfrost@snowman.net>)
Responses Re: Re: [HACKERS] plpgsql - additional extra checks
List pgsql-hackers
Hi

2018-01-07 0:59 GMT+01:00 Stephen Frost <sfrost@snowman.net>:
Greetings Pavel,

* Pavel Stehule (pavel.stehule@gmail.com) wrote:
> 2017-11-30 3:44 GMT+01:00 Michael Paquier <michael.paquier@gmail.com>:
> > At least documentation needs patching, or this is going to generate
> > warnings on HEAD at compilation. I am moving this to next CF for lack
> > of reviews, and the status is waiting on author as this needs at least
> > a couple of doc fixes.
>
> fixed doc attached

Looks like this patch should have been in "Needs Review" state, not
"Waiting for author", since it does apply, build and pass make
check-world, but as I'm signed up to review it, I'll do so here:

diff --git a/doc/src/sgml/plpgsql.sgml b/doc/src/sgml/plpgsql.sgml
index 7d23ed437e..efa48bc13c 100644
--- a/doc/src/sgml/plpgsql.sgml
+++ b/doc/src/sgml/plpgsql.sgml
@@ -4963,6 +4963,11 @@ a_output := a_output || $$ if v_$$ || referrer_keys.kind || $$ like '$$
     so you are advised to test in a separate development environment.
    </para>

+   <para>
+    The setting <varname>plpgsql.extra_warnings</varname> to <literal>all</literal> is a
+    good idea in developer or test environments.
+   </para>

Better language for this would be:

Setting <varname>plpgsql.extra_warnings</varname>, or
<varname>plpgsql.extra_errors</varname>, as appropriate, to
<literal>all</literal> is encouraged in development and/or testing
environments.

@@ -4979,6 +4984,30 @@ a_output := a_output || $$ if v_$$ || referrer_keys.kind || $$ like '$$
      </para>
     </listitem>
    </varlistentry>
+
+   <varlistentry>
+    <term><varname>strict_multi_assignment</varname></term>
+    <listitem>
+     <para>
+      Some <application>PL/PgSQL</application> commands allows to assign a values to
+      more than one variable. The number of target variables should not be
+      equal to number of source values. Missing values are replaced by NULL
+      value, spare values are ignored. More times this situation signalize
+      some error.
+     </para>
+    </listitem>
+   </varlistentry>

Better language:

Some <application>PL/PgSQL</application> commands allow assigning values
to more than one variable at a time, such as SELECT INTO.  Typically,
the number of target variables and the number of source variables should
match, though <application>PL/PgSQL</application> will use NULL for
missing values and extra variables are ignored.  Enabling this check
will cause <application>PL/PgSQL</application> to throw a WARNING or
ERROR whenever the number of target variables and the number of source
variables are different.

+   <varlistentry>
+    <term><varname>too_many_rows</varname></term>
+    <listitem>
+     <para>
+      When result is assigned to a variable by <literal>INTO</literal> clause,
+      checks if query returns more than one row. In this case the assignment
+      is not deterministic usually - and it can be signal some issues in design.
+     </para>
+    </listitem>
+   </varlistentry>
   </variablelist>

Better language:

Enabling this check will cause <application>PL/PgSQL</application> to
check if a given query returns more than one row when an
<literal>INTO</literal> clause is used.  As an INTO statement will only
ever use one row, having a query return multiple rows is generally
either inefficient and/or nondeterministic and therefore is likely an
error.

@@ -4997,6 +5026,34 @@ WARNING:  variable "f1" shadows a previously defined variable
 LINE 3: f1 int;
         ^
 CREATE FUNCTION
+</programlisting>
+
+  The another example shows the effect of <varname>plpgsql.extra_warnings</varname>
+  set to <varname>strict_multi_assignment</varname>:
+<programlisting>

Better language:

The below example shows the effects of setting
<varname>plpgsql.extra_warnings</varname> to
<varname>strict_multi_assignment</varname>:

diff --git a/src/pl/plpgsql/src/pl_exec.c b/src/pl/plpgsql/src/pl_exec.c
index ec480cb0ba..0879e84cd2 100644
--- a/src/pl/plpgsql/src/pl_exec.c
+++ b/src/pl/plpgsql/src/pl_exec.c
@@ -3629,6 +3629,24 @@ exec_stmt_execsql(PLpgSQL_execstate *estate,
        long            tcount;
        int                     rc;
        PLpgSQL_expr *expr = stmt->sqlstmt;
+       bool            too_many_rows_check;
+       int                     too_many_rows_level;
+
+       if (plpgsql_extra_errors & PLPGSQL_XCHECK_TOOMANYROWS)
+       {
+               too_many_rows_check = true;
+               too_many_rows_level = ERROR;
+       }
+       else if (plpgsql_extra_warnings & PLPGSQL_XCHECK_TOOMANYROWS)
+       {
+               too_many_rows_check = true;
+               too_many_rows_level = WARNING;
+       }
+       else
+       {
+               too_many_rows_check = false;
+               too_many_rows_level = NOTICE;
+       }


I'm not sure why we need two variables here- couldn't we simply look at
too_many_rows_level?  eg: too_many_rows_level >= WARNING ? ...

Not as big a deal, but I would change it to be 'check_too_many_rows' as
a variable name too.

@@ -3678,7 +3696,7 @@ exec_stmt_execsql(PLpgSQL_execstate *estate,
         */
        if (stmt->into)
        {
-               if (stmt->strict || stmt->mod_stmt)
+               if (stmt->strict || stmt->mod_stmt || too_many_rows_check)
                        tcount = 2;
                else
                        tcount = 1;

The comment above this block needs updating for this change and, in
general, there's probably other pieces of code that this patch
changes where the comments should either be improved or ones added.


@@ -6033,12 +6051,48 @@ exec_move_row(PLpgSQL_execstate *estate,
                int                     t_natts;
                int                     fnum;
                int                     anum;
+               bool            strict_multiassignment_check;
+               int                     strict_multiassignment_level;
+
+               if (plpgsql_extra_errors & PLPGSQL_XCHECK_STRICTMULTIASSIGNMENT)
+               {
+                       strict_multiassignment_check = true;
+                       strict_multiassignment_level = ERROR;
+               }
+               else if (plpgsql_extra_warnings & PLPGSQL_XCHECK_STRICTMULTIASSIGNMENT)
+               {
+                       strict_multiassignment_check = true;
+                       strict_multiassignment_level = WARNING;
+               }
+               else
+               {
+                       strict_multiassignment_check = false;
+                       strict_multiassignment_level = NOTICE;
+               }

Same comments for this, more-or-less, as the above sections.

                if (HeapTupleIsValid(tup))
                        t_natts = HeapTupleHeaderGetNatts(tup->t_data);
                else
                        t_natts = 0;

+               if (strict_multiassignment_check)
+               {
+                       int             i;
+
+                       anum = 0;
+                       for (i = 0; i < td_natts; i++)
+                               if (!TupleDescAttr(tupdesc, i)->attisdropped)
+                                       anum++;
+
+                       if (anum != row->nfields)
+                       {
+                               ereport(strict_multiassignment_level,
+                                               (errcode(ERRCODE_DATATYPE_MISMATCH),
+                                                errmsg("Number of evaluated attributies (%d) does not match expected attributies (%d)",
+                                                                       anum, row->nfields)));
+                       }
+               }

I would have thought you'd incorporate this into the loop below instead
of adding a whole new section..?  At the least, this should include
better comments, and isn't there an issue here where you aren't
accounting for dropped columns in row structs?  See the comments in the
loop below this.

I'd suggest you try to construct a case which (incorrectly) throws a
warning for a row struct with a dropped column with your current patch
and then add that to the regression tests too, since it appears to have
been missed.

There, now it's in the correct Waiting for Author state. :)

thank you for comments. All should be fixed in attached patch

Regards

Pavel



Thanks!

Stephen

Attachment

pgsql-hackers by date:

Previous
From: Ryan Murphy
Date:
Subject: Re: Add default role 'pg_access_server_files'
Next
From: Simon Riggs
Date:
Subject: Re: [HACKERS] [PROPOSAL] Temporal query processing with range types