Thread: BUG #1329: Bug in IF-ELSEIF-ELSE construct

BUG #1329: Bug in IF-ELSEIF-ELSE construct

From
"PostgreSQL Bugs List"
Date:
The following bug has been logged online:

Bug reference:      1329
Logged by:          Rico Wind

Email address:      rw@rico-wind.dk

PostgreSQL version: 8.0 Beta

Operating system:   Windows XP, SP2

Description:        Bug in IF-ELSEIF-ELSE construct

Details:

Beta 1.
The following always returns 4:

IF from_date_param=period_begin AND until_date_param=period_end
THEN
    return 1;
ELSEIF from_date_param=period_begin
THEN
    return 2;
ELSEIF until_date_param=period_end
THEN
    return 3;
ELSE
    return 4;
END IF;

Whereas the following returns the right answer(not 4 each time). They should
be the same.
IF from_date_param=period_begin AND until_date_param=period_end
THEN
    return 1;
ELSE
    IF from_date_param = period_begin
    THEN
        return 2;
    END IF;

    IF until_date_param=period_end
    THEN
        return 3;
    END IF;
END IF;
RETURN 4;

Re: BUG #1329: Bug in IF-ELSEIF-ELSE construct

From
Tom Lane
Date:
"PostgreSQL Bugs List" <pgsql-bugs@postgresql.org> writes:
> Description:        Bug in IF-ELSEIF-ELSE construct

There is no ELSEIF construct.  Try ELSIF.

            regards, tom lane

Re: BUG #1329: Bug in IF-ELSEIF-ELSE construct

From
Neil Conway
Date:
Tom Lane wrote:
> There is no ELSEIF construct.

Sure, but it would be nice to throw a syntax error rather than silently
accepting the function. Unfortunately the way PL/PgSQL's parser works
doesn't make this very easy. (BTW, I think that fixing how we do parsing
would be one of the prime motivations for rewriting PL/PgSQL. One
possibility would be to integrate the PL/PgSQL parser into the main SQL
parser, although there may be a cleaner way to improve PL/PgSQL parsing.)

In any case, given this function:

create or replace function foo() returns int as
'
#option dump
begin
     if 5 > 5 then
         return 10;
     elseif 5 > 6 then
         return 15;
     else
         return 20;
     end if;
end;' language 'plpgsql';

We produce this parsetree: (helpfully dumped via the undocumented
"#option dump" feature)

Functions statements:
   2:BLOCK <<*unnamed*>>
   3:  IF 'SELECT  5 > 5' THEN
   4:    RETURN 'SELECT  10'
   5:    EXECSQL 'elseif 5 > 6 then 15 15'
       ELSE
   8:    RETURN 'SELECT  20'
       ENDIF
     END -- *unnamed*

One way to fix the specific bug reported here would be to add K_ELSEIF
to the PL/PgSQL lexer, and then throw a syntax error in the stmt_else
production. But that is a very limited fix: if the user specifies any
other word in the place of 'elseif', we will not throw a syntax error.

Another solution would be to teach the PL/PgSQL lexer to recognize the
initial tokens of every SQL statement (SELECT, UPDATE, and so forth).
Right now we just assume an unrecognized word must be the beginning of a
SQL statement; if we taught the lexer about the initial tokens of all
legal SQL statements, we could reject unrecognized words. But this is
kind of ugly as well, as it means duplicating the knowledge about what
constitutes a legal SQL statement in multiple places.

Alternatively, we could arrange to have the PL/PgSQL parser pass a block
of text it has identified as a possible SQL statement to the main SQL
parser; if it produces a syntax error, we can pass that syntax error
back to the user. I'm not sure if this would have any negative
ramifications, though.

Comments?

(BTW, another thing this example exposes is that we don't issue warnings
about trivially-dead-code, such as statements in a basic block that
follow a RETURN. This would probably be also worth doing.)

-Neil

plpgsql unreachable code (was BUG #1329: Bug in IF-ELSEIF-ELSE construct)

From
Neil Conway
Date:
Neil Conway wrote:
> (BTW, another thing this example exposes is that we don't issue warnings
> about trivially-dead-code, such as statements in a basic block that
> follow a RETURN. This would probably be also worth doing.)

Attached is a patch that implements this. Specifically, if there are any
statements in the same block that follow a RETURN, EXIT (without
condition) or RAISE EXCEPTION statement, we issue a warning at CREATE
FUNCTION time:

create function exit_warn() returns int as $$
declare x int;
begin
     x := 5;
     loop
         x := x + 1;
         exit;
         x := x + 2;
     end loop;
     x := x + 3;
     return x;
end;$$ language 'plpgsql';
WARNING:  assignment is unreachable, due to exit near line 6
CONTEXT:  compile of PL/pgSQL function "exit_warn" near line 7

No warning is issued if check_function_bodies is false.

AFAICS there is no current infrastructure for walking a PL/PgSQL
function's parse tree, so I did this manually (which is easy enough, of
course). In the future it might be a good idea to refactor this to use
something akin to the "walker" infrastructure in the backend (for one
thing the PL/PgSQL function dumping code could use this as well).

(BTW this patch is intended for 8.1, of course.)

-Neil
--- src/pl/plpgsql/src/pl_comp.c
+++ src/pl/plpgsql/src/pl_comp.c
@@ -130,6 +130,7 @@
 static void plpgsql_HashTableInsert(PLpgSQL_function *function,
                         PLpgSQL_func_hashkey *func_key);
 static void plpgsql_HashTableDelete(PLpgSQL_function *function);
+static void check_function(PLpgSQL_function *function);

 /*
  * This routine is a crock, and so is everyplace that calls it.  The problem
@@ -152,8 +153,10 @@
 /* ----------
  * plpgsql_compile        Make an execution tree for a PL/pgSQL function.
  *
- * If forValidator is true, we're only compiling for validation purposes,
- * and so some checks are skipped.
+ * If forValidator is true, we're only compiling for validation
+ * purposes; this means we skip some checks, as well as making some
+ * additional compile-time checks that we only want to do once (at
+ * function definition time), not very time the function is compiled.
  *
  * Note: it's important for this to fall through quickly if the function
  * has already been compiled.
@@ -293,7 +296,7 @@
      * Setup error traceback support for ereport()
      */
     plerrcontext.callback = plpgsql_compile_error_callback;
-    plerrcontext.arg = forValidator ? proc_source : (char *) NULL;
+    plerrcontext.arg = forValidator ? proc_source : NULL;
     plerrcontext.previous = error_context_stack;
     error_context_stack = &plerrcontext;

@@ -595,7 +598,7 @@
     plpgsql_add_initdatums(NULL);

     /*
-     * Now parse the functions text
+     * Now parse the function's text
      */
     parse_rc = plpgsql_yyparse();
     if (parse_rc != 0)
@@ -605,7 +608,7 @@
     pfree(proc_source);

     /*
-     * If that was successful, complete the functions info.
+     * If that was successful, complete the function's info.
      */
     function->fn_nargs = procStruct->pronargs;
     for (i = 0; i < function->fn_nargs; i++)
@@ -616,12 +619,22 @@
         function->datums[i] = plpgsql_Datums[i];
     function->action = plpgsql_yylval.program;

+    /*
+     * Perform whatever additional compile-time checks we can. Note
+     * that we only do this when validating the function; this is so
+     * that (a) we don't bother the user with warnings when the
+     * function is invoked (b) we don't take the performance hit of
+     * doing the analysis more than once per function definition.
+     */
+    if (forValidator)
+        check_function(function);
+
     /* Debug dump for completed functions */
     if (plpgsql_DumpExecTree)
         plpgsql_dumptree(function);

     /*
-     * add it to the hash table
+     * Add it to the hash table
      */
     plpgsql_HashTableInsert(function, hashkey);

@@ -664,8 +677,149 @@
                    plpgsql_error_funcname, plpgsql_error_lineno);
 }

+/*
+ * Emit a warning that the statement 'unreach' is unreachable, due to
+ * the effect of the preceding statement 'cause'.
+ */
+static void
+report_unreachable_stmt(PLpgSQL_stmt *unreach, PLpgSQL_stmt *cause)
+{
+    /*
+     * XXX: adjust the line number that is emitted along with the
+     * warning message. This is a kludge.
+     */
+    int old_lineno = plpgsql_error_lineno;
+    plpgsql_error_lineno = unreach->lineno;

+    elog(WARNING, "%s is unreachable, due to %s near line %d",
+         plpgsql_stmt_typename(unreach),
+         plpgsql_stmt_typename(cause),
+         cause->lineno);
+
+    plpgsql_error_lineno = old_lineno;
+}
+
 /*
+ * Given a list of PL/PgSQL statements, perform some compile-time
+ * checks on them.
+ *
+ * Note that we return as soon as we have emitted "unreachable"
+ * warnings for a given sequence of statements. So that given:
+ *
+ *    EXIT; EXIT; EXIT
+ *
+ * we will see the first "EXIT", issue warnings for the second and
+ * third unreachable EXIT statements, and then return, so that we
+ * don't issue bogus "unreachable" warnings when we see the second
+ * EXIT.
+ *
+ * XXX: currently we walk the PL/PgSQL execution tree by hand. It
+ * would probably be worth refactoring this to use something akin to
+ * the tree walker infrastructure in the backend.
+ */
+static void
+check_stmts(PLpgSQL_stmts *stmts)
+{
+    int i;
+
+    for (i = 0; i < stmts->stmts_used; i++)
+    {
+        PLpgSQL_stmt *stmt = stmts->stmts[i];
+        int j;
+
+        switch (stmt->cmd_type)
+        {
+            case PLPGSQL_STMT_RETURN:
+                for (j = i + 1; j < stmts->stmts_used; j++)
+                    report_unreachable_stmt(stmts->stmts[j], stmt);
+
+                return;
+            case PLPGSQL_STMT_EXIT:
+                {
+                    /*
+                     * If the EXIT statement has a conditional, it is
+                     * not guaranteed to exit the loop, so don't issue
+                     * a warning.
+                     */
+                    PLpgSQL_stmt_exit *exit_stmt = (PLpgSQL_stmt_exit *) stmt;
+                    if (exit_stmt->cond == NULL)
+                    {
+                        for (j = i + 1; j < stmts->stmts_used; j++)
+                            report_unreachable_stmt(stmts->stmts[j], stmt);
+
+                        return;
+                    }
+                }
+                break;
+            case PLPGSQL_STMT_RAISE:
+                {
+                    PLpgSQL_stmt_raise *raise_stmt = (PLpgSQL_stmt_raise *) stmt;
+                    /*
+                     * Only RAISE EXCEPTION (converted to elog_level =
+                     * ERROR by the parser) will exit the current
+                     * block.
+                     */
+                    if (raise_stmt->elog_level == ERROR)
+                    {
+                        for (j = i + 1; j < stmts->stmts_used; j++)
+                            report_unreachable_stmt(stmts->stmts[j], stmt);
+
+                        return;
+                    }
+                }
+                break;
+            case PLPGSQL_STMT_BLOCK:
+                {
+                    PLpgSQL_stmt_block *block_stmt = (PLpgSQL_stmt_block *) stmt;
+                    check_stmts(block_stmt->body);
+
+                    if (block_stmt->exceptions)
+                    {
+                        for (j = 0; j < block_stmt->exceptions->exceptions_used; j++)
+                            check_stmts(block_stmt->exceptions->exceptions[j]->action);
+
+                        return;
+                    }
+                }
+                break;
+            case PLPGSQL_STMT_IF:
+                check_stmts(((PLpgSQL_stmt_if *) stmt)->true_body);
+                check_stmts(((PLpgSQL_stmt_if *) stmt)->false_body);
+                break;
+            case PLPGSQL_STMT_LOOP:
+                check_stmts(((PLpgSQL_stmt_loop *) stmt)->body);
+                break;
+            case PLPGSQL_STMT_WHILE:
+                check_stmts(((PLpgSQL_stmt_while *) stmt)->body);
+                break;
+            case PLPGSQL_STMT_FORI:
+                check_stmts(((PLpgSQL_stmt_fori *) stmt)->body);
+                break;
+            case PLPGSQL_STMT_FORS:
+                check_stmts(((PLpgSQL_stmt_fors *) stmt)->body);
+                break;
+            case PLPGSQL_STMT_DYNFORS:
+                check_stmts(((PLpgSQL_stmt_dynfors *) stmt)->body);
+                break;
+            default:
+                /* do nothing */
+                break;
+        }
+    }
+}
+
+/*
+ * Issue warnings about various ill-advised constructs in the function
+ * body. We don't do very many compile-time checks at the moment, but
+ * a few is better than none...
+ */
+static void
+check_function(PLpgSQL_function *function)
+{
+    check_stmts(function->action->body);
+}
+
+/*
  * Fetch the argument names, if any, from the proargnames field of the
  * pg_proc tuple.  Results are palloc'd.
  */
--- src/test/regress/expected/plpgsql.out
+++ src/test/regress/expected/plpgsql.out
@@ -2002,3 +2002,111 @@
 DETAIL:  Key (f1)=(2) is not present in table "master".
 drop function trap_foreign_key(int);
 drop function trap_foreign_key_2();
+--
+-- issue warnings about unreachable code
+--
+create function exit_warn() returns int as $$
+declare x int;
+begin
+    x := 5;
+    loop
+        x := x + 1;
+        exit;
+        x := x + 2;
+    end loop;
+    x := x + 3;
+    return x;
+end;$$ language 'plpgsql';
+WARNING:  assignment is unreachable, due to exit near line 6
+CONTEXT:  compile of PL/pgSQL function "exit_warn" near line 7
+create function exit_no_warn() returns int as $$
+declare x int;
+begin
+    x := 5;
+    loop
+        x := x + 5;
+        exit when x > 20;
+        x := x - 1;
+    end loop;
+    x := x + 3;
+    return x;
+end;$$ language 'plpgsql';
+create function return_warn() returns int as $$
+declare
+    a int;
+    b int;
+begin
+    a := 3;
+    b := 2;
+    if a > b then
+        return 5;
+        begin
+            return 10;
+        end;
+    end if;
+    return 15;
+end;$$ language 'plpgsql';
+WARNING:  block variables initialization is unreachable, due to return near line 8
+CONTEXT:  compile of PL/pgSQL function "return_warn" near line 9
+create function return_warn2() returns int as $$
+begin
+    return 10;
+    return 15;
+    return 20;
+end;$$ language 'plpgsql';
+WARNING:  return is unreachable, due to return near line 2
+CONTEXT:  compile of PL/pgSQL function "return_warn2" near line 3
+WARNING:  return is unreachable, due to return near line 2
+CONTEXT:  compile of PL/pgSQL function "return_warn2" near line 4
+create function raise_warn() returns int as $$
+declare x int;
+begin
+    x := 10;
+    begin
+        raise exception 'some random exception';
+        return x;
+    exception
+        when RAISE_EXCEPTION then NULL;
+    end;
+
+    begin
+        raise exception 'foo';
+    exception
+        when RAISE_EXCEPTION then
+            return x;
+            x := x + 1; -- unreachable
+    end;
+
+    begin
+        raise notice 'not an exception';
+        return x;
+    end;
+end;$$ language 'plpgsql';
+WARNING:  return is unreachable, due to raise near line 5
+CONTEXT:  compile of PL/pgSQL function "raise_warn" near line 6
+-- note that we only want to emit warnings at CREATE FUNCTION time, not
+-- when the function is invoked.
+SELECT exit_warn();
+ exit_warn
+-----------
+         9
+(1 row)
+
+SELECT return_warn();
+ return_warn
+-------------
+           5
+(1 row)
+
+SELECT return_warn2();
+ return_warn2
+--------------
+           10
+(1 row)
+
+SELECT raise_warn();
+ raise_warn
+------------
+         10
+(1 row)
+
--- src/test/regress/sql/plpgsql.sql
+++ src/test/regress/sql/plpgsql.sql
@@ -1746,3 +1746,87 @@

 drop function trap_foreign_key(int);
 drop function trap_foreign_key_2();
+
+--
+-- issue warnings about unreachable code
+--
+create function exit_warn() returns int as $$
+declare x int;
+begin
+    x := 5;
+    loop
+        x := x + 1;
+        exit;
+        x := x + 2;
+    end loop;
+    x := x + 3;
+    return x;
+end;$$ language 'plpgsql';
+
+create function exit_no_warn() returns int as $$
+declare x int;
+begin
+    x := 5;
+    loop
+        x := x + 5;
+        exit when x > 20;
+        x := x - 1;
+    end loop;
+    x := x + 3;
+    return x;
+end;$$ language 'plpgsql';
+
+create function return_warn() returns int as $$
+declare
+    a int;
+    b int;
+begin
+    a := 3;
+    b := 2;
+    if a > b then
+        return 5;
+        begin
+            return 10;
+        end;
+    end if;
+    return 15;
+end;$$ language 'plpgsql';
+
+create function return_warn2() returns int as $$
+begin
+    return 10;
+    return 15;
+    return 20;
+end;$$ language 'plpgsql';
+
+create function raise_warn() returns int as $$
+declare x int;
+begin
+    x := 10;
+    begin
+        raise exception 'some random exception';
+        return x;
+    exception
+        when RAISE_EXCEPTION then NULL;
+    end;
+
+    begin
+        raise exception 'foo';
+    exception
+        when RAISE_EXCEPTION then
+            return x;
+            x := x + 1; -- unreachable
+    end;
+
+    begin
+        raise notice 'not an exception';
+        return x;
+    end;
+end;$$ language 'plpgsql';
+
+-- note that we only want to emit warnings at CREATE FUNCTION time, not
+-- when the function is invoked.
+SELECT exit_warn();
+SELECT return_warn();
+SELECT return_warn2();
+SELECT raise_warn();

Neil Conway <neilc@samurai.com> writes:
>> (BTW, another thing this example exposes is that we don't issue warnings
>> about trivially-dead-code, such as statements in a basic block that
>> follow a RETURN. This would probably be also worth doing.)

> Attached is a patch that implements this. Specifically, if there are any
> statements in the same block that follow a RETURN, EXIT (without
> condition) or RAISE EXCEPTION statement, we issue a warning at CREATE
> FUNCTION time:

I think it would be sufficient to warn about the statement immediately
following the RETURN, EXIT, etc.  The way you've got it could easily
bury the user in a mass of warning messages that don't really convey
any extra information.

You could possibly give two alternative messages:
    WARNING:  assignment is unreachable, due to exit near line 6
    WARNING:  assignment and following statement(s) are unreachable, due to exit near line 6
but I'm not sure that's worth the trouble.

Also, you must use ereport not elog for any user-facing error messages,
because elog messages aren't candidates for translation.

            regards, tom lane

Re: BUG #1329: Bug in IF-ELSEIF-ELSE construct

From
Tom Lane
Date:
Neil Conway <neilc@samurai.com> writes:
> Tom Lane wrote:
>> There is no ELSEIF construct.

> Sure, but it would be nice to throw a syntax error rather than silently
> accepting the function. Unfortunately the way PL/PgSQL's parser works
> doesn't make this very easy.

Actually, the simplest solution would be to just *allow* ELSEIF as a
variant spelling of ELSIF.  I cannot see any real good argument for
not doing that, and considering that we've seen two people make the
same mistake in the past month, my interest in doing it is increasing.

> (BTW, I think that fixing how we do parsing
> would be one of the prime motivations for rewriting PL/PgSQL.

Yeah, if we could think of a way.  Copying the main grammar a la ecpg is
definitely not the way :-(

> Alternatively, we could arrange to have the PL/PgSQL parser pass a block
> of text it has identified as a possible SQL statement to the main SQL
> parser; if it produces a syntax error, we can pass that syntax error
> back to the user. I'm not sure if this would have any negative
> ramifications, though.

This seems like the most appropriate answer to me; I was thinking of
doing that earlier when Fabien and I were fooling with plpgsql error
reporting, but didn't get around to it.

As long as you only do this in check_function_bodies mode it seems
safe enough.  One possible problem (if it's done towards the end of
parsing as you've suggested for dead-code checking) is that a syntax
error in a SQL statement might confuse the plpgsql parser into
misparsing statement boundaries, which could lead to a plpgsql parse
error much further down, such as a "missing END" at the end of the
function.  The error would be more useful if reported immediately
after the putative SQL statement is parsed.  Not sure if that's
hard or not.  (The same remark applies to dead code checking, now
that I think about it.)

            regards, tom lane

Re: BUG #1329: Bug in IF-ELSEIF-ELSE construct

From
Neil Conway
Date:
On Sat, 2004-11-27 at 12:55 -0500, Tom Lane wrote:
> This seems like the most appropriate answer to me; I was thinking of
> doing that earlier when Fabien and I were fooling with plpgsql error
> reporting, but didn't get around to it.

Attached is a patch that implements a rough draft of this (it also
includes an improved version of the "unreachable code" patch that
includes your suggested fixes). Two questions about the patch:

(1) It doesn't report syntax errors in unreachable code. I suppose this
ought to be done, right?

(2) The syntax error message is wrong (we print a character offset and
query context that is relative to the CREATE FUNCTION statement, not the
individual SQL statement we're executing). I fooled around a bit with
defining a custom ereport() callback to print the right line number and
query context, but I couldn't get it right. Do you have any guidance on
the proper way to do this.

> As long as you only do this in check_function_bodies mode it seems
> safe enough.  One possible problem (if it's done towards the end of
> parsing as you've suggested for dead-code checking) is that a syntax
> error in a SQL statement might confuse the plpgsql parser into
> misparsing statement boundaries, which could lead to a plpgsql parse
> error much further down, such as a "missing END" at the end of the
> function.  The error would be more useful if reported immediately
> after the putative SQL statement is parsed.  Not sure if that's
> hard or not.  (The same remark applies to dead code checking, now
> that I think about it.)

In the case of dead code checking, I don't think it matters. Doing the
syntax check in gram.y might be a better idea, I'll take a look doing
that...

-Neil


Attachment

Re: BUG #1329: Bug in IF-ELSEIF-ELSE construct

From
Tom Lane
Date:
Neil Conway <neilc@samurai.com> writes:
> (2) The syntax error message is wrong (we print a character offset and
> query context that is relative to the CREATE FUNCTION statement, not the
> individual SQL statement we're executing). I fooled around a bit with
> defining a custom ereport() callback to print the right line number and
> query context, but I couldn't get it right. Do you have any guidance on
> the proper way to do this.

Hmm ... I was about to say the SQL function validator already does this
(see sql_function_parse_error_callback in pg_proc.c), but it has the
advantage that there's a one-to-one correspondence between the string it
sends to the main parser and a substring of the original function text.
In plpgsql that's not true because of (a) substitution of parameter
symbols for names and (b) the liberties that the plpgsql lexer takes
with whitespace and eliding comments.

You might be best off just to strive for output like this:

    ERROR: syntax error at or near...
    QUERY: select ...
    CONTEXT: compile of plpgsql function "frob" at SQL statement line 12

which ought to be relatively easy to get.

BTW, don't forget to check SQL expressions (eg, the condition of an IF)
as well as SQL statements.  In the case of EXECUTE, you can check
the expression that gives rise to the text string.

>> The error would be more useful if reported immediately
>> after the putative SQL statement is parsed.  Not sure if that's
>> hard or not.  (The same remark applies to dead code checking, now
>> that I think about it.)

> In the case of dead code checking, I don't think it matters.

My thought was that a dead-code error could well indicate a problem
along the lines of a missing begin or end, and that flagging the
dead-code error would probably provide a much closer pointer to the
true problem than barfing at the end of the input when we realize
we have unmatched begin/end structure.  (Especially since the barf
will likely take the form of a nonspecific "syntax error" message.
Anytime you can do better than that, you're ahead.)  Similarly, checking
expressions immediately will help report mismatched-parenthesis problems
in a more useful way than we do now.

            regards, tom lane

Re: BUG #1329: Bug in IF-ELSEIF-ELSE construct

From
Tom Lane
Date:
Awhile back I wrote:
>> As long as you only do this in check_function_bodies mode it seems
>> safe enough.  One possible problem (if it's done towards the end of
>> parsing as you've suggested for dead-code checking) is that a syntax
>> error in a SQL statement might confuse the plpgsql parser into
>> misparsing statement boundaries, which could lead to a plpgsql parse
>> error much further down, such as a "missing END" at the end of the
>> function.  The error would be more useful if reported immediately
>> after the putative SQL statement is parsed.  Not sure if that's
>> hard or not.  (The same remark applies to dead code checking, now
>> that I think about it.)

A real-world example of why this would be useful can be seen at
http://archives.postgresql.org/pgsql-novice/2004-12/msg00223.php

The problem is a missing semicolon just before an IF construct.  If
the putative PERFORM were SQL-parsed right away, the user could see
what had been taken as the body of the PERFORM and would be able to
figure out his mistake.  If we continue plpgsql-parsing it's very
hard to see how we avoid generating an error that leads the user
to look in the wrong place.

            regards, tom lane