Thread: plpgsql versus long ELSIF chains

plpgsql versus long ELSIF chains

From
Tom Lane
Date:
Some off-list discussion found that the cause of this problem:
http://archives.postgresql.org/pgsql-general/2011-10/msg00879.php
was an attempt to write a plpgsql IF-ELSIF-ELSIF-ELSIF-ELSIF-...-ELSE
statement with five thousand branches.  Putting aside the wisdom of
doing that, it seems like the parser ought not fall over when you do.
The reason it's doing so is that plpgsql's grammar handles ELSIF with a
rule like this:

stmt_else: K_ELSIF expr_until_then proc_sect stmt_else

This is called right recursion in the Bison manual (as opposed to
left recursion, where a nonterminal has itself as the *first*
item in its own expansion), and the manual specifically says you
should always avoid right recursion in favor of left recursion,
else you are vulnerable to parser stack overflow.  Duh.

So, looking a bit more closely into why it's done that way, it's
because ELSIF was shoehorned into a parsetree representation that
was only meant for plain IF-THEN-ELSE.  The generated parse tree
will have a new level of PLpgSQL_stmt_if struct for each ELSIF.
That means that even if we fix the grammar, we're still at risk of
stack overflow at execution time because of recursion in pl_exec.c.

So really what needs to be done here is to make ELSIF chains explicit in
the parsetree representation, and handle them via looping not recursion
at runtime.  This will also make it a lot easier to do the grammar via
left-recursion.

So I'm going to go off and do that, but I wonder whether anyone thinks
this is sufficiently important to back-patch.  I'm inclined to think
that back-patching isn't a good idea, because changing the
representation of PLpgSQL_stmt_if will break (at least) EDB's plpgsql
debugger; ISTM the number of complaints isn't enough to warrant doing
that in released branches.

More generally, it seems like a good idea to check whether there are
any other unnecessary uses of right recursion in the core and plpgsql
grammars.  Unless somebody knows of an existing tool to do that,
I'm thinking of writing a little perl script to check for it (probably
by parsing the output of "bison -v") and adding the script to src/tools/
so we'll have it around to check again in the future.  Any objections?
        regards, tom lane


Re: plpgsql versus long ELSIF chains

From
"David E. Wheeler"
Date:
On Oct 27, 2011, at 9:18 AM, Tom Lane wrote:

> So I'm going to go off and do that, but I wonder whether anyone thinks
> this is sufficiently important to back-patch.  I'm inclined to think
> that back-patching isn't a good idea, because changing the
> representation of PLpgSQL_stmt_if will break (at least) EDB's plpgsql
> debugger; ISTM the number of complaints isn't enough to warrant doing
> that in released branches.

+1 to not back-patching. Seems like it doesn't come up all that often, right

Best,

David



Re: plpgsql versus long ELSIF chains

From
Heikki Linnakangas
Date:
On 27.10.2011 19:18, Tom Lane wrote:
> So really what needs to be done here is to make ELSIF chains explicit in
> the parsetree representation, and handle them via looping not recursion
> at runtime.  This will also make it a lot easier to do the grammar via
> left-recursion.
>
> So I'm going to go off and do that, but I wonder whether anyone thinks
> this is sufficiently important to back-patch.

This doesn't look like a bug to me, just an unoptimal implementation. So 
-1 for backpatching.

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com