Thread: plpgsql plan changes causing failure after repeated invocation

plpgsql plan changes causing failure after repeated invocation

From

Merlin Moncure

Date:

11 November 2014, 14:16:09

I chased down a problem today where users were reporting sporadic
failures in the application.   Turns out, the function would work
exactly 5 times and then fail; this is on 9.2.  I think I understand
why this is happening and I'm skeptical it's a bug in postgres, but I
thought I'd socialize it.

What's happening here is a query structured like this, somewhat deep
into a pl/pgsql function:

SELECT row_to_json(q) FROM
( SELECT * FROM (   <complex_inner_query> ) q LEFT JOIN foo f ON   _plpgsql_var != 'xxx'   AND (     (_plpgsql_var =
'yyy'and q.data::int = foo.foo_id)     OR (_plpgsql_var = 'zzz' and q.data = _other_var)   )
 
) q;

What is happening, along with some triggers I don't completely
understand (this problem started hitting when I made an unrelated
change in the function) is that the cast (q.data::int) started to
fail.  In cases where _plpgsql_var is not 'yyy', the cast was getting
applied where previously it did not.

The workaround was simple, insert a case statement so that q.data::int
becomes CASE WHEN _plpgsql_var = 'yyy' THEN q.data::int ELSE NULL END.
That being said, it does bring up some interesting points.

*) relying on A being checked first in 'A OR B' is obviously not
trustworthy, and it shouldn't be.  Generally I assume the planner will
do the cheaper of the two first (along with some extra encouragement
to put it on the left side), but this can't be relied upon.

*) It's possible to write queries so that they will fail depending on
plan choice. This is not good, and should be avoided when possible
(the query isn't great I'll admit), but the interaction with execution
count is a little unpleasant.

merlin

Re: plpgsql plan changes causing failure after repeated invocation

From

Albe Laurenz

Date:

11 November 2014, 14:28:13

Merlin Moncure wrote:
> I chased down a problem today where users were reporting sporadic
> failures in the application.   Turns out, the function would work
> exactly 5 times and then fail; this is on 9.2.  I think I understand
> why this is happening and I'm skeptical it's a bug in postgres, but I
> thought I'd socialize it.
> 
> What's happening here is a query structured like this, somewhat deep
> into a pl/pgsql function:

[...]
>       (_plpgsql_var = 'yyy' and q.data::int = foo.foo_id)
[...]

> What is happening, along with some triggers I don't completely
> understand (this problem started hitting when I made an unrelated
> change in the function) is that the cast (q.data::int) started to
> fail.  In cases where _plpgsql_var is not 'yyy', the cast was getting
> applied where previously it did not.
> 
> The workaround was simple, insert a case statement so that q.data::int
> becomes CASE WHEN _plpgsql_var = 'yyy' THEN q.data::int ELSE NULL END.
> That being said, it does bring up some interesting points.
> 
> *) relying on A being checked first in 'A OR B' is obviously not
> trustworthy, and it shouldn't be.  Generally I assume the planner will
> do the cheaper of the two first (along with some extra encouragement
> to put it on the left side), but this can't be relied upon.
> 
> *) It's possible to write queries so that they will fail depending on
> plan choice. This is not good, and should be avoided when possible
> (the query isn't great I'll admit), but the interaction with execution
> count is a little unpleasant.

This must be the "custom plan" feature added in 9.2 switching over to
a generic plan after 5 executions.

But you are right, it is not nice.

Yours,
Laurenz Albe