Thread: BUG #2150: PL/Python function delcared STABLE gets run repeatedly

BUG #2150: PL/Python function delcared STABLE gets run repeatedly

From

"Aaron Swartz"

Date:

06 January 2006, 12:24:43

The following bug has been logged online:

Bug reference:      2150
Logged by:          Aaron Swartz
Email address:      me@aaronsw.com
PostgreSQL version: 8.1.1
Operating system:   Mac OS X
Description:        PL/Python function delcared STABLE gets run repeatedly
Details:

We have the PL/Python function get_foo(n) that returns an array. If this
function is declared IMMUTABLE, then:

SELECT * FROM tablenamehere WHERE id = any(get_foo(21));

returns quickly. If the function is declared STABLE, however, PostgreSQL
seems to incorrectly run it for every row in the table and it takes forever.

Re: BUG #2150: PL/Python function delcared STABLE gets run repeatedly

From

Tom Lane

Date:

06 January 2006, 12:42:44

"Aaron Swartz" <me@aaronsw.com> writes:
> We have the PL/Python function get_foo(n) that returns an array. If this
> function is declared IMMUTABLE, then:
> SELECT * FROM tablenamehere WHERE id = any(get_foo(21));
> returns quickly. If the function is declared STABLE, however, PostgreSQL
> seems to incorrectly run it for every row in the table and it takes forever.

= ANY isn't indexable in 8.1, so there's no obvious reason why either
version would be fast.  You're going to have to provide more detail if
you want help.  For starters, what does EXPLAIN ANALYZE show in each
case?  What's the schema of the table (\d output will do)?

            regards, tom lane

Re: BUG #2150: PL/Python function delcared STABLE gets run repeatedly

From

Tom Lane

Date:

06 January 2006, 13:43:09

Aaron Swartz <me@aaronsw.com> writes:
> The function we're running returns an array, and it is non-trivial to
> compute. We do a simple query with it like "SELECT * FROM
> tablename WHERE id = any(foo(21))").

> When the function is STABLE (or VOLATILE) the function is run
> to generate the array every time. If the function is IMMUTABLE,
> the array is computed only once for this query, as we'd expect,
> and the query is fast.

Oh, you are misunderstanding the point of IMMUTABLE/STABLE.
STABLE essentially gives the planner permission to use the function
in an indexscan qualification.  It does *not* cause any caching of
the function result in other contexts, which is what you seem to be
wishing would happen.

IMMUTABLE/STABLE/VOLATILE are promises from you to the system about
the behavior of the function, not promises from the system about
how it will choose to evaluate the function.

What I'd suggest is recasting the function to return a SETOF result
instead of an array, and then writing

SELECT * FROM tablename WHERE id IN (select * from foo(21))

This should get you a plan that will work reasonably well for you.

            regards, tom lane

Re: BUG #2150: PL/Python function delcared STABLE gets run repeatedly

From

"Jim C. Nasby"

Date:

06 January 2006, 15:43:33

On Fri, Jan 06, 2006 at 12:43:06PM -0500, Tom Lane wrote:
> Oh, you are misunderstanding the point of IMMUTABLE/STABLE.
> STABLE essentially gives the planner permission to use the function
> in an indexscan qualification.  It does *not* cause any caching of
> the function result in other contexts, which is what you seem to be
> wishing would happen.

Is caching of results for STABLE/IMMUTABLE functions called with a
constant something that would be reasonable to add? I certainly wish
this happened natively, but I've always hacked around it by

SELECT ... WHERE x = (SELECT foo(21))

so presumably some kind of query transform logic similar to what was
done for min/max might work.

I don't see anything about this on the TODO...
--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461

Re: BUG #2150: PL/Python function delcared STABLE gets run repeatedly

From

Tom Lane

Date:

06 January 2006, 15:49:03

"Jim C. Nasby" <jnasby@pervasive.com> writes:
> Is caching of results for STABLE/IMMUTABLE functions called with a
> constant something that would be reasonable to add?

I tend to think not: the distributed overhead would outweigh the
advantages.  See previous discussions in the archives.

(However, the previous discussions have imagined an explicit cache
that stores function names, arguments, and results.  Your thought
of transforming the call into an InitPlan node might be interesting.
The trick is still to know which functions are expensive enough to
justify the overhead of an InitPlan.)

            regards, tom lane

Re: BUG #2150: PL/Python function delcared STABLE gets run repeatedly

From

"Jim C. Nasby"

Date:

06 January 2006, 15:56:42

On Fri, Jan 06, 2006 at 02:48:59PM -0500, Tom Lane wrote:
> "Jim C. Nasby" <jnasby@pervasive.com> writes:
> > Is caching of results for STABLE/IMMUTABLE functions called with a
> > constant something that would be reasonable to add?
>
> I tend to think not: the distributed overhead would outweigh the
> advantages.  See previous discussions in the archives.
>
> (However, the previous discussions have imagined an explicit cache
> that stores function names, arguments, and results.  Your thought
> of transforming the call into an InitPlan node might be interesting.
> The trick is still to know which functions are expensive enough to
> justify the overhead of an InitPlan.)
>

Oh, yeah, actually caching results between queries doesn't make sense; I
think that's pretty much a corner-case that people should be expected to
code up/handle themselves. I'm just looking for only executing the
function once per query.

Yeah, some heuristics to try and tweak between adding a seperate
planning node and just running the function a few times would be nice,
but I suspect that's overkill. I think it would be fine to just assume
that any time you're going to execute a function more than X number of
times (where X should be easy to deduce during query planning), you'll
be better off taking the cost to tweak the plan to only run it once.

How does the overhead of an InitPlan compare to the overhead of calling
a SQL function? Or plpgsql? Or one of the external function languages?
--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461