Thread: Stable function being evaluated more than once in a single query

Stable function being evaluated more than once in a single query

From
Mark Liberman
Date:
Hi,

I've got a set-returning function,  defined as STABLE, that I reference twice
within a single query, yet appears to be evaluated via two seperate function
scans.  I created a simple query that calls the function below and joins the
results to itself (Note:  in case you wonder why I'd do such a query, it's
not my actual query, which is much more complex.  I just created this simple
query to try to test out the 'stable' behavior).


select proname,provolatile from pg_proc where proname = 'get_tran_filesize';
             proname        | provolatile
----------------------------+-------------
 get_tran_filesize          | s
(1 row)


explain analyze
select * from
 get_tran_filesize('2005-12-11 00:00:00-08','2006-01-11
15:58:33-08','{228226,228222,228210}');

                                                              QUERY PLAN

--------------------------------------------------------------------------------------------------------------------------------------
 Function Scan on get_tran_filesize  (cost=0.00..12.50 rows=1000 width=40)
(actual time=49.522..49.524 rows=3 loops=1)
 Total runtime: 49.550 ms
(2 rows)


explain analyze
select * from
 get_tran_filesize('2005-12-11 00:00:00-08','2006-01-11
15:58:33-08','{228226,228222,228210}') gt,
 get_tran_filesize('2005-12-11 00:00:00-08','2006-01-11
15:58:33-08','{228226,228222,228210}') gt2
where gt.tran_id = gt2.tran_id;

                                                                      QUERY
PLAN

------------------------------------------------------------------------------------------------------------------------------------------------------
 Merge Join  (cost=124.66..204.66 rows=5000 width=80) (actual
time=83.027..83.040 rows=3 loops=1)
   Merge Cond: ("outer".tran_id = "inner".tran_id)
   ->  Sort  (cost=62.33..64.83 rows=1000 width=40) (actual
time=40.250..40.251 rows=3 loops=1)
         Sort Key: gt.tran_id
         ->  Function Scan on get_tran_filesize gt  (cost=0.00..12.50
rows=1000 width=40) (actual time=40.237..40.237 rows=3 loops=1)
   ->  Sort  (cost=62.33..64.83 rows=1000 width=40) (actual
time=42.765..42.767 rows=3 loops=1)
         Sort Key: gt2.tran_id
         ->  Function Scan on get_tran_filesize gt2  (cost=0.00..12.50
rows=1000 width=40) (actual time=42.748..42.751 rows=3 loops=1)
 Total runtime: 83.112 ms
(9 rows)


If I do get this working, then my question is, if I reference this function
within a single query, but within seperate subqueries within the query, will
it be re-evaluated each time, or just once.  Basically, I'm not clear on the
definition of "surrounding query" in the following exerpt from the Postgreql
documentation:

A STABLE function cannot modify the database and is guaranteed to return the
same results given the same arguments for all calls within a single
surrounding query.

Thanks,

Mark

Re: Stable function being evaluated more than once in a single query

From
Tom Lane
Date:
Mark Liberman <mliberman@mixedsignals.com> writes:
> I've got a set-returning function,  defined as STABLE, that I reference twice
> within a single query, yet appears to be evaluated via two seperate function
> scans.

There is no guarantee, express or implied, that this won't be the case.

(Seems like we just discussed this a couple days ago...)

            regards, tom lane

Re: Stable function being evaluated more than once in a single query

From
"Jim C. Nasby"
Date:
On Wed, Jan 11, 2006 at 11:33:23PM -0500, Tom Lane wrote:
> Mark Liberman <mliberman@mixedsignals.com> writes:
> > I've got a set-returning function,  defined as STABLE, that I reference twice
> > within a single query, yet appears to be evaluated via two seperate function
> > scans.
>
> There is no guarantee, express or implied, that this won't be the case.
>
> (Seems like we just discussed this a couple days ago...)

Well, from 32.6:

"This category allows the optimizer to optimize away multiple calls of
the function within a single query."

That could certainly be read as indicating that if the function is used
twice in one query it could be optimized to one call.

Is the issue that the optimizer won't combine two function calls (ie:
SELECT foo(..) ... WHERE foo(..)), or is it that sometimes it won't make
the optimization (maybe depending on the query plan, for example)?
--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461

Re: Stable function being evaluated more than once in a single query

From
Tom Lane
Date:
"Jim C. Nasby" <jnasby@pervasive.com> writes:
> Is the issue that the optimizer won't combine two function calls (ie:
> SELECT foo(..) ... WHERE foo(..)), or is it that sometimes it won't make
> the optimization (maybe depending on the query plan, for example)?

What the STABLE category actually does is give the planner permission to
use the function within an indexscan qualification, eg,
    WHERE indexed_column = f(42)
Since an indexscan involves evaluating the comparison expression just
once and using its value to search the index, this would be incorrect
if the expression's value might change from row to row.  (For VOLATILE
functions, we assume that the correct behavior is the naive SQL
semantics of actually computing the WHERE clause at each candidate row.)

There is no function cache and no checking for duplicate expressions.
I think we do check for duplicate aggregate expressions, but not
anything else.

            regards, tom lane

Re: Stable function being evaluated more than once in a single query

From
"Jim C. Nasby"
Date:
Adding -docs...

On Fri, Jan 13, 2006 at 07:27:28PM -0500, Tom Lane wrote:
> "Jim C. Nasby" <jnasby@pervasive.com> writes:
> > Is the issue that the optimizer won't combine two function calls (ie:
> > SELECT foo(..) ... WHERE foo(..)), or is it that sometimes it won't make
> > the optimization (maybe depending on the query plan, for example)?
>
> What the STABLE category actually does is give the planner permission to
> use the function within an indexscan qualification, eg,
>     WHERE indexed_column = f(42)
> Since an indexscan involves evaluating the comparison expression just
> once and using its value to search the index, this would be incorrect
> if the expression's value might change from row to row.  (For VOLATILE
> functions, we assume that the correct behavior is the naive SQL
> semantics of actually computing the WHERE clause at each candidate row.)
>
> There is no function cache and no checking for duplicate expressions.
> I think we do check for duplicate aggregate expressions, but not
> anything else.

In that case I'd say that the sSTABLE section of 32.6 should be changed
to read:

A STABLE function cannot modify the database and is guaranteed to
return the same results given the same arguments for all calls within a
single surrounding query. This category gives the planner permission to
use the function within an indexscan qualification. (Since an indexscan
involves evaluating the comparison expression just once and using its
value to search the index, this would be incorrect if the expression's
value might change from row to row.) There is no function cache and no
checking for duplicate expressions.

I can provide a patch to that effect if it's easier...

On a related note, would it be difficult to recognize multiple calls of
the same function in one query? ISTM that would be a win for all but the
most trivial of functions...
--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461

Re: Stable function being evaluated more than once in a single

From
Bruce Momjian
Date:
Here is updated documentation for STABLE.  I just changed a few words
for clarification.

---------------------------------------------------------------------------

Jim C. Nasby wrote:
> Adding -docs...
>
> On Fri, Jan 13, 2006 at 07:27:28PM -0500, Tom Lane wrote:
> > "Jim C. Nasby" <jnasby@pervasive.com> writes:
> > > Is the issue that the optimizer won't combine two function calls (ie:
> > > SELECT foo(..) ... WHERE foo(..)), or is it that sometimes it won't make
> > > the optimization (maybe depending on the query plan, for example)?
> >
> > What the STABLE category actually does is give the planner permission to
> > use the function within an indexscan qualification, eg,
> >     WHERE indexed_column = f(42)
> > Since an indexscan involves evaluating the comparison expression just
> > once and using its value to search the index, this would be incorrect
> > if the expression's value might change from row to row.  (For VOLATILE
> > functions, we assume that the correct behavior is the naive SQL
> > semantics of actually computing the WHERE clause at each candidate row.)
> >
> > There is no function cache and no checking for duplicate expressions.
> > I think we do check for duplicate aggregate expressions, but not
> > anything else.
>
> In that case I'd say that the sSTABLE section of 32.6 should be changed
> to read:
>
> A STABLE function cannot modify the database and is guaranteed to
> return the same results given the same arguments for all calls within a
> single surrounding query. This category gives the planner permission to
> use the function within an indexscan qualification. (Since an indexscan
> involves evaluating the comparison expression just once and using its
> value to search the index, this would be incorrect if the expression's
> value might change from row to row.) There is no function cache and no
> checking for duplicate expressions.
>
> I can provide a patch to that effect if it's easier...
>
> On a related note, would it be difficult to recognize multiple calls of
> the same function in one query? ISTM that would be a win for all but the
> most trivial of functions...
> --
> Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
> Pervasive Software      http://pervasive.com    work: 512-231-6117
> vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461
>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: if posting/reading through Usenet, please send an appropriate
>        subscribe-nomail command to majordomo@postgresql.org so that your
>        message can get through to the mailing list cleanly
>

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
Index: doc/src/sgml/xfunc.sgml
===================================================================
RCS file: /cvsroot/pgsql/doc/src/sgml/xfunc.sgml,v
retrieving revision 1.109
diff -c -c -r1.109 xfunc.sgml
*** doc/src/sgml/xfunc.sgml    29 Nov 2005 01:46:54 -0000    1.109
--- doc/src/sgml/xfunc.sgml    19 Jan 2006 22:43:58 -0000
***************
*** 899,911 ****
       <para>
        A <literal>STABLE</> function cannot modify the database and is
        guaranteed to return the same results given the same arguments
!       for all calls within a single surrounding query.  This category
!       allows the optimizer to optimize away multiple calls of the function
!       within a single query.  In particular, it is safe to use an expression
!       containing such a function in an index scan condition.  (Since an
!       index scan will evaluate the comparison value only once, not once at
!       each row, it is not valid to use a <literal>VOLATILE</> function in
!       an index scan condition.)
       </para>
      </listitem>
      <listitem>
--- 899,911 ----
       <para>
        A <literal>STABLE</> function cannot modify the database and is
        guaranteed to return the same results given the same arguments
!       for all rows within a single statement. This category allows the
!       optimizer to optimize multiple calls of the function to a single
!       call. In particular, it is safe to use an expression containing
!       such a function in an index scan condition. (Since an index scan
!       will evaluate the comparison value only once, not once at each
!       row, it is not valid to use a <literal>VOLATILE</> function in an
!       index scan condition.)
       </para>
      </listitem>
      <listitem>