Re: Invalid optimization of VOLATILE function in WHERE clause? - Mailing list pgsql-hackers

From Kevin Grittner
Subject Re: Invalid optimization of VOLATILE function in WHERE clause?
Date
Msg-id 5059E40B020000250004A521@gw.wicourts.gov
Whole thread Raw
In response to Re: Invalid optimization of VOLATILE function in WHERE clause?  ("ktm@rice.edu" <ktm@rice.edu>)
Responses Re: Invalid optimization of VOLATILE function in WHERE clause?  ("Kevin Grittner" <Kevin.Grittner@wicourts.gov>)
Re: Invalid optimization of VOLATILE function in WHERE clause?  ("David Johnston" <polobo@yahoo.com>)
List pgsql-hackers
"ktm@rice.edu" <ktm@rice.edu> wrote:
> On Wed, Sep 19, 2012 at 02:39:12PM -0500, Kevin Grittner wrote:
>> In another thread, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> 2. Apply the WHERE condition to each row from 1, and drop rows
>>> that don't pass it.
>>  
>> People expect that the results will be consistent with this
>> model, even if the implementation is optimized "under the
>> covers".  I think correct semantics should trump performance
>> here.
> It seems like this is what happens here except that the function
> is evaluated once for the WHERE and not once per ROW. Both of
> these meet the criterion for 2 above and Tom's earlier comments
> both hold.
There really needs to be some way to specify that when an expression
is evaluated for each row in a set, a function used within that
expression is not optimized away for some rows.  Fortunately we have
a way:
http://www.postgresql.org/docs/9.2/interactive/sql-createfunction.html
| VOLATILE indicates that the function value can change even within
| a single table scan, so no optimizations can be made. Relatively
| few database functions are volatile in this sense; some examples
| are random(), [...]
The behavior in the OP's query would certainly be sane if the
function were not VOLATILE; as it is, I have a hard time seeing this
as anything but a bug.

There is a workaround, if you don't mind ugly:
CREATE FUNCTION random_really_i_mean_it(dummy int) RETURNS double precision LANGUAGE plpgsql VOLATILE
AS $$
BEGIN -- no need to reference dummy parameter RETURN random();
END;
$$;

WITH source AS (   SELECT i FROM generate_series(1,10) AS i
)
SELECT   i
FROM   source, (       SELECT           count(*) AS _n       FROM source   ) AS _stats
WHERE   random_really_i_mean_it(i) < 5::DOUBLE PRECISION/_n;
-Kevin



pgsql-hackers by date:

Previous
From: Simon Riggs
Date:
Subject: Re: [v9.3] Extra Daemons (Re: elegant and effective way for running jobs inside a database)
Next
From: "Kevin Grittner"
Date:
Subject: Re: Invalid optimization of VOLATILE function in WHERE clause?