Thread: spurious function execution in prepared statements.

spurious function execution in prepared statements.

From
"Merlin Moncure"
Date:
OK, I have a situation that might be a performance problem, a bug, or an
unavoidable consequence of using prepared statements.  The short version
is that I am getting function executions for rows not returned in a
result set when they are in a prepared statement.

In other words, I have a query:
select f(t.c) from t where [boolean expr on t] limit 1;

because of the limit phrase, obviously, at most one record is returned
and f executes at most once regardless of the plan used (in practice,
sometimes index, sometimes seq_scan.

Now, if the same query is executed as a prepared statement,
prepare ps(...) as select f(t.c) from t where [expr] limit 1;
execute ps;

now, if ps ends up using a index scan on t, everything is ok.  However,
if ps does a seqscan, f executes for every row on t examined until the
[expr] criteria is met.  Is this a bug?  If necessary I should be able
to set up a reproducible example.  The easy workaround is to not use
prepared statements in these situations, but I need to be able to
guarantee that f only executes once (even if that means exploring
subqueries).

Merlin

Re: [HACKERS] spurious function execution in prepared statements.

From
Michael Adler
Date:
On Thu, Sep 30, 2004 at 09:45:51AM -0400, Merlin Moncure wrote:
> Now, if the same query is executed as a prepared statement,
> prepare ps(...) as select f(t.c) from t where [expr] limit 1;
> execute ps;
>
> now, if ps ends up using a index scan on t, everything is ok.  However,
> if ps does a seqscan, f executes for every row on t examined until the
> [expr] criteria is met.  Is this a bug?  If necessary I should be able
> to set up a reproducible example.  The easy workaround is to not use
> prepared statements in these situations, but I need to be able to
> guarantee that f only executes once (even if that means exploring
> subqueries).


Here's another workaround that may let you use a prepared statement:

prepare ps(...) as
select f(c) from (select c from t where [expr] limit 1) as t1

-Mike

Re: spurious function execution in prepared statements.

From
Stephan Szabo
Date:
On Thu, 30 Sep 2004, Merlin Moncure wrote:

> OK, I have a situation that might be a performance problem, a bug, or an
> unavoidable consequence of using prepared statements.  The short version
> is that I am getting function executions for rows not returned in a
> result set when they are in a prepared statement.
>
> In other words, I have a query:
> select f(t.c) from t where [boolean expr on t] limit 1;

An actual boolean expr on t? Or on a column in t?

> because of the limit phrase, obviously, at most one record is returned
> and f executes at most once regardless of the plan used (in practice,
> sometimes index, sometimes seq_scan.
>
> Now, if the same query is executed as a prepared statement,
> prepare ps(...) as select f(t.c) from t where [expr] limit 1;
> execute ps;
>
> now, if ps ends up using a index scan on t, everything is ok.  However,
> if ps does a seqscan, f executes for every row on t examined until the
> [expr] criteria is met.  Is this a bug?  If necessary I should be able
> to set up a reproducible example.  The easy workaround is to not use
> prepared statements in these situations, but I need to be able to
> guarantee that f only executes once (even if that means exploring
> subqueries).

I think a reproducible example would be good. Simple attempts to duplicate
this on 8.0b2 have failed for me, unless I'm using order by.

Re: [HACKERS] spurious function execution in prepared statements.

From
Tom Lane
Date:
"Merlin Moncure" <merlin.moncure@rcsonline.com> writes:
> now, if ps ends up using a index scan on t, everything is ok.  However,
> if ps does a seqscan, f executes for every row on t examined until the
> [expr] criteria is met.  Is this a bug?

Works for me.

regression=# create function f(int) returns int as '
regression'# begin
regression'#   raise notice ''f(%)'', $1;
regression'#   return $1;
regression'# end' language plpgsql;
CREATE FUNCTION
regression=# select f(unique2) from tenk1 where unique2%2 = 1 limit 2;
NOTICE:  f(1)
NOTICE:  f(3)
 f
---
 1
 3
(2 rows)

regression=# prepare ps as
regression-# select f(unique2) from tenk1 where unique2%2 = 1 limit 2;
PREPARE
regression=# execute ps;
NOTICE:  f(1)
NOTICE:  f(3)
 f
---
 1
 3
(2 rows)

regression=#

You sure you aren't using f() in the WHERE clause?

            regards, tom lane

Re: spurious function execution in prepared statements.

From
"Merlin Moncure"
Date:
Stephan Szabo wrote:
> On Thu, 30 Sep 2004, Merlin Moncure wrote:
>
> > OK, I have a situation that might be a performance problem, a bug,
or an
> > unavoidable consequence of using prepared statements.  The short
version
> > is that I am getting function executions for rows not returned in a
> > result set when they are in a prepared statement.
> An actual boolean expr on t? Or on a column in t?
[...]
> I think a reproducible example would be good. Simple attempts to
duplicate
> this on 8.0b2 have failed for me, unless I'm using order by.

Note: I confirmed that breaking out the 'where' part of the query into
subquery suppresses the behavior.

Here is the actual query:
select lock_cuid(id), *
    from data3.wclaim_line_file
    where wcl_vin_no >= '32-MHAB-C-X-7243' and
        (wcl_vin_no >  '32-MHAB-C-X-7243' or  wcl_claim_no >=
001) and
        (wcl_vin_no >  '32-MHAB-C-X-7243' or  wcl_claim_no >
001 or  id >  2671212)
    order by wcl_vin_no, wcl_claim_no, id
    limit 1


Here is the prepared statement declaration:
prepare data3_read_next_wclaim_line_file_1_lock (character varying,
numeric, int8, numeric)
    as select lock_cuid(id), *
    from data3.wclaim_line_file
    where wcl_vin_no >= $1 and
        (wcl_vin_no >  $1 or  wcl_claim_no >= $2) and
        (wcl_vin_no >  $1 or  wcl_claim_no >  $2 or  id >  $3)
    order by wcl_vin_no, wcl_claim_no, id limit $4


Here is the plan when it runs lock_cuid repeatedly (aside: disabling
seqscans causes an index plan, but that's not the point):

esp=# explain execute data3_read_next_wclaim_line_file_1_lock
('32-MHAB-C-X-7243', 001, 2671212, 1);


           QUERY PLAN

------------------------------------------------------------------------
----------------------------
------------------------------------------------------------------------
----------------------------
--------------------------------
 Limit  (cost=13108.95..13162.93 rows=21592 width=260)
   ->  Sort  (cost=13108.95..13162.93 rows=21592 width=260)
         Sort Key: wcl_vin_no, wcl_claim_no, id
         ->  Seq Scan on wclaim_line_file  (cost=0.00..11554.52
rows=21592 width=260)
               Filter: (((wcl_vin_no)::text >= ($1)::text) AND
(((wcl_vin_no)::text > ($1)::text) OR
 ((wcl_claim_no)::numeric >= $2)) AND (((wcl_vin_no)::text > ($1)::text)
OR ((wcl_claim_no)::numeric
 > $2) OR ((id)::bigint > $3)))
(5 rows)

Re: spurious function execution in prepared statements.

From
Tom Lane
Date:
"Merlin Moncure" <merlin.moncure@rcsonline.com> writes:
> Here is the actual query:
> select lock_cuid(id), *
> ...
>     order by wcl_vin_no, wcl_claim_no, id
>     limit 1

Looks like Stephan made the right guess.

Logically the LIMIT executes after the ORDER BY, so the sorted result
has to be formed completely.  The fact that we are able to optimize
this in some cases does not represent a promise that we can do it in
all cases.  Ergo, it's not a bug.

            regards, tom lane