Thread: seq scan only when function not in subquery (bug?)

seq scan only when function not in subquery (bug?)

From
Brett McCormick
Date:
sas=> explain update user set usrid = 'aaaaaaaa' where usrseqid=usrseqid('zlb');

NOTICE:  QUERY PLAN:

Seq Scan on user  (cost=344.07 size=658 width=154)

EXPLAIN
sas=> explain update user set usrid = 'aaaaaaaa' where usrseqid=(select usrseqid('zlb'));
NOTICE:  QUERY PLAN:

Index Scan on user  (cost=1.05 size=1 width=154)
  InitPlan
    ->  Result  (cost=0.00 size=0 width=0)

EXPLAIN
sas=>


as you can see, it uses the index when the RHS of the comparison in the where
clause is a subquery, but a sequential scan when it isn't.  is this a bug?

Re: [HACKERS] seq scan only when function not in subquery (bug?)

From
Vadim Mikheev
Date:
Brett McCormick wrote:
>
> sas=> explain update user set usrid = 'aaaaaaaa'
> where usrseqid=usrseqid('zlb');
>
> NOTICE:  QUERY PLAN:
>
> Seq Scan on user  (cost=344.07 size=658 width=154)

...

> as you can see, it uses the index when the RHS of the comparison
> in the where clause is a subquery, but a sequential scan when it
> isn't.  is this a bug?

Yes, and very old :)

This is from my recent posting:
---
Another issue - handling of functions with constant args
in queries - for query

select * from T where A = upper ('bbb')

function upper ('bbb') will be executed for each tuple in T!
More of that - if there is index on T(A) then this index will
not be used for this query!
Obviously, upper ('bbb') should be executed (by Executor, not
parser/planner) once: new Param type (PARAM_EXEC) implemented
for subselects could help here too...
---

Actually, this is easy to fix...

Vadim

Re: [HACKERS] seq scan only when function not in subquery (bug?)

From
Brett McCormick
Date:
On Tue, 16 June 1998, at 10:32:39, Vadim Mikheev wrote:

> Another issue - handling of functions with constant args
> in queries - for query
>
> select * from T where A = upper ('bbb')
>
> function upper ('bbb') will be executed for each tuple in T!
> More of that - if there is index on T(A) then this index will
> not be used for this query!
> Obviously, upper ('bbb') should be executed (by Executor, not
> parser/planner) once: new Param type (PARAM_EXEC) implemented
> for subselects could help here too...
> ---
>
> Actually, this is easy to fix...

I was going to reply to this but never did -- how do you tell if it
needs to be executed once per query or once per tuple?  What if you
wanted to call a function which returned a different value for each
tuple, like random()?

Re: [HACKERS] seq scan only when function not in subquery (bug?)

From
dg@illustra.com (David Gould)
Date:
> On Tue, 16 June 1998, at 10:32:39, Vadim Mikheev wrote:
>
> > Another issue - handling of functions with constant args
> > in queries - for query
> >
> > select * from T where A = upper ('bbb')
> >
> > function upper ('bbb') will be executed for each tuple in T!
> > More of that - if there is index on T(A) then this index will
> > not be used for this query!
> > Obviously, upper ('bbb') should be executed (by Executor, not
> > parser/planner) once: new Param type (PARAM_EXEC) implemented
> > for subselects could help here too...
> > ---
> >
> > Actually, this is easy to fix...
>
> I was going to reply to this but never did -- how do you tell if it
> needs to be executed once per query or once per tuple?  What if you
> wanted to call a function which returned a different value for each
> tuple, like random()?

To make this work, you need an attribute in the functions table (and
internal info about the function) that tells if the function is "variant"
or not. A variant function can return different results with the same
arguments eg random(), or has side effects. A non variant function returns
the same result for the same arguments and has no side-effects.

If you have a non-variant function, then the easy way to optimize it is
to memoize the arguments and result of the last time you called it. Then
the next time you want to call it, check if the arguments are the same and
if so, merely return the previously saved result instead of calling the
function.

Example:

create function city_from_zipcode(integer) returns varchar not variant;

select name, street, city_from_zipcode(zipcode), zipcode
   from (select * from customers order by zipcode);

If customers was sorted by zipcode, this would only call city_from_zipcode()
each time the zipcode changed instead of for each row.

It would also cover the case of "function('constant');

-dg

David Gould           dg@illustra.com            510.628.3783 or 510.305.9468
Informix Software                      300 Lakeside Drive   Oakland, CA 94612
 - A child of five could understand this!  Fetch me a child of five.