Thread: pass in array to function for use by where clause? how optimize?

pass in array to function for use by where clause? how optimize?

From

Anish Kejariwal

Date:

13 April 2011, 03:58:41

<p class="p1">I have the following query:<p class="p1">select seta.store_id, avg(seta.sales) avg_seta, avg(setb.sales)
avg_setb<pclass="p1">from<p class="p1">(select store_id, avg(sales) sales<p class="p1">from store where group_id
in(10,11,12)<pclass="p1">group by store_id<p class="p1">) seta,<p class="p1">(select  store_id, avg(sales) sales<p
class="p1">fromstore where group_id in(13,14,15)<p class="p1">group by store_id<p class="p1">) setb<p class="p1">where
seta.store_id= setb.store_id;<p class="p2"><br /><p class="p1">I want to have this query in a function, so that I can
passin arrays for the group IDs.  I tried the following, but it's much too slow.  I would query the following via:<p
class="p1">select* from store_avg('{10,11,12}','{13,14,15}');<p class="p2"><br /><p class="p1">create or replace
functionstore_avg () returns setof store_avg_type as <p class="p1">$$<p class="p1">select seta.store_id,
avg(seta.sales)avg_seta, avg(setb.sales) avg_setb<p class="p1">from<p class="p1">(select store_id, avg(sales) sales<p
class="p1">fromstore<p class="p1">where group_id in(select $1[i] from generate_subscripts($1, 1) g(i))<p
class="p1">groupby store_id<p class="p1">) seta,<p class="p1">(select  store_id, avg(sales) sales<p class="p1">from
store<pclass="p1">where group_id in(select $1[i] from generate_subscripts($1, 1) g(i))<p class="p1">group by store_id<p
class="p1">)setb<p class="p1">where seta.store_id = setb.store_id;<p class="p1">$$ language 'sql';<p class="p2"><br
/><pclass="p1">The above are of course fake queries, but in my much more complex case, is 10 seconds when I have the
group_idshard code, and takes 55 seconds when using the gneerate_subscripts.  My assumption, is that optimizer doesn't
workwell with generate_subscripts.  What is the best way to do this?  Should I do this as plpgsql function, and somehow
definethe set of ints at the beginning of the function?  How would i do this? <p class="p1"><br /><p class="p1">Thanks
somuch!  I appreciate your help.<p class="p2"><br /><p class="p1">Anish

Re: pass in array to function for use by where clause? how optimize?

From

Tom Lane

Date:

13 April 2011, 12:15:09

Anish Kejariwal <anishkej@gmail.com> writes:
> (select store_id, avg(sales) sales
> from store
> where group_id in(select $1[i] from generate_subscripts($1, 1) g(i))

Seems like a pretty brute-force way to deal with the array.  Try

where group_id = any($1)

Not sure if it'll be a great deal faster, but it's at least easier to
write.
        regards, tom lane

Re: pass in array to function for use by where clause? how optimize?

From

Rob Sargent

Date:

13 April 2011, 12:48:57


On 04/13/2011 09:09 AM, Tom Lane wrote:
> Anish Kejariwal<anishkej@gmail.com>  writes:
>> (select store_id, avg(sales) sales
>> from store
>> where group_id in(select $1[i] from generate_subscripts($1, 1) g(i))
>
> Seems like a pretty brute-force way to deal with the array.  Try
>
> where group_id = any($1)
>
> Not sure if it'll be a great deal faster, but it's at least easier to
> write.
>
>             regards, tom lane
>
Would adding a subservient function which actually uses the value of the 
iteration (group_id) as a single parameter be of any use?

rjs

Re: pass in array to function for use by where clause? how optimize?

From

Tom Lane

Date:

13 April 2011, 13:16:17

Rob Sargent <robjsargent@gmail.com> writes:
> On 04/13/2011 09:09 AM, Tom Lane wrote:
>> Anish Kejariwal<anishkej@gmail.com>  writes:
>>> (select store_id, avg(sales) sales
>>> from store
>>> where group_id in(select $1[i] from generate_subscripts($1, 1) g(i))

>> Seems like a pretty brute-force way to deal with the array.  Try
>> where group_id = any($1)
>> Not sure if it'll be a great deal faster, but it's at least easier to
>> write.

> Would adding a subservient function which actually uses the value of the 
> iteration (group_id) as a single parameter be of any use?

Well, it'd be unlikely to solve the OP's actual complaint, which was
performance.  Hiding the semantics from the planner via a function
(which generally looks like a black box to the planner) is even worse
than hiding the semantics in a sub-SELECT, which I think is probably
the root cause of the performance issue here.
        regards, tom lane

Re: pass in array to function for use by where clause? how optimize?

From

Anish Kejariwal

Date:

13 April 2011, 16:04:01

Thanks Tom! I've never used the any function before. It totally fixed the issue. It's easier to code, makes more sense, and it fixed my performance issue.

My guess is that the optimizer does not how to deal with generate_subscripts??

Thanks!!

Anish

On Wed, Apr 13, 2011 at 9:15 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Rob Sargent <robjsargent@gmail.com> writes:
> On 04/13/2011 09:09 AM, Tom Lane wrote:
>> Anish Kejariwal<anishkej@gmail.com> writes:
>>> (select store_id, avg(sales) sales
>>> from store
>>> where group_id in(select $1[i] from generate_subscripts($1, 1) g(i))

>> Seems like a pretty brute-force way to deal with the array. Try
>> where group_id = any($1)
>> Not sure if it'll be a great deal faster, but it's at least easier to
>> write.

> Would adding a subservient function which actually uses the value of the
> iteration (group_id) as a single parameter be of any use?

Well, it'd be unlikely to solve the OP's actual complaint, which was
performance. Hiding the semantics from the planner via a function
(which generally looks like a black box to the planner) is even worse
than hiding the semantics in a sub-SELECT, which I think is probably
the root cause of the performance issue here.

regards, tom lane

--
Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql