Thread: pass in array to function for use by where clause? how optimize?
<p class="p1">I have the following query:<p class="p1">select seta.store_id, avg(seta.sales) avg_seta, avg(setb.sales) avg_setb<pclass="p1">from<p class="p1">(select store_id, avg(sales) sales<p class="p1">from store where group_id in(10,11,12)<pclass="p1">group by store_id<p class="p1">) seta,<p class="p1">(select store_id, avg(sales) sales<p class="p1">fromstore where group_id in(13,14,15)<p class="p1">group by store_id<p class="p1">) setb<p class="p1">where seta.store_id= setb.store_id;<p class="p2"><br /><p class="p1">I want to have this query in a function, so that I can passin arrays for the group IDs. I tried the following, but it's much too slow. I would query the following via:<p class="p1">select* from store_avg('{10,11,12}','{13,14,15}');<p class="p2"><br /><p class="p1">create or replace functionstore_avg () returns setof store_avg_type as <p class="p1">$$<p class="p1">select seta.store_id, avg(seta.sales)avg_seta, avg(setb.sales) avg_setb<p class="p1">from<p class="p1">(select store_id, avg(sales) sales<p class="p1">fromstore<p class="p1">where group_id in(select $1[i] from generate_subscripts($1, 1) g(i))<p class="p1">groupby store_id<p class="p1">) seta,<p class="p1">(select store_id, avg(sales) sales<p class="p1">from store<pclass="p1">where group_id in(select $1[i] from generate_subscripts($1, 1) g(i))<p class="p1">group by store_id<p class="p1">)setb<p class="p1">where seta.store_id = setb.store_id;<p class="p1">$$ language 'sql';<p class="p2"><br /><pclass="p1">The above are of course fake queries, but in my much more complex case, is 10 seconds when I have the group_idshard code, and takes 55 seconds when using the gneerate_subscripts. My assumption, is that optimizer doesn't workwell with generate_subscripts. What is the best way to do this? Should I do this as plpgsql function, and somehow definethe set of ints at the beginning of the function? How would i do this? <p class="p1"><br /><p class="p1">Thanks somuch! I appreciate your help.<p class="p2"><br /><p class="p1">Anish
Anish Kejariwal <anishkej@gmail.com> writes: > (select store_id, avg(sales) sales > from store > where group_id in(select $1[i] from generate_subscripts($1, 1) g(i)) Seems like a pretty brute-force way to deal with the array. Try where group_id = any($1) Not sure if it'll be a great deal faster, but it's at least easier to write. regards, tom lane
On 04/13/2011 09:09 AM, Tom Lane wrote: > Anish Kejariwal<anishkej@gmail.com> writes: >> (select store_id, avg(sales) sales >> from store >> where group_id in(select $1[i] from generate_subscripts($1, 1) g(i)) > > Seems like a pretty brute-force way to deal with the array. Try > > where group_id = any($1) > > Not sure if it'll be a great deal faster, but it's at least easier to > write. > > regards, tom lane > Would adding a subservient function which actually uses the value of the iteration (group_id) as a single parameter be of any use? rjs
Rob Sargent <robjsargent@gmail.com> writes: > On 04/13/2011 09:09 AM, Tom Lane wrote: >> Anish Kejariwal<anishkej@gmail.com> writes: >>> (select store_id, avg(sales) sales >>> from store >>> where group_id in(select $1[i] from generate_subscripts($1, 1) g(i)) >> Seems like a pretty brute-force way to deal with the array. Try >> where group_id = any($1) >> Not sure if it'll be a great deal faster, but it's at least easier to >> write. > Would adding a subservient function which actually uses the value of the > iteration (group_id) as a single parameter be of any use? Well, it'd be unlikely to solve the OP's actual complaint, which was performance. Hiding the semantics from the planner via a function (which generally looks like a black box to the planner) is even worse than hiding the semantics in a sub-SELECT, which I think is probably the root cause of the performance issue here. regards, tom lane
Thanks Tom! I've never used the any function before. It totally fixed the issue. It's easier to code, makes more sense, and it fixed my performance issue.
My guess is that the optimizer does not how to deal with generate_subscripts??
Thanks!!
Anish
On Wed, Apr 13, 2011 at 9:15 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Rob Sargent <robjsargent@gmail.com> writes:
> On 04/13/2011 09:09 AM, Tom Lane wrote:
>> Anish Kejariwal<anishkej@gmail.com> writes:
>>> (select store_id, avg(sales) sales
>>> from store
>>> where group_id in(select $1[i] from generate_subscripts($1, 1) g(i))
>> Seems like a pretty brute-force way to deal with the array. Try
>> where group_id = any($1)
>> Not sure if it'll be a great deal faster, but it's at least easier to
>> write.> Would adding a subservient function which actually uses the value of theWell, it'd be unlikely to solve the OP's actual complaint, which was
> iteration (group_id) as a single parameter be of any use?
performance. Hiding the semantics from the planner via a function
(which generally looks like a black box to the planner) is even worse
than hiding the semantics in a sub-SELECT, which I think is probably
the root cause of the performance issue here.
regards, tom lane
--
Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql