Thread: where (x,y,z) in ((x1,y1, z1), (x1,y1, z1), (x1,y1, z1), (x2,y2, z2)) (not) optimized
where (x,y,z) in ((x1,y1, z1), (x1,y1, z1), (x1,y1, z1), (x2,y2, z2)) (not) optimized
From
Grzegorz Jaśkiewicz
Date:
Hey folks, I have question really for all mighty developers, but don't want to spam -hackers with it. why : select * from foo where X in (1,1,1,1,1,1,1,1) --- same values in search. or select * from foo where (x,y) in ((1,2),(1,2),(1,2),(1,2),(1,2),(1,2),(1,2)); never gets optimized by planner, etc ? Is it just not worth optimizing from pg side? I am sure, it would make sense to actually reorder these values, so that index/whatnot could pick it up faster. Just another one of those, 'why' (not) questions from my side. thanks. -- GJ
Re: where (x,y,z) in ((x1,y1, z1), (x1,y1, z1), (x1,y1, z1), (x2,y2, z2)) (not) optimized
From
Filip Rembiałkowski
Date:
2009/1/26 Grzegorz Jaśkiewicz <gryzman@gmail.com>
I would guess that optimizing silly-written queries was always a low-priority task...
IMHO this is good topic for -hackers list.. and probably not so hard to implement :)
BTW, test on CVS HEAD:
CREATE TABLE atest(id integer primary key);
insert into atest select x from generate_series(1,100000) x(x);
ANALYZE atest;
EXPLAIN ANALYZE SELECT * FROM atest where id in (1,2,3,3,3,3,3,3,3,5,5,5,5,5,5,5,5,5,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1);
EXPLAIN ANALYZE SELECT * FROM atest where id in (1,2,3,5);
shows that second query is 2.5 times faster than the first ( 0.170 ms / 0.070 ms).
Hey folks,
I have question really for all mighty developers, but don't want to
spam -hackers with it.
why :
select * from foo where X in (1,1,1,1,1,1,1,1) --- same values in search.
or select * from foo where (x,y) in ((1,2),(1,2),(1,2),(1,2),(1,2),(1,2),(1,2));
never gets optimized by planner, etc ?
I would guess that optimizing silly-written queries was always a low-priority task...
IMHO this is good topic for -hackers list.. and probably not so hard to implement :)
BTW, test on CVS HEAD:
CREATE TABLE atest(id integer primary key);
insert into atest select x from generate_series(1,100000) x(x);
ANALYZE atest;
EXPLAIN ANALYZE SELECT * FROM atest where id in (1,2,3,3,3,3,3,3,3,5,5,5,5,5,5,5,5,5,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1);
EXPLAIN ANALYZE SELECT * FROM atest where id in (1,2,3,5);
shows that second query is 2.5 times faster than the first ( 0.170 ms / 0.070 ms).
Is it just not worth optimizing from pg side? I am sure, it would make
sense to actually reorder these values, so that index/whatnot could
pick it up faster.
Just another one of those, 'why' (not) questions from my side.
thanks.
--
GJ
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
--
Filip Rembiałkowski
Re: where (x,y,z) in ((x1,y1, z1), (x1,y1, z1), (x1,y1, z1), (x2,y2, z2)) (not) optimized
From
Grzegorz Jaśkiewicz
Date:
On Mon, Jan 26, 2009 at 1:58 PM, Filip Rembiałkowski <plk.zuber@gmail.com> wrote: > 2009/1/26 Grzegorz Jaśkiewicz <gryzman@gmail.com> >> >> Hey folks, >> >> I have question really for all mighty developers, but don't want to >> spam -hackers with it. >> >> why : >> select * from foo where X in (1,1,1,1,1,1,1,1) --- same values in search. >> or select * from foo where (x,y) in >> ((1,2),(1,2),(1,2),(1,2),(1,2),(1,2),(1,2)); >> >> never gets optimized by planner, etc ? > > I would guess that optimizing silly-written queries was always a > low-priority task... > EXPLAIN ANALYZE SELECT * FROM atest where id in > (1,2,3,3,3,3,3,3,3,5,5,5,5,5,5,5,5,5,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1); > EXPLAIN ANALYZE SELECT * FROM atest where id in (1,2,3,5); > > shows that second query is 2.5 times faster than the first ( 0.170 ms / > 0.070 ms). the difference isn't so small than :) silly or not, sometimes you end up with such collection passed on in some silly languages. -- GJ
Re: where (x,y,z) in ((x1,y1, z1), (x1,y1, z1), (x1,y1, z1), (x2,y2, z2)) (not) optimized
From
Tom Lane
Date:
=?UTF-8?Q?Grzegorz_Ja=C5=9Bkiewicz?= <gryzman@gmail.com> writes: > why : > select * from foo where X in (1,1,1,1,1,1,1,1) --- same values in search. > never gets optimized by planner, etc ? We actually did that, before 8.0 or thereabouts. It was removed because it cost a lot of planner cycles for *every* query, and helped for only a small number of stupidly-written ones. In general, you can probably remove such duplicates on the client side more efficiently than the database could anyway (realizing that whatever it might do has to be datatype-independent). regards, tom lane