Thread: Where clause efficiency using "IN"
Just a quick general question: Can someone comment on using where clauses like: (sample = 2 OR sample = 3 OR sample = 4) as compared to sample in (2,3,4) in terms of efficiency? Thanks, Sean
Sean Davis <sdavis2@mail.nih.gov> writes: > Just a quick general question: Can someone comment on using where > clauses like: > (sample = 2 OR sample = 3 OR sample = 4) > as compared to > sample in (2,3,4) > in terms of efficiency? PG's parser expands the latter into the former, so there's no difference for us. This is probably not true on other DBMSes. regards, tom lane
* Sean Davis <sdavis2@mail.nih.gov> [2004-10-22 11:07:45 -0400]: > Just a quick general question: Can someone comment on using where > clauses like: > > (sample = 2 OR sample = 3 OR sample = 4) > > as compared to > > sample in (2,3,4) > > in terms of efficiency? The 'explain analyze' command is helpful for making these kinds of decisions. In 7.4.5 it's translated to an OR. xinu=# explain analyze select id from users where id in (14, 30); QUERY PLAN --------------------------------------------------------------------------------------------------------------- Seq Scan on users (cost=100000000.00..100000012.50 rows=2 width=8) (actual time=0.045..0.050 rows=2 loops=1) Filter: ((id = 14) OR (id = 30)) Total runtime: 0.079 ms (3 rows) xinu=# explain analyze select id from users where id = 14 or id = 30; QUERY PLAN --------------------------------------------------------------------------------------------------------------- Seq Scan on users (cost=100000000.00..100000012.50 rows=2 width=8) (actual time=0.044..0.049 rows=2 loops=1) Filter: ((id = 14) OR (id = 30)) Total runtime: 0.077 ms (3 rows) HTH, -- Steven Klassen - Lead Programmer Command Prompt, Inc. - http://www.commandprompt.com/ PostgreSQL Replication & Support Services, (503) 667-4564
On Fri, 2004-10-22 at 09:07, Sean Davis wrote: > Just a quick general question: Can someone comment on using where > clauses like: > > (sample = 2 OR sample = 3 OR sample = 4) > > as compared to > > sample in (2,3,4) > > in terms of efficiency? I believe they are pretty much the same, in that the planner will turn the in clause you have there into a series of ors. However, if it is a subselect, then the planner can use a hashed aggregate method, as long as the dataset can fit into memory, as defined by some setting, (I think it's sort_mem, but I'm not completely sure).