Thread: Where clause efficiency using "IN"

Where clause efficiency using "IN"

From

Sean Davis

Date:

22 October 2004, 16:06:53

Just a quick general question:  Can someone comment on using where
clauses like:

(sample = 2 OR sample = 3 OR sample = 4)

as compared to

sample in (2,3,4)

in terms of efficiency?

Thanks,
Sean

Re: Where clause efficiency using "IN"

From

Tom Lane

Date:

22 October 2004, 16:56:33

Sean Davis <sdavis2@mail.nih.gov> writes:
> Just a quick general question:  Can someone comment on using where
> clauses like:
>    (sample = 2 OR sample = 3 OR sample = 4)
> as compared to
>    sample in (2,3,4)
> in terms of efficiency?

PG's parser expands the latter into the former, so there's no difference
for us.  This is probably not true on other DBMSes.

            regards, tom lane

Re: Where clause efficiency using "IN"

From

Steven Klassen

Date:

22 October 2004, 17:00:14

* Sean Davis <sdavis2@mail.nih.gov> [2004-10-22 11:07:45 -0400]:

> Just a quick general question:  Can someone comment on using where
> clauses like:
>
> (sample = 2 OR sample = 3 OR sample = 4)
>
> as compared to
>
> sample in (2,3,4)
>
> in terms of efficiency?

The 'explain analyze' command is helpful for making these kinds of decisions. In 7.4.5 it's translated to an OR.

xinu=# explain analyze select id from users where id in (14, 30);
                                                  QUERY PLAN
---------------------------------------------------------------------------------------------------------------
 Seq Scan on users  (cost=100000000.00..100000012.50 rows=2 width=8) (actual time=0.045..0.050 rows=2 loops=1)
   Filter: ((id = 14) OR (id = 30))
 Total runtime: 0.079 ms
(3 rows)

xinu=# explain analyze select id from users where id = 14 or id = 30;
                                                  QUERY PLAN
---------------------------------------------------------------------------------------------------------------
 Seq Scan on users  (cost=100000000.00..100000012.50 rows=2 width=8) (actual time=0.044..0.049 rows=2 loops=1)
   Filter: ((id = 14) OR (id = 30))
 Total runtime: 0.077 ms
(3 rows)


HTH,

--
Steven Klassen - Lead Programmer
Command Prompt, Inc. - http://www.commandprompt.com/
PostgreSQL Replication & Support Services, (503) 667-4564

Re: Where clause efficiency using "IN"

From

"Scott Marlowe"

Date:

22 October 2004, 19:54:42

On Fri, 2004-10-22 at 09:07, Sean Davis wrote:
> Just a quick general question:  Can someone comment on using where
> clauses like:
>
> (sample = 2 OR sample = 3 OR sample = 4)
>
> as compared to
>
> sample in (2,3,4)
>
> in terms of efficiency?

I believe they are pretty much the same, in that the planner will turn
the in clause you have there into a series of ors.  However, if it is a
subselect, then the planner can use a hashed aggregate method, as long
as the dataset can fit into memory, as defined by some setting, (I think
it's sort_mem, but I'm not completely sure).