Thread: OR or IN ?

OR or IN ?

From

Luca Ferrari

Date:

14 October 2008, 03:38:46

Hi all,
I've got a query with a long (>50) list of ORs, like the following:

SELECT colB, colC FROM table WHERE colA=X OR colA=Y OR colA=Z OR ....

Is there any difference in how postgresql manages the above query and the
following one?

SELECT colB, colC FROM table WHERE colA IN (X,Y,Z,...)

Which is the suggested index to use on colA to get better performances?

Thanks,
Luca

Re: OR or IN ?

From

"A. Kretschmer"

Date:

14 October 2008, 03:48:39

am  Tue, dem 14.10.2008, um  8:33:21 +0200 mailte Luca Ferrari folgendes:
> Hi all,
> I've got a query with a long (>50) list of ORs, like the following:
>
> SELECT colB, colC FROM table WHERE colA=X OR colA=Y OR colA=Z OR ....
>
> Is there any difference in how postgresql manages the above query and the
> following one?
>
> SELECT colB, colC FROM table WHERE colA IN (X,Y,Z,...)

Depends on the version: (same table foo)


8.1:
test=*# explain select * from foo where a in (1,2,3);
                     QUERY PLAN
-----------------------------------------------------
 Seq Scan on foo  (cost=0.00..47.45 rows=32 width=4)
   Filter: ((a = 1) OR (a = 2) OR (a = 3))


8.4devel:
test=# explain select * from foo where a in (1,2,3);
                     QUERY PLAN
-----------------------------------------------------
 Seq Scan on foo  (cost=0.00..43.00 rows=36 width=4)
   Filter: (a = ANY ('{1,2,3}'::integer[]))


As you can see, 8.1 rewrite the query to many ORs.



Andreas
--
Andreas Kretschmer
Kontakt:  Heynitz: 035242/47150,   D1: 0160/7141639 (mehr: -> Header)
GnuPG-ID:   0x3FFF606C, privat 0x7F4584DA   http://wwwkeys.de.pgp.net

Re: OR or IN ?

From

Harald Fuchs

Date:

14 October 2008, 06:09:32

In article <20081014064831.GB22137@a-kretschmer.de>,
"A. Kretschmer" <andreas.kretschmer@schollglas.com> writes:

> am  Tue, dem 14.10.2008, um  8:33:21 +0200 mailte Luca Ferrari folgendes:
>> Hi all,
>> I've got a query with a long (>50) list of ORs, like the following:
>>
>> SELECT colB, colC FROM table WHERE colA=X OR colA=Y OR colA=Z OR ....
>>
>> Is there any difference in how postgresql manages the above query and the
>> following one?
>>
>> SELECT colB, colC FROM table WHERE colA IN (X,Y,Z,...)

> Depends on the version: (same table foo)

> 8.1:
> test=*# explain select * from foo where a in (1,2,3);
>                      QUERY PLAN
> -----------------------------------------------------
>  Seq Scan on foo  (cost=0.00..47.45 rows=32 width=4)
>    Filter: ((a = 1) OR (a = 2) OR (a = 3))

> 8.4devel:
> test=# explain select * from foo where a in (1,2,3);
>                      QUERY PLAN
> -----------------------------------------------------
>  Seq Scan on foo  (cost=0.00..43.00 rows=36 width=4)
>    Filter: (a = ANY ('{1,2,3}'::integer[]))

> As you can see, 8.1 rewrite the query to many ORs.

I think that "OR or IN" is the wrong question.  Where do those >50
values come from?  If they come from a DB operation, just include this
operation in your query.  Otherwise, you should ask yourself how many
values you might get: >50, >500, >5000?  There's a point where it's
probably more efficient to COPY all those values into an indexed
temporary table, ANALYZE it, and JOIN it into your query.

Re: OR or IN ?

From

Scott Ribe

Date:

17 October 2008, 11:06:39

Older versions of PG were inefficient with larger numbers of elements in an
IN query, and in fact would error out with something about lack of stack
space if you used very many (a few hundred IIRC).

8.x something was supposed to have improved that. Using 8.3 recently, after
an "oopsie" with some development data, I inadvertently confirmed that it
works and performance is not too bad with >34,000 items in an IN clause ;-)


--
Scott Ribe
scott_ribe@killerbytes.com
http://www.killerbytes.com/
(303) 722-0567 voice

Re: OR or IN ?

From

"Andrus"

Date:

17 October 2008, 11:25:50

> 8.x something was supposed to have improved that.

Not in 8.0
8.0 (and maybe newer releases also) throw dreaded  "Stack lenght limit
exceeded"

 message if IN contains large number or items.

Andrus.

Re: OR or IN ?

From

Luca Ferrari

Date:

17 October 2008, 11:29:24

On Friday 17 October 2008 Scott Ribe's cat, walking on the keyboard, wrote:
> Older versions of PG were inefficient with larger numbers of elements in an
> IN query, and in fact would error out with something about lack of stack
> space if you used very many (a few hundred IIRC).
>
> 8.x something was supposed to have improved that. Using 8.3 recently, after
> an "oopsie" with some development data, I inadvertently confirmed that it
> works and performance is not too bad with >34,000 items in an IN clause ;-)

Interesting, since my queries sometimes expand to a few thousands of ORs (or
INs), and I had in fact stack problems and I had to expand it to around 20
MB.....

Luca

Re: OR or IN ?

From

"Scott Marlowe"

Date:

17 October 2008, 11:32:18

On Fri, Oct 17, 2008 at 8:24 AM, Andrus <kobruleht2@hot.ee> wrote:
>> 8.x something was supposed to have improved that.
>
> Not in 8.0
> 8.0 (and maybe newer releases also) throw dreaded  "Stack lenght limit
> exceeded"
>
> message if IN contains large number or items.

Have y'all checked them there release notes?  I'm sure they tell the
tale. (sorry too much True Blood...)

Re: OR or IN ?

From

"Andrus"

Date:

17 October 2008, 12:32:28

>> 8.0 (and maybe newer releases also) throw dreaded  "Stack lenght limit
>> exceeded"
>>
>> message if IN contains large number or items.
>
> Have y'all checked them there release notes?  I'm sure they tell the
> tale. (sorry too much True Blood...)

No.  I need to support 8.0.
After those crashes I changed all IN clauses to

docid = ANY ({1,2,3,4,5,7,8,96,5445,.., 34534534})

this takes the same number of characters per id as IN clause.

I hope that this does not have any bad effects.

Andrus.

Re: OR or IN ?

From

Tom Lane

Date:

17 October 2008, 13:37:47

"Andrus" <kobruleht2@hot.ee> writes:
> No.  I need to support 8.0.
> After those crashes I changed all IN clauses to
> docid = ANY ({1,2,3,4,5,7,8,96,5445,.., 34534534})

I'm afraid that hardly counts as "supporting" 8.0 (or 8.1 for that
matter).  The syntax may work but it can't be indexed.

            regards, tom lane

Re: OR or IN ?

From

"Andrus"

Date:

17 October 2008, 14:01:45

>> No.  I need to support 8.0.
>> After those crashes I changed all IN clauses to
>> docid = ANY ({1,2,3,4,5,7,8,96,5445,.., 34534534})
>
> I'm afraid that hardly counts as "supporting" 8.0 (or 8.1 for that
> matter).  The syntax may work but it can't be indexed.

I'm sorry. It seems that Richard Huxton suggestion in this mailing list

SELECT * FROM generate_series(1,10) s
WHERE s = ANY (string_to_array('1,3,5,7', ',')::int[]);

works in 8.0
I hope that this works as well as new array syntax.

Andrus.