Thread: Performance of IN (...) vs. = ANY array[...]

Performance of IN (...) vs. = ANY array[...]

From
Benjamin Minshall
Date:
Greetings:

I'm running 8.1.4, and have noticed major differences in execution time
for plpgsql functions running queries that differ only in use of an
array such as:


slower_function( vals integer[] )
    [query] WHERE id = ANY vals;


faster_function( vals integer[] )
    vals_text := array_to_string( vals, ',' )
    EXECUTE '[query] WHERE id IN (' || vals_text || ')';


In general, there are about 10 integers in the lookup set on average and
50 max.

What are the advantages or disadvantages of using arrays in this
situation?  The = ANY array method makes plpgsql development cleaner,
but seems to really lack performance in certain cases.  What do you
recommend as the preferred method?

Thanks for your comments.

--
Benjamin Minshall <minshall@intellicon.biz>
Senior Developer -- Intellicon, Inc.
http://www.intellicon.biz

Attachment

Re: Performance of IN (...) vs. = ANY array[...]

From
Tom Lane
Date:
Benjamin Minshall <minshall@intellicon.biz> writes:
> What are the advantages or disadvantages of using arrays in this
> situation?  The = ANY array method makes plpgsql development cleaner,
> but seems to really lack performance in certain cases.

In existing releases, the form with IN (list-of-scalar-constants)
can be optimized into indexscan(s), but = ANY (array) isn't.

8.2 will treat them equivalently (in fact, it converts IN (...) to
= ANY (ARRAY[...]) !).  So depending on your time horizon, you might
wish to stick with whichever is cleaner for your calling code.

            regards, tom lane