Hi,
I posted this to pgsql-performance a month ago, but have yet to receive a
response, so I am trying this list instead.
I am using PostgreSQL 7.4.7
I have a query looking something like this:
SELECT * FROM tbl WHERE (n = 0::smallint AND (string = 'Hello' OR string =
'Pineapple' OR string = 'Kitten' OR ..... string = 'Potato));
There is an index on (n, string) as well as an index on (string). The casting
of n allows use of the (n, string) index.
The problem is that once the number of 'OR' conditions goes past around 500,
postgresql will revert to doing a sequential scan with a filter. This kills
performance, as the table has 3 million rows. Setting enable_seqscan to OFF
does not affect the query plan.
I have solved the problem by breaking the query into sections, as follows:
SELECT * FROM tbl WHERE (n = 0 AND (string = 'Hello' OR string = 'Pineapple' OR
string = 'Kitten' OR ..... string = 'Piano'))
UNION
SELECT * FROM tbl WHERE (n = 0 AND (string = 'Panic' OR ... OR string =
'Potato));
where each subquery has no more then 200 conditions. Each subquery uses the
index, and then the results are unioned. This takes a few seconds to run,
whereas the sequential scan can take an hour (on a loaded system).
Is there any better solution?
Thanks,
Brian Herlihy