Problem with large number of index conditions - Mailing list pgsql-bugs

From Brian Herlihy
Subject Problem with large number of index conditions
Date
Msg-id 20050801033823.9135.qmail@web52308.mail.yahoo.com
Whole thread Raw
List pgsql-bugs
Hi,

I posted this to pgsql-performance a month ago, but have yet to receive a
response, so I am trying this list instead.

I am using PostgreSQL 7.4.7

I have a query looking something like this:

SELECT * FROM tbl WHERE (n = 0::smallint AND (string = 'Hello' OR string =
'Pineapple' OR string = 'Kitten' OR  .....  string = 'Potato));

There is an index on (n, string) as well as an index on (string).  The casting
of n allows use of the (n, string) index.

The problem is that once the number of 'OR' conditions goes past around 500,
postgresql will revert to doing a sequential scan with a filter.  This kills
performance, as the table has 3 million rows.  Setting enable_seqscan to OFF
does not affect the query plan.

I have solved the problem by breaking the query into sections, as follows:

SELECT * FROM tbl WHERE (n = 0 AND (string = 'Hello' OR string = 'Pineapple' OR
string = 'Kitten' OR  .....  string = 'Piano'))
UNION
SELECT * FROM tbl WHERE (n = 0 AND (string = 'Panic' OR ... OR string =
'Potato));

where each subquery has no more then 200 conditions.  Each subquery uses the
index, and then the results are unioned.  This takes a few seconds to run,
whereas the sequential scan can take an hour (on a loaded system).

Is there any better solution?

Thanks,
Brian Herlihy

pgsql-bugs by date:

Previous
From: Marko Ristola
Date:
Subject: Re: Libpq free bug? (Was: [ODBC] IIS Postgres ASP ADO - Problem)
Next
From: "Aaron Harsh"
Date:
Subject: BUG #1800: "unexpected chunk number" during pg_dump