Thread: expensive query

expensive query

From
Postgres SQL
Date:
I have to search a data base for a variable number terms, checking
certain fields against the terms, provided all the terms are matched,
while each may match any field, I want that row returned. My problem is,
that this query always takes too long -- and I need it done quickly.To illustrate, a fictitious similar query would
looklike this,
 
two terms (a & b), and four fields (one, two, three, four):
SELECT one, two, three, fourFROM sometableWHERE (one ~* 'a' OR two ~* 'a' OR three ~* 'a' OR four ~* 'a')AND (one ~*
'b'OR two ~* 'b' OR three ~* 'b' OR four ~* 'b'); 
 


to complicate things, when I perform this, it's actually a join between
three tables, one having 800+ rows. I intend to waste space and save time
by combining all this info into one table. But this still won't be very
fast, is there any way to make this fast? One thing I'm considering is
putting all of this data into one field so that only one text search per
term per record is made, so that instead of doing ~6400 text searches, I
only need ~1600, in the above example. Does this sound like it would help
any? Are there any other suggestions?





Re: [SQL] expensive query

From
Tom Lane
Date:
Postgres SQL <postgres@phoenix.isn.net> writes:
>     To illustrate, a fictitious similar query would look like this,
> two terms (a & b), and four fields (one, two, three, four):

>     SELECT one, two, three, four
>     FROM sometable
>     WHERE (one ~* 'a' OR two ~* 'a' OR three ~* 'a' OR four ~* 'a')
>     AND (one ~* 'b' OR two ~* 'b' OR three ~* 'b' OR four ~* 'b'); 

Hm.  This is going to be slow because the system has no alternative
but to examine every tuple and compute the WHERE expression on it.
What you need to make this fast is to make it possible to use an index
to narrow down the number of tuples that need to be looked at.  If all
the regexps were anchored left (~* '^a' etc) then an index on the text
field could be used to select out just the tuples starting with 'a'.
I imagine you don't want to restrict the regexps that much, though.

If you're looking for keywords, you could consider making a table
showing all the keywords appearing in each tuple, and then indexing
that table.  Also take a look at contrib/fulltextindex to see if you
can adapt its ideas to your needs.
        regards, tom lane