Re: [SQL] Techniques for quickly finding words in a phrase... - Mailing list pgsql-sql

From Tom Lane
Subject Re: [SQL] Techniques for quickly finding words in a phrase...
Date
Msg-id 21821.950310523@sss.pgh.pa.us
Whole thread Raw
In response to Techniques for quickly finding words in a phrase...  ("Saltsgaver, Scott" <scottsa@aiinet.com>)
Responses Re: [SQL] Techniques for quickly finding words in a phrase...  (om <o@mueschke.de>)
List pgsql-sql
"Saltsgaver, Scott" <scottsa@aiinet.com> writes:
> SELECT p.id, phrase FROM Phrase AS p, PhraseWords AS pw
>     WHERE
>         ((p.id = pw.id) AND word LIKE 'WAS%')
>         AND EXISTS (SELECT id FROM PhraseWords AS pw
>                 WHERE (p.id = pw.id) AND word LIKE 'WHIT%')
>         AND EXISTS (SELECT id FROM PhraseWords AS pw
>                 WHERE (p.id = pw.id) AND word LIKE 'SNOW%');

> For some reason, the select still takes > 1 minute on a fairly decent
> sized Linux box (500Mhz, 128MB ram).

Subselects are pretty inefficient in Postgres at present.  Try rewriting
it as a join:

SELECT p.id, phrase FROM Phrase AS p, PhraseWords AS pw1,PhraseWords AS pw2, PhraseWords AS pw3
WHERE p.id = pw1.id AND pw1.word LIKE 'WAS%'AND p.id = pw2.id AND pw2.word LIKE 'WHIT%'AND p.id = pw3.id AND pw3.word
LIKE'SNOW%';
 

(with the obvious adjustments depending on how many words in your
search phrase).

If you are using search phrases with more than half a dozen words,
you will probably need to enable GEQO planning to avoid spending
an unreasonable amount of time in planning the query.  (If 'explain'
itself starts to take a long time, you are seeing excessive plan time.)
        regards, tom lane


pgsql-sql by date:

Previous
From: "Saltsgaver, Scott"
Date:
Subject: Techniques for quickly finding words in a phrase...
Next
From: phil@Stimpy.netroedge.com
Date:
Subject: Type casting bool as int4?