From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Reid Thompson
Sent: Thursday, October 28, 2010 12:57 PM
To: steve@subwest.com
Cc: Reid Thompson; pgsql-general@postgresql.org
Subject: Re: [GENERAL] Full Text Search - Slow on common words
On Thu, 2010-10-28 at 12:08 -0700, sub3 wrote:
> Hi,
>
> I have a small web page set up to search within my domain based on keywords.
> One of the queries is:
> SELECT page.id, ts_rank_cd('{1.0, 1.0, 1.0, 1.0}',contFTI,q) FROM page,
> to_tsquery('steve') as q WHERE contFTI @@ q
>
> My problem is: when someone puts in a commonly seen word, the system slows
> down and takes a while because of the large amount of data being returned
> (retrieved from the table) & processed by the ts_rank_cd function.
>
> How does everyone else handle something like this? I can only think of 2
> possible solutions:
> - change the query to search for the same terms at least twice in the same
> document (can I do that?)
> - limit any searches to x results before ranking & tell the user their
> search criteria is too generic.
>
> Is there a better solution that I am missing?
>
If the keyword is that common, is it really a keyword? Exclude it.
>>
This general idea is called a stopword list. You create a list of words that are so common that searching on them is counter-productive.
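
As a rough sketch of how that could look in PostgreSQL's built-in text search (8.3 or later): the names site_stem, site_search and common_words below are made up for illustration, and the sketch assumes you have placed a file common_words.stop (one lowercase word per line, including the standard English stopwords plus your own over-common terms) in $SHAREDIR/tsearch_data/. Treat it as a starting point, not a tested setup for your schema.

```sql
-- Hypothetical names: site_stem (dictionary), site_search (configuration),
-- common_words (stopword file common_words.stop in $SHAREDIR/tsearch_data/).

-- A Snowball stemming dictionary that drops words listed in the stopword
-- file and stems everything else.
CREATE TEXT SEARCH DICTIONARY site_stem (
    TEMPLATE = pg_catalog.snowball,
    LANGUAGE = english,
    STOPWORDS = common_words
);

-- Start from the stock English configuration and route word tokens
-- through the new dictionary.
CREATE TEXT SEARCH CONFIGURATION site_search (COPY = pg_catalog.english);

ALTER TEXT SEARCH CONFIGURATION site_search
    ALTER MAPPING FOR asciiword, asciihword, hword_asciipart,
                      word, hword, hword_part
    WITH site_stem;

-- Check how a single word is treated by the dictionary.
SELECT ts_lexize('site_stem', 'steve');

-- Build queries with the same configuration.
SELECT to_tsquery('site_search', 'steve');
```

If 'steve' is in the stopword file, the query built from it should come out empty, which the application can detect and report back as "search term too generic" instead of ranking a huge result set. The contFTI column would need to be rebuilt with the same configuration for the index and the queries to agree.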