Re: lexemes in prefix search going through dictionary modifications - Mailing list pgsql-hackers

From Sushant Sinha
Subject Re: lexemes in prefix search going through dictionary modifications
Msg-id 1320769528.2062.16.camel@dragflick
In response to Re: lexemes in prefix search going through dictionary modifications  (Sushant Sinha <sushant354@gmail.com>)
Responses Re: lexemes in prefix search going through dictionary modifications
List pgsql-hackers
I think there is a need for a prefix search that bypasses the
dictionaries. If you folks think such a need is credible, I can look
into implementing it. How about an operator like ":#" that does this?
The ":*" operator would keep its current meaning.
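
To make the proposed difference concrete, here is a toy sketch in Python. It is not PostgreSQL code: the stopword list, the crude plural stripping in normalize(), and the function names are all assumptions made up for illustration, and ":#" is only the operator proposed above, not something that exists today.

```python
# Toy model of dictionary handling in prefix search. Not PostgreSQL code.
# STOPWORDS, normalize() and the ":#" operator are illustrative assumptions.
STOPWORDS = {"a", "an", "and", "s", "t", "the"}


def normalize(token):
    """Return the lexeme for a token, or None if the dictionary drops it."""
    token = token.lower()
    if token in STOPWORDS:
        return None
    # Crude stand-in for stemming: strip a trailing "s" from longer words.
    if len(token) > 3 and token.endswith("s"):
        return token[:-1]
    return token


def to_lexemes(document):
    """Model a tsvector: the normalized, non-stopword lexemes of a document."""
    lexemes = (normalize(word) for word in document.split())
    return [lexeme for lexeme in lexemes if lexeme is not None]


def parse_prefix_term(term):
    """Map 'word:*' or 'word:#' to the prefix to match, or None if dropped."""
    if term.endswith(":#"):
        # Hypothetical operator: use the typed characters as they are.
        return term[:-2].lower()
    if term.endswith(":*"):
        # Current behaviour in this model: run the dictionary first.
        return normalize(term[:-2])
    raise ValueError("expected a term ending in ':*' or ':#'")


def prefix_search(term, documents):
    """Return the documents having a lexeme that starts with the prefix."""
    prefix = parse_prefix_term(term)
    if prefix is None:
        return []  # an empty query matches no row
    return [
        doc for doc in documents
        if any(lexeme.startswith(prefix) for lexeme in to_lexemes(doc))
    ]


DOCS = ["The theory of andirons", "Postgres posts"]

print(prefix_search("the:*", DOCS))  # stopword dropped, nothing matches
print(prefix_search("the:#", DOCS))  # bypass keeps "the", matches "theory"
```

In this model "the:*" becomes an empty query because the dictionary discards "the", while "the:#" still finds the row containing "theory". Whether the bypass should also skip case folding or other normalization is left open here.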

-Sushant.

On Tue, 2011-10-25 at 23:45 +0530, Sushant Sinha wrote:
> On Tue, 2011-10-25 at 19:27 +0200, Florian Pflug wrote:
> 
> > Assume, for example, that the postgres mailing list archive search used
> > tsearch (which I think it does, but I'm not sure). It'd then probably make
> > sense to add "postgres" to the list of stopwords, because it's bound to 
> > appear in nearly every mail. But wouldn't you want searches which include
> > 'postgres*' to turn up empty? Quite certainly not.
> 
> That improves recall for the "postgres:*" query but certainly doesn't help
> other queries like "post:*". More importantly, it hurts precision
> for all queries like "a:*", "an:*", "and:*", "s:*", "t:*", "the:*", etc.
> (When that is the only search term it also hurts recall, as no row matches
> an empty tsquery.) Since stopwords are short, a prefix search on just a
> few characters becomes meaningless. And I would argue that is when
> prefix search matters most -- when you only know a few characters.
> 
> 
> -Sushant



