Re: Include Lists for Text Search - Mailing list pgsql-hackers

From Simon Riggs
Subject Re: Include Lists for Text Search
Date
Msg-id 1189430879.4281.247.camel@ebony.site
In response to Re: Include Lists for Text Search  (Oleg Bartunov <oleg@sai.msu.su>)
Responses Re: Include Lists for Text Search
List pgsql-hackers
On Mon, 2007-09-10 at 16:35 +0400, Oleg Bartunov wrote:
> On Mon, 10 Sep 2007, Simon Riggs wrote:
> 
> > On Mon, 2007-09-10 at 16:10 +0400, Oleg Bartunov wrote:
> >> On Mon, 10 Sep 2007, Simon Riggs wrote:
> >>
> >>> It seems possible to write your own functions to support various
> >>> possibilities with text search.
> >>>
> >>> One of the more common thoughts is to have a list of words that you
> >>> would like to include, i.e. the opposite of a stop word list.
> >>>
> >>> There are clear indications that indexing too many words is a problem
> >>> for both GIN and GIST. If people already know what they'll be looking
> >>> for and what they will never be looking for, it seems easier to supply
> >>> that list up front, rather than hide it behind lots of hand-crafted
> >>> code.
> >>>
> >>> Can we include that functionality now?
> >>
> >> This could be realized very easily using dict_strict, which returns
> >> only known words, and mapping contains only this dictionary. So,
> >> feel free to write it and submit.
> >
> > So there isn't one yet, but you think it will be easy to write and that
> > we should call it dict_strict?
> 
> we have dict_synonym already and if your list is not big you'll be happy.

So I need to do something like

CREATE TEXT SEARCH DICTIONARY my_diction (
    template = snowball,
    synonym = include_only_these_words
);

which will then look for a file called include_only_these_words.syn?

I would prefer to be able to do something like this

CREATE TEXT SEARCH DICTIONARY my_diction (
    template = snowball,
    include = justthese
);

...which makes more sense to anyone reading it
and I also want to make the comparison case insensitive.

Would it be better to
1. include a new dictionary file (dict_strict, as you suggest)
2. a) allow case sensitivity as another option in dictionaries
   b) allow "include" as another word for "stoplist", but with the
      meaning reversed?

e.g.

CREATE TEXT SEARCH DICTIONARY my_diction (
    template = snowball,
    include = justthese,
    case_sensitive = true
);
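
To make the intended semantics concrete, here is a minimal sketch in
Python of what an include list with a case-sensitivity option might do.
It is not PostgreSQL code: the function name filter_tokens and its
parameters are hypothetical, chosen only to mirror the proposed
"include" and "case_sensitive" options, and lower-casing the matched
tokens is an assumption about how such a dictionary would normalise
its output.

```python
def filter_tokens(tokens, include, case_sensitive=False):
    """Keep only tokens found in the include list (the inverse of a stop list).

    Hypothetical illustration: 'include' stands in for the contents of the
    proposed include file, 'case_sensitive' for the proposed option.
    """
    if case_sensitive:
        allowed = set(include)
        # Exact match only; tokens are returned unchanged.
        return [t for t in tokens if t in allowed]

    allowed = {w.lower() for w in include}
    # Case-insensitive match; matched tokens are lower-cased (an assumption).
    return [t.lower() for t in tokens if t.lower() in allowed]


if __name__ == "__main__":
    words = ["Postgres", "is", "a", "GIN", "index"]
    print(filter_tokens(words, ["postgres", "gin"]))
    print(filter_tokens(words, ["postgres", "gin"], case_sensitive=True))
```

With the case-insensitive default, "Postgres" and "GIN" both survive;
with case_sensitive set, neither matches the lower-case list, so nothing
would be indexed from this input.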

--
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com


