On Mon, 2007-09-10 at 16:35 +0400, Oleg Bartunov wrote:
> On Mon, 10 Sep 2007, Simon Riggs wrote:
>
> > On Mon, 2007-09-10 at 16:10 +0400, Oleg Bartunov wrote:
> >> On Mon, 10 Sep 2007, Simon Riggs wrote:
> >>
> >>> It seems possible to write your own functions to support various
> >>> possibilities with text search.
> >>>
> >>> One of the more common thoughts is to have a list of words that you
> >>> would like to include, i.e. the opposite of a stop word list.
> >>>
> >>> There are clear indications that indexing too many words is a problem
> >>> for both GIN and GIST. If people already know what they'll be looking
> >>> for and what they will never be looking for, it seems easier to supply
> >>> that list up front, rather than hide it behind lots of hand-crafted
> >>> code.
> >>>
> >>> Can we include that functionality now?
> >>
> >> This could be realized very easily using dict_strict, which returns
> >> only known words, and mapping contains only this dictionary. So,
> >> feel free to write it and submit.
> >
> > So there isn't one yet, but you think it will be easy to write and that
> > we should call it dict_strict?
>
> we have dict_synonym already and if your list is not big you'll be happy.

So I need to do something like

CREATE TEXT SEARCH DICTIONARY my_diction (
    template = snowball,
    synonym = include_only_these_words
);

which will then look for a file called include_only_these_words.syn?

I would prefer to be able to do something like this

CREATE TEXT SEARCH DICTIONARY my_diction (
    template = snowball,
    include = justthese
);

...which makes more sense to anyone reading it, and I also want to make
the comparison case insensitive.

Would it be better to

1. include a new dictionary file (dict_strict, as you suggest)
2. a) allow case sensitivity as another option in dictionaries
   b) allow "include" as another word for "stoplist", but with the
      meaning reversed?

e.g.

CREATE TEXT SEARCH DICTIONARY my_diction (
    template = snowball,
    include = justthese,
    case_sensitive = true
);
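
To make the semantics I'm asking for concrete, here is a rough model in
Python. It is only a sketch and not PostgreSQL code: the class name, the
lexize() method and the case_sensitive flag are all hypothetical and
simply mirror the "include" and "case_sensitive" options proposed above.
Words on the list are kept; everything else is discarded, which is the
reverse of a stop word list.

```python
from typing import Iterable, List


class IncludeListDictionary:
    """Toy model of an include-list ("strict") dictionary.

    Hypothetical illustration only; it does not use any PostgreSQL API.
    """

    def __init__(self, words: Iterable[str], case_sensitive: bool = False) -> None:
        self.case_sensitive = case_sensitive
        # Store the include list in normalized form so lookups are cheap.
        self.words = {self._normalize(w) for w in words}

    def _normalize(self, word: str) -> str:
        # Case-insensitive matching is modelled by lowercasing both sides.
        return word if self.case_sensitive else word.lower()

    def lexize(self, token: str) -> List[str]:
        # Known word: return it as the lexeme. Unknown word: return nothing,
        # so it never reaches the index.
        key = self._normalize(token)
        return [key] if key in self.words else []


def index_terms(text: str, dictionary: IncludeListDictionary) -> List[str]:
    """Split text on whitespace and keep only terms the dictionary accepts."""
    terms: List[str] = []
    for token in text.split():
        terms.extend(dictionary.lexize(token))
    return terms
```

With an include list of "PostgreSQL" and "GIN" and the default
case-insensitive comparison, only those two words from a document would
be indexed, whatever case they appear in.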
-- Simon Riggs 2ndQuadrant http://www.2ndQuadrant.com