Thread: tsearch2 word separators

tsearch2 word separators

From: Sushant Sinha
A document may contain a date in the traditional format, for example
'11/1/2007'. It would be useful to be able to search directly for the
year in such a document. However, the 'default' tsearch2 parser does not
split integers separated by '/', so a search for '2007' will not match
the tsvector for '11/1/2007'. Here is an example:

cmsdb=# select to_tsvector('default', '11/1/2007');
  to_tsvector
----------------
 '11/1/2007':1

I think this could be fixed easily by treating '/' as a word separator.
Is there a way to specify word separators in the tsearch2 module?

Thank you,
-Sushant.


Re: tsearch2 word separators

From: Oleg Bartunov
On Thu, 13 Mar 2008, Sushant Sinha wrote:

> A document may contain a date in the traditional format, for example
> '11/1/2007'. It would be useful to be able to search directly for the
> year in such a document. However, the 'default' tsearch2 parser does not
> split integers separated by '/', so a search for '2007' will not match
> the tsvector for '11/1/2007'. Here is an example:
>
> cmsdb=# select to_tsvector('default', '11/1/2007');
>  to_tsvector
> ----------------
> '11/1/2007':1
>
> I think this could be fixed easily by treating '/' as a word separator.
> Is there a way to specify word separators in the tsearch2 module?

No, but you can write your own dictionary (dict_dates?) or use our
dict_regex (http://vo.astronet.ru/arxiv/dict_regex.html).

>
> Thank you,
> -Sushant.
>
>
>

     Regards,
         Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83
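
Neither suggestion above is spelled out in the thread. As a rough sketch of a
different, simpler workaround (not the dictionary approach Oleg describes), the
text can be preprocessed before it reaches the parser, so that '/' is turned
into whitespace. This assumes the same 'default' tsearch2 configuration used in
the original question and relies only on the built-in regexp_replace function.
It is blunt: it also splits paths, URLs, and fractions, which may not be wanted.

```sql
-- Assumption: tsearch2 is installed and a configuration named 'default' exists,
-- as in the original question.

-- Replace every '/' with a space before building the tsvector, so that each
-- part of the date is handed to the parser as a separate token.
select to_tsvector('default', regexp_replace('11/1/2007', '/', ' ', 'g'));

-- Check whether a search for the year now matches the preprocessed text.
select to_tsvector('default', regexp_replace('11/1/2007', '/', ' ', 'g'))
       @@ to_tsquery('default', '2007');
```

If this approach is used, the same replacement would have to be applied
wherever the tsvector column is built (for example in the trigger or update
statement that maintains it), otherwise stored vectors and new ones will
disagree.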