Thread: tsearch2 & dictionaries - possible problem

tsearch2 & dictionaries - possible problem

From
Ivan Voras
Date:
hello,

I think I have a problem with the tsearch2 configuration I'm trying to use.
I have created a text search configuration as:

--
CREATE TEXT SEARCH DICTIONARY hr_ispell (
    TEMPLATE = ispell,
    DictFile = 'hr',
    AffFile = 'hr',
    StopWords = 'hr'
);

CREATE TEXT SEARCH CONFIGURATION public.ts2hr (COPY=pg_catalog.english);

ALTER TEXT SEARCH CONFIGURATION ts2hr
    ALTER MAPPING FOR asciiword, asciihword, hword_asciipart, word,
        hword, hword_part
    WITH hr_ispell;

SET default_text_search_config = 'public.ts2hr';
--

and here are some queries:

--
cms=> select to_tsvector('voras vorasom');
 to_tsvector
-------------

(1 row)

cms=> SET default_text_search_config = 'simple';
SET
cms=> select to_tsvector('voras vorasom');
      to_tsvector
-----------------------
 'voras':1 'vorasom':2
(1 row)

cms=> SET default_text_search_config = 'ts2hr';
SET
cms=> select to_tsvector('voras vorasom');
 to_tsvector
-------------

(1 row)

cms=> select to_tsvector('kiša kiši');
 to_tsvector
-------------
 'kiša':1,2
(1 row)
--

The good news is that the text search configuration is actually used
(the 'kiša kiši' example), but apparently for an uncommon word
to_tsvector() returns nothing (the 'voras vorasom' example).

Is there something wrong in the configuration? I would definitely not
want unknown words to be ignored.

Re: tsearch2 & dictionaries - possible problem

From
Oleg Bartunov
Date:
Ivan,

did you find your misunderstanding? You forgot how dictionaries work.
You need to add a dictionary that recognizes anything, such as simple or a
stemmer dictionary, to handle 'unknown' words. Look into the documentation.
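
To spell that out with a minimal sketch (assuming the hr_ispell dictionary
and the ts2hr configuration from the original message, on PostgreSQL 8.3 or
later): an ispell dictionary returns NULL for a word it does not know, and a
token that no dictionary in the mapping recognizes is dropped. Listing the
built-in simple dictionary after hr_ispell is one way to keep such words; a
snowball stemmer in that position would be another.

```sql
-- Assumption: hr_ispell and public.ts2hr exist as defined in the
-- original message. Dictionaries are consulted in the order listed;
-- 'simple' accepts any token that hr_ispell did not recognize.
ALTER TEXT SEARCH CONFIGURATION public.ts2hr
    ALTER MAPPING FOR asciiword, asciihword, hword_asciipart,
                      word, hword, hword_part
    WITH hr_ispell, simple;

-- Ask the ispell dictionary directly: NULL means "unknown word",
-- an empty array means "stop word".
SELECT ts_lexize('hr_ispell', 'voras');

-- Show, per token, which dictionaries were tried and which one
-- produced the lexemes.
SELECT token, dictionaries, dictionary, lexemes
FROM ts_debug('public.ts2hr', 'voras vorasom');
```

With a mapping like this, words unknown to hr_ispell should be indexed
lowercased but not normalized, which is probably what you want for names.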

Oleg
On Wed, 2 Jun 2010, Ivan Voras wrote:

> hello,
>
> I think I have a problem with tsearch2 configuration I'm trying to use.
> I have created a text search configuration as:
>
> --
> CREATE TEXT SEARCH DICTIONARY hr_ispell (
>    TEMPLATE = ispell,
>    DictFile = 'hr',
>    AffFile = 'hr',
>    StopWords = 'hr'
> );
>
> CREATE TEXT SEARCH CONFIGURATION public.ts2hr (COPY=pg_catalog.english);
>
> ALTER TEXT SEARCH CONFIGURATION ts2hr
>    ALTER MAPPING FOR asciiword, asciihword, hword_asciipart, word,
> hword, hword_part
>    WITH hr_ispell;
>
> SET default_text_search_config = 'public.ts2hr';
> --
>
> and here are some queries:
>
> --
> cms=> select to_tsvector('voras vorasom');
> to_tsvector
> -------------
>
> (1 row)
>
> cms=> SET default_text_search_config = 'simple';
> SET
> cms=> select to_tsvector('voras vorasom');
>      to_tsvector
> -----------------------
> 'voras':1 'vorasom':2
> (1 row)
>
> cms=> SET default_text_search_config = 'ts2hr';
> SET
> cms=> select to_tsvector('voras vorasom');
> to_tsvector
> -------------
>
> (1 row)
>
> cms=> select to_tsvector('kiša kiši');
> to_tsvector
> -------------
> 'kiša':1,2
> (1 row)
> --
>
> The good news is that the text search configuration is actually used
> (the 'kiša kiši') example but apparently on an uncommon word,
> to_tsvector() returns nothing (the 'voras vorasom' example).
>
> Is there something wrong in the configuration? I would definitely not
> want unknown words to be ignored.
>
>
>

     Regards,
         Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83