Thread: tsearch - v2 new dict

tsearch - v2 new dict

From
sector119@mail.ru
Date:
Hi,
I am trying to add a new dictionary, but I get an error:

$mars->{sector119}:~ % ls -la /usr/local/pgsql/share/ukrainian*
-rw-r--r--    1 root     root        59504   2  2000 /usr/local/pgsql/share/ukrainian.aff
-rw-r--r--    1 root     root      1355320   2  2000 /usr/local/pgsql/share/ukrainian.dict
lrwxrwxrwx    1 root     root           14  13 09:23 /usr/local/pgsql/share/ukrainian.hash -> ukrainian.dict
-rw-r--r--    1 root     root          699  13 17:14 /usr/local/pgsql/share/ukrainian.stop


test=# SELECT * from pg_ts_cfg where id=4;
 id | ts_name | prs_name | locale
 ----+---------+----------+--------
   4 | uk      | default  | uk_UA

test=# SELECT * from pg_ts_cfgmap where ts_name='uk';
 ts_name |  lex_alias  | dict_name
 ---------+-------------+-----------
  uk      | file        | {simple}
  uk      | lhword      | {uk_stem}
  uk      | lpart_hword | {uk_stem}
  uk      | lword       | {uk_stem}
  uk      | uint        | {simple}
  uk      | version     | {simple}
(6 rows)

test=# SELECT * from pg_ts_dict where dict_id=6;
dict_id         | 6
dict_name       | uk_stem
dict_init       | 17632
dict_initoption |
DictFile="/usr/local/pgsql/share/ukrainian.hash",
AffFile="/usr/local/pgsql/share/ukrainian.aff",
StopFile="/usr/local/pgsql/share/ukrainian.stop
dict_lemmatize  | 17633
dict_comment    | Ukrainian Stemmer. Snowball.

test=# SELECT txt2txtidx('uk','alot of words in ukrainian');
ERROR:  Unexpected end of line

Why do I get this error message?

If I did something wrong, please tell me what I have to change!

Thank you!
--
WBR, sector119

Re: tsearch - v2 new dict

From
Oleg Bartunov
Date:
You mixed up the stemmer and the morphology dictionary! These are two different dictionaries.

btw, I suggest using 'ua' instead of 'uk' :-)


On Fri, 13 Jun 2003 sector119@mail.ru wrote:

> Hi,
> I am trying to add a new dictionary, but I get an error:
>
> $mars->{sector119}:~ % ls -la /usr/local/pgsql/share/ukrainian*
> -rw-r--r--    1 root     root        59504   2  2000 /usr/local/pgsql/share/ukrainian.aff
> -rw-r--r--    1 root     root      1355320   2  2000 /usr/local/pgsql/share/ukrainian.dict
> lrwxrwxrwx    1 root     root           14  13 09:23 /usr/local/pgsql/share/ukrainian.hash -> ukrainian.dict
> -rw-r--r--    1 root     root          699  13 17:14 /usr/local/pgsql/share/ukrainian.stop
>
>
> test=# SELECT * from pg_ts_cfg where id=4;
>  id | ts_name | prs_name | locale
>  ----+---------+----------+--------
>    4 | uk      | default  | uk_UA
>
> test=# SELECT * from pg_ts_cfgmap where ts_name='uk';
>  ts_name |  lex_alias  | dict_name
>  ---------+-------------+-----------
>   uk      | file        | {simple}
>   uk      | lhword      | {uk_stem}
>   uk      | lpart_hword | {uk_stem}
>   uk      | lword       | {uk_stem}
>   uk      | uint        | {simple}
>   uk      | version     | {simple}
> (6 rows)
>
> test=# SELECT * from pg_ts_dict where dict_id=6;
> dict_id         | 6
> dict_name       | uk_stem
> dict_init       | 17632
> dict_initoption |
> DictFile="/usr/local/pgsql/share/ukrainian.hash",
> AffFile="/usr/local/pgsql/share/ukrainian.aff",
> StopFile="/usr/local/pgsql/share/ukrainian.stop
> dict_lemmatize  | 17633
> dict_comment    | Ukrainian Stemmer. Snowball.
>
> test=# SELECT txt2txtidx('uk','alot of words in ukrainian');
> ERROR:  Unexpected end of line
>
> Why do I get this error message?
>
> If I did something wrong, please tell me what I have to change!
>
> Thank you!
>

    Regards,
        Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83

Re: tsearch - v2 new dict

From
Teodor Sigaev
Date:
> DictFile="/usr/local/pgsql/share/ukrainian.hash",
> AffFile="/usr/local/pgsql/share/ukrainian.aff",
> StopFile="/usr/local/pgsql/share/ukrainian.stop

You forgot the closing " at the end of the StopFile value.
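
A minimal sketch of the corrected setting, assuming the row is still the one shown in the original post (dict_id = 6) and that the three options can be written on one line separated by commas. Only the closing quote after ukrainian.stop is new; the paths are copied from the post.

```sql
-- Rewrite dict_initoption so that every option value is closed by a double quote.
-- Assumption: dict_id = 6 identifies the dictionary row from the original post.
UPDATE pg_ts_dict
   SET dict_initoption = 'DictFile="/usr/local/pgsql/share/ukrainian.hash",AffFile="/usr/local/pgsql/share/ukrainian.aff",StopFile="/usr/local/pgsql/share/ukrainian.stop"'
 WHERE dict_id = 6;
```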

--
Teodor Sigaev                                  E-mail: teodor@sigaev.ru


Re: tsearch - v2 new dict

From
sector119@mail.ru
Date:
On Fri, 13 Jun 2003 18:58:19 +0400 (MSD)
Oleg Bartunov <oleg@sai.msu.su> wrote:

> You mixed up the stemmer and the morphology dictionary! These are two
> different dictionaries.

ispell_ua is a morphology dictionary, yes? I have added this dictionary :)
And now I have to add a stemmer dictionary :) but how? How can I do
that, or where can I read about it?
:)

Because without it I get an error:
SELECT txt2txtidx('ua','a lot of ukrainian words');
ERROR:  No dictionary

> btw, I suggest using 'ua' instead of 'uk' :-)
ok :) I changed uk -> ua :)

test=# SELECT * FROM pg_ts_cfgmap WHERE ts_name = 'ua';
ts_name |  lex_alias  |      dict_name
---------+-------------+---------------------
ua      | file        | {simple}
ua      | lhword      | {ispell_ua,ua_stem}
ua      | lpart_hword | {ispell_ua,ua_stem}
ua      | lword       | {ispell_ua,ua_stem}
ua      | uint        | {simple}
ua      | version     | {simple}

test=# SELECT * FROM pg_ts_dict WHERE dict_id = 6;
dict_id         | 6
dict_name       | ispell_ua
dict_init       | 17632
dict_initoption |
DictFile="/usr/local/pgsql/share/ukrainian.hash",AffFile="/usr/local/pgsql/share/ukrainian.aff",StopFile="/usr/local/pgsql/share/ukrainian.stop"
dict_lemmatize  | 17633
dict_comment    | Ukrainian ispell

--
WBR, sector119

Re: tsearch - v2 new dict

From
Oleg Bartunov
Date:
Have you read

http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/tsearchV2-intro.txt

I don't see that you have added the 'ua' configuration to pg_ts_cfg.

    Oleg

On Tue, 17 Jun 2003 sector119@mail.ru wrote:

> On Fri, 13 Jun 2003 18:58:19 +0400 (MSD)
> Oleg Bartunov <oleg@sai.msu.su> wrote:
>
> > You mixed up the stemmer and the morphology dictionary! These are two
> > different dictionaries.
>
> ispell_ua is a morphology dictionary, yes? I have added this dictionary :)
> And now I have to add a stemmer dictionary :) but how? How can I do
> that, or where can I read about it?
> :)
>
> Because without it I get an error:
> SELECT txt2txtidx('ua','a lot of ukrainian words');
> ERROR:  No dictionary
>
> > btw, I suggest using 'ua' instead of 'uk' :-)
> ok :) I changed uk -> ua :)
>
> test=# SELECT * FROM pg_ts_cfgmap WHERE ts_name = 'ua';
> ts_name |  lex_alias  |      dict_name
> ---------+-------------+---------------------
> ua      | file        | {simple}
> ua      | lhword      | {ispell_ua,ua_stem}
> ua      | lpart_hword | {ispell_ua,ua_stem}
> ua      | lword       | {ispell_ua,ua_stem}
> ua      | uint        | {simple}
> ua      | version     | {simple}
>
> test=# SELECT * FROM pg_ts_dict WHERE dict_id = 6;
> dict_id         | 6
> dict_name       | ispell_ua
> dict_init       | 17632
> dict_initoption |
> DictFile="/usr/local/pgsql/share/ukrainian.hash",AffFile="/usr/local/pgsql/share/ukrainian.aff",StopFile="/usr/local/pgsql/share/ukrainian.stop"
> dict_lemmatize  | 17633
> dict_comment    | Ukrainian ispell
>
>

    Regards,
        Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83

Re: tsearch - v2 new dict

From
sector119@mail.ru
Date:
Yes, I have read tsearchV2-intro.txt,
but I do not understand how to add a stemmer dictionary :(

test=# SELECT * FROM pg_ts_cfg;
id |     ts_name     | prs_name |    locale
----+-----------------+----------+--------------
1 | default         | default  | C
2 | default_russian | default  | ru_RU.KOI8-R
3 | simple          | default  |
4 | ua              | default  | uk_UA

--
WBR, sector119

Re: tsearch - v2 new dict

From
Oleg Bartunov
Date:
On Tue, 17 Jun 2003 sector119@mail.ru wrote:

> Yes, I have read tsearchV2-intro.txt,
> but I do not understand how to add a stemmer dictionary :(

Is it something different from snowball?

>
> test=# SELECT * FROM pg_ts_cfg;
> id |     ts_name     | prs_name |    locale
> ----+-----------------+----------+--------------
> 1 | default         | default  | C
> 2 | default_russian | default  | ru_RU.KOI8-R
> 3 | simple          | default  |
> 4 | ua              | default  | uk_UA

btw, uk_UA probably should be ua_UA :)



>
>

    Regards,
        Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83

Re: tsearch - v2 new dict

From
Teodor Sigaev
Date:
sector119@mail.ru wrote:
> Yes, I have read tsearchV2-intro.txt,
> but I do not understand how to add a stemmer dictionary :(
>

From your other message:

 > test=# SELECT * FROM pg_ts_cfgmap WHERE ts_name = 'ua';
 > ts_name |  lex_alias  |      dict_name
 > ---------+-------------+---------------------
 > ua      | file        | {simple}
 > ua      | lhword      | {ispell_ua,ua_stem}
 > ua      | lpart_hword | {ispell_ua,ua_stem}
 > ua      | lword       | {ispell_ua,ua_stem}
 > ua      | uint        | {simple}
 > ua      | version     | {simple}
 >

Did you add ua_stem or not?
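
One way to check this from psql, sketched under the assumption that dictionaries are registered by name in pg_ts_dict as in the earlier listings: look up both names used in the 'ua' mapping. If only ispell_ua comes back, ua_stem has not been registered, which would be consistent with the "No dictionary" error.

```sql
-- List which of the two dictionaries referenced by the 'ua' mapping
-- actually exist in pg_ts_dict.
SELECT dict_id, dict_name, dict_comment
  FROM pg_ts_dict
 WHERE dict_name IN ('ispell_ua', 'ua_stem');
```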

--
Teodor Sigaev                                  E-mail: teodor@sigaev.ru


Re: tsearch - v2 new dict

From
sector119@mail.ru
Date:
> btw, uk_UA probably should be ua_UA :)

no :) The Ukrainian locale has to be uk_UA, which is why I first called
the new dictionary uk :)

--
WBR, sector119

Re: tsearch - v2 new dict

From
sector119@mail.ru
Date:
> Did you add ua_stem or not?

nope :( I do not know how to add it...
Do I have to do it the same way as when I added the ispell_ua dict?

--
WBR, sector119

Re: tsearch - v2 new dict

From
Teodor Sigaev
Date:

sector119@mail.ru wrote:
>>Did you add ua_stem or not?
>
>
> nope :( I do not know how to add it...
> Do I have to do it the same way as when I added the ispell_ua dict?
>

no

You should get a stemmer (or write a new one) from snowball.tartarus.org.
Then place ua_stem.h and ua_stem.c in the tsearch/snowball directory,
and edit tsearch/Makefile and tsearch/dict_snowball.c.

Unfortunately, you will have to do it yourself; I don't know Ukrainian.


--
Teodor Sigaev                                  E-mail: teodor@sigaev.ru


Re: tsearch - v2 new dict

From
"Andrew J. Kopciuch"
Date:
On Tuesday 17 June 2003 09:06, Teodor Sigaev wrote:
> sector119@mail.ru wrote:
> >>Did you add ua_stem or not?
> >
> > nope :( I do not know how to add it...
> > Do I have to do it the same way as when I added the ispell_ua dict?
>
> no
>
> You should get a stemmer (or write a new one) from snowball.tartarus.org.
> Then place ua_stem.h and ua_stem.c in the tsearch/snowball directory,
> and edit tsearch/Makefile and tsearch/dict_snowball.c.
>
> Unfortunately, you will have to do it yourself; I don't know Ukrainian.


There is apparently no Ukrainian stemming algorithm available yet.  So if you
need one, you will have to write it yourself.

In the meantime ... you could just use the simple stemming until you write a
ua_stem ... or someone else does.
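
As a rough sketch of that stopgap: this assumes "simple stemming" means the simple dictionary already used for other token types in pg_ts_cfgmap, and that dict_name accepts an array literal, as the {...} output in the listings suggests. Under those assumptions the mapping could point at simple in place of the missing ua_stem:

```sql
-- Fall back to the 'simple' dictionary after ispell_ua for word tokens
-- until a ua_stem dictionary exists. The three lex_alias values are the
-- ones mapped to ua_stem in the pg_ts_cfgmap listing posted in this thread.
UPDATE pg_ts_cfgmap
   SET dict_name = '{ispell_ua,simple}'
 WHERE ts_name = 'ua'
   AND lex_alias IN ('lhword', 'lpart_hword', 'lword');
```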


Andy


Re: tsearch - v2 new dict

From
Sergei Levchenko
Date:
On Fri, 13 Jun 2003 18:58:19 +0400 (MSD)
Oleg Bartunov <oleg@sai.msu.su> wrote:

> You mixed up the stemmer and the morphology dictionary! These are two different dictionaries.

ispell_ua is a morphology dictionary, yes? I have added this dictionary :)
And now I have to add a stemmer dictionary :) but how? How can I do that, or where can I read about it?
:)

Because without it I get an error:
SELECT txt2txtidx('ua','a lot of ukrainian words');
ERROR:  No dictionary

> btw, I suggest using 'ua' instead of 'uk' :-)
ok :) I changed uk -> ua :)

test=# SELECT * FROM pg_ts_cfgmap WHERE ts_name = 'ua';
 ts_name |  lex_alias  |      dict_name
---------+-------------+---------------------
 ua      | file        | {simple}
 ua      | lhword      | {ispell_ua,ua_stem}
 ua      | lpart_hword | {ispell_ua,ua_stem}
 ua      | lword       | {ispell_ua,ua_stem}
 ua      | uint        | {simple}
 ua      | version     | {simple}

test=# SELECT * FROM pg_ts_dict WHERE dict_id = 6;
-[ RECORD 1 ]---+--------------------------------------------------------------------------------------------------------------------------------------------------
dict_id         | 6
dict_name       | ispell_ua
dict_init       | 17632
dict_initoption |
DictFile="/usr/local/pgsql/share/ukrainian.hash",AffFile="/usr/local/pgsql/share/ukrainian.aff",StopFile="/usr/local/pgsql/share/ukrainian.stop"
dict_lemmatize  | 17633
dict_comment    | Ukrainian ispell

--
WBR, sector119