Thread: tsearch - v2 new dict
Hi,

I am trying to add a new dictionary, but I get an error.

$mars->{sector119}:~ % ls -la /usr/local/pgsql/share/ukrainian*
-rw-r--r-- 1 root root 59504 2 2000 /usr/local/pgsql/share/ukrainian.aff
-rw-r--r-- 1 root root 1355320 2 2000 /usr/local/pgsql/share/ukrainian.dict
lrwxrwxrwx 1 root root 14 13 09:23 /usr/local/pgsql/share/ukrainian.hash -> ukrainian.dict
-rw-r--r-- 1 root root 699 13 17:14 /usr/local/pgsql/share/ukrainian.stop

test=# SELECT * from pg_ts_cfg where id=4;
 id | ts_name | prs_name | locale
----+---------+----------+--------
  4 | uk      | default  | uk_UA

test=# SELECT * from pg_ts_cfgmap where ts_name='uk';
 ts_name |  lex_alias  | dict_name
---------+-------------+-----------
 uk      | file        | {simple}
 uk      | lhword      | {uk_stem}
 uk      | lpart_hword | {uk_stem}
 uk      | lword       | {uk_stem}
 uk      | uint        | {simple}
 uk      | version     | {simple}
(6 rows)

test=# SELECT * from pg_ts_dict where dict_id=6;
dict_id         | 6
dict_name       | uk_stem
dict_init       | 17632
dict_initoption | DictFile="/usr/local/pgsql/share/ukrainian.hash", AffFile="/usr/local/pgsql/share/ukrainian.aff", StopFile="/usr/local/pgsql/share/ukrainian.stop
dict_lemmatize  | 17633
dict_comment    | Ukrainian Stemmer. Snowball.

test=# SELECT txt2txtidx('uk','alot of words in ukrainian');
ERROR: Unexpected end of line

Why do I get this error message?

If I did something wrong, please tell me what I have to change!

Thank you!

--
WBR, sector119
You mixed up stemmer and morphology! These are two different dictionaries.

btw, I suggest using 'ua' instead of 'uk' :-)

On Fri, 13 Jun 2003 sector119@mail.ru wrote:

> Hi,
>
> I am trying to add a new dictionary, but I get an error.
>
> $mars->{sector119}:~ % ls -la /usr/local/pgsql/share/ukrainian*
> -rw-r--r-- 1 root root 59504 2 2000 /usr/local/pgsql/share/ukrainian.aff
> -rw-r--r-- 1 root root 1355320 2 2000 /usr/local/pgsql/share/ukrainian.dict
> lrwxrwxrwx 1 root root 14 13 09:23 /usr/local/pgsql/share/ukrainian.hash -> ukrainian.dict
> -rw-r--r-- 1 root root 699 13 17:14 /usr/local/pgsql/share/ukrainian.stop
>
> test=# SELECT * from pg_ts_cfg where id=4;
>  id | ts_name | prs_name | locale
> ----+---------+----------+--------
>   4 | uk      | default  | uk_UA
>
> test=# SELECT * from pg_ts_cfgmap where ts_name='uk';
>  ts_name |  lex_alias  | dict_name
> ---------+-------------+-----------
>  uk      | file        | {simple}
>  uk      | lhword      | {uk_stem}
>  uk      | lpart_hword | {uk_stem}
>  uk      | lword       | {uk_stem}
>  uk      | uint        | {simple}
>  uk      | version     | {simple}
> (6 rows)
>
> test=# SELECT * from pg_ts_dict where dict_id=6;
> dict_id         | 6
> dict_name       | uk_stem
> dict_init       | 17632
> dict_initoption | DictFile="/usr/local/pgsql/share/ukrainian.hash", AffFile="/usr/local/pgsql/share/ukrainian.aff", StopFile="/usr/local/pgsql/share/ukrainian.stop
> dict_lemmatize  | 17633
> dict_comment    | Ukrainian Stemmer. Snowball.
>
> test=# SELECT txt2txtidx('uk','alot of words in ukrainian');
> ERROR: Unexpected end of line
>
> Why do I get this error message?
>
> If I did something wrong, please tell me what I have to change!
>
> Thank you!
>

	Regards,
		Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83
> DictFile="/usr/local/pgsql/share/ukrainian.hash",
> AffFile="/usr/local/pgsql/share/ukrainian.aff",
> StopFile="/usr/local/pgsql/share/ukrainian.stop

You forgot the closing " at the end of the StopFile value.

--
Teodor Sigaev
E-mail: teodor@sigaev.ru
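A missing closing quote like the one above is easy to overlook in a long option string. The following is a minimal sketch of a pre-check, assuming the option string is a comma-separated list of Key="value" pairs as shown in the thread. The function name parse_init_option is hypothetical, and this is not the parser tsearch itself uses; it only illustrates how an unterminated quote can be detected before the string is stored in pg_ts_dict.

```python
def parse_init_option(option):
    """Parse comma-separated Key="value" pairs; raise ValueError on an unterminated quote.

    Hypothetical helper for illustration only, not tsearch's own parser.
    """
    result = {}
    i, n = 0, len(option)
    while i < n:
        # Skip separators and whitespace between pairs.
        while i < n and option[i] in ", \t":
            i += 1
        if i >= n:
            break
        eq = option.find("=", i)
        if eq == -1:
            raise ValueError(f"missing '=' after position {i}")
        key = option[i:eq].strip()
        # Every value is expected to open with a double quote.
        if eq + 1 >= n or option[eq + 1] != '"':
            raise ValueError(f"value of {key} must start with a double quote")
        close = option.find('"', eq + 2)
        if close == -1:
            raise ValueError(f"unterminated quote in value of {key}")
        result[key] = option[eq + 2:close]
        i = close + 1
    return result


# The option string from the original post, with the closing quote missing.
BROKEN = ('DictFile="/usr/local/pgsql/share/ukrainian.hash", '
          'AffFile="/usr/local/pgsql/share/ukrainian.aff", '
          'StopFile="/usr/local/pgsql/share/ukrainian.stop')

try:
    parse_init_option(BROKEN)
except ValueError as err:
    print("broken:", err)

# Appending the missing quote makes the string parse cleanly.
print("fixed:", parse_init_option(BROKEN + '"'))
```

Running the check against the string from the first message flags the StopFile value, and the same string with the quote appended parses into three entries.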
On Fri, 13 Jun 2003 18:58:19 +0400 (MSD)
Oleg Bartunov <oleg@sai.msu.su> wrote:

> You mixed up stemmer and morphology! These are two different
> dictionaries.

ispell_ua is a morphology dictionary, yes? I have added this dictionary :)
And now I have to add a stemmer dictionary :) But how? How can I do that, or where can I read about it?
:)

Because without it I get an error:

SELECT txt2txtidx('ua','a lot of ukrainian words');
ERROR: No dictionary

> btw, I suggest using 'ua' instead of 'uk' :-)

ok :) I changed uk -> ua :)

test=# SELECT * FROM pg_ts_cfgmap WHERE ts_name = 'ua';
 ts_name |  lex_alias  |      dict_name
---------+-------------+---------------------
 ua      | file        | {simple}
 ua      | lhword      | {ispell_ua,ua_stem}
 ua      | lpart_hword | {ispell_ua,ua_stem}
 ua      | lword       | {ispell_ua,ua_stem}
 ua      | uint        | {simple}
 ua      | version     | {simple}

test=# SELECT * FROM pg_ts_dict WHERE dict_id = 6;
dict_id         | 6
dict_name       | ispell_ua
dict_init       | 17632
dict_initoption | DictFile="/usr/local/pgsql/share/ukrainian.hash",AffFile="/usr/local/pgsql/share/ukrainian.aff",StopFile="/usr/local/pgsql/share/ukrainian.stop"
dict_lemmatize  | 17633
dict_comment    | Ukrainian ispell

--
WBR, sector119
Have you read
http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/tsearchV2-intro.txt ?

I don't see that you have added the 'ua' configuration to pg_ts_cfg.

	Oleg

On Tue, 17 Jun 2003 sector119@mail.ru wrote:

> On Fri, 13 Jun 2003 18:58:19 +0400 (MSD)
> Oleg Bartunov <oleg@sai.msu.su> wrote:
>
> > You mixed up stemmer and morphology! These are two different
> > dictionaries.
>
> ispell_ua is a morphology dictionary, yes? I have added this dictionary :)
> And now I have to add a stemmer dictionary :) But how? How can I do that,
> or where can I read about it?
> :)
>
> Because without it I get an error:
>
> SELECT txt2txtidx('ua','a lot of ukrainian words');
> ERROR: No dictionary
>
> > btw, I suggest using 'ua' instead of 'uk' :-)
>
> ok :) I changed uk -> ua :)
>
> test=# SELECT * FROM pg_ts_cfgmap WHERE ts_name = 'ua';
>  ts_name |  lex_alias  |      dict_name
> ---------+-------------+---------------------
>  ua      | file        | {simple}
>  ua      | lhword      | {ispell_ua,ua_stem}
>  ua      | lpart_hword | {ispell_ua,ua_stem}
>  ua      | lword       | {ispell_ua,ua_stem}
>  ua      | uint        | {simple}
>  ua      | version     | {simple}
>
> test=# SELECT * FROM pg_ts_dict WHERE dict_id = 6;
> dict_id         | 6
> dict_name       | ispell_ua
> dict_init       | 17632
> dict_initoption | DictFile="/usr/local/pgsql/share/ukrainian.hash",AffFile="/usr/local/pgsql/share/ukrainian.aff",StopFile="/usr/local/pgsql/share/ukrainian.stop"
> dict_lemmatize  | 17633
> dict_comment    | Ukrainian ispell
>

	Regards,
		Oleg
Yes, I have read tsearchV2-intro.txt,
but I do not understand how to add a stemmer dictionary :(

test=# SELECT * FROM pg_ts_cfg;
 id |     ts_name     | prs_name |    locale
----+-----------------+----------+--------------
  1 | default         | default  | C
  2 | default_russian | default  | ru_RU.KOI8-R
  3 | simple          | default  |
  4 | ua              | default  | uk_UA

--
WBR, sector119
On Tue, 17 Jun 2003 sector119@mail.ru wrote:

> Yes, I have read tsearchV2-intro.txt,
> but I do not understand how to add a stemmer dictionary :(

Is it something different from snowball?

>
> test=# SELECT * FROM pg_ts_cfg;
>  id |     ts_name     | prs_name |    locale
> ----+-----------------+----------+--------------
>   1 | default         | default  | C
>   2 | default_russian | default  | ru_RU.KOI8-R
>   3 | simple          | default  |
>   4 | ua              | default  | uk_UA

btw, uk_UA probably should be ua_UA :)

	Regards,
		Oleg
sector119@mail.ru wrote:
> Yes, I have read tsearchV2-intro.txt,
> but I do not understand how to add a stemmer dictionary :(

From your other message:

> test=# SELECT * FROM pg_ts_cfgmap WHERE ts_name = 'ua';
>  ts_name |  lex_alias  |      dict_name
> ---------+-------------+---------------------
>  ua      | file        | {simple}
>  ua      | lhword      | {ispell_ua,ua_stem}
>  ua      | lpart_hword | {ispell_ua,ua_stem}
>  ua      | lword       | {ispell_ua,ua_stem}
>  ua      | uint        | {simple}
>  ua      | version     | {simple}

Did you add ua_stem or not?

--
Teodor Sigaev
E-mail: teodor@sigaev.ru
> btw, uk_UA probably should be ua_UA :)

no :) The Ukrainian locale has to be uk_UA; that is why I first named the new dictionary uk :)

--
WBR, sector119
> Did you add ua_stem or not?

nope :( I do not know how to add it...
Do I have to do it the same way as when I added the ispell_ua dict?

--
WBR, sector119
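The mismatch discussed here (pg_ts_cfgmap names ua_stem, but no such dictionary was ever added) can be spotted mechanically. Below is a minimal sketch that compares the dictionary names used in the 'ua' mapping against the set of registered dictionaries. The helper missing_dictionaries is hypothetical, the mapping rows are copied by hand from the psql output in this thread, and the registered set is an assumption: ispell_ua is shown in pg_ts_dict above, while simple is assumed to be present. It seems likely, though the thread does not confirm it, that this gap is what produces "ERROR: No dictionary".

```python
def missing_dictionaries(cfgmap_rows, registered):
    """Return dictionary names used in cfgmap_rows but absent from registered.

    Names are reported once each, in the order they are first seen.
    Hypothetical helper for illustration only.
    """
    missing = []
    for _lex_alias, dict_names in cfgmap_rows:
        for name in dict_names:
            if name not in registered and name not in missing:
                missing.append(name)
    return missing


# (lex_alias, dict_name) rows for ts_name = 'ua', copied from the thread.
UA_CFGMAP = [
    ("file", ["simple"]),
    ("lhword", ["ispell_ua", "ua_stem"]),
    ("lpart_hword", ["ispell_ua", "ua_stem"]),
    ("lword", ["ispell_ua", "ua_stem"]),
    ("uint", ["simple"]),
    ("version", ["simple"]),
]

# Assumption: only these dictionaries exist in pg_ts_dict.
REGISTERED = {"simple", "ispell_ua"}

print(missing_dictionaries(UA_CFGMAP, REGISTERED))
```

With the data from this thread the only name reported is ua_stem, which matches the answer given above.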
sector119@mail.ru wrote:
>> Did you add ua_stem or not?
>
> nope :( I do not know how to add it...
> Do I have to do it the same way as when I added the ispell_ua dict?
>

No.

You should get a stemmer from snowball.tartarus.org (or write a new one).
Then place ua_stem.h and ua_stem.c in the tsearch/snowball directory,
and edit tsearch/Makefile and tsearch/dict_snowball.c.

Unfortunately, you will have to do it yourself; I don't know Ukrainian.

--
Teodor Sigaev
E-mail: teodor@sigaev.ru
On Tuesday 17 June 2003 09:06, Teodor Sigaev wrote:
> sector119@mail.ru wrote:
> >> Did you add ua_stem or not?
> >
> > nope :( I do not know how to add it...
> > Do I have to do it the same way as when I added the ispell_ua dict?
>
> No.
>
> You should get a stemmer from snowball.tartarus.org (or write a new one).
> Then place ua_stem.h and ua_stem.c in the tsearch/snowball directory,
> and edit tsearch/Makefile and tsearch/dict_snowball.c.
>
> Unfortunately, you will have to do it yourself; I don't know Ukrainian.

There is apparently no Ukrainian stemming algorithm available yet. So if you need one, you will have to write it yourself.

In the meantime ... you could just use the simple stemming until you write a ua_stem ... or someone else does.

Andy