Thread: ispell dictionary broken in CVS HEAD?
Hi there,

seems something is broken in ispell dictionary (CVS HEAD).

event=# CREATE TEXT SEARCH DICTIONARY en_ispell (
    TEMPLATE = ispell,
    DictFile = english,
    AffFile = english,
    StopWords = english
);
CREATE TEXT SEARCH DICTIONARY
event=# select ts_lexize('en_ispell','stars');
 ts_lexize
-----------

But ispell does know 'stars':

zen:~/app/pgsql/pgweb>ispell
@(#) International Ispell Version 3.2.06 08/01/01
word: stars
ok (derives from root STAR)

Checked in tsearch2 (8.2.4):

apod=# insert into pg_ts_dict
       (SELECT 'en_ispell', dict_init,
               'DictFile="/usr/local/share/dicts/ispell/utf8/english-utf8.dict",'
               'AffFile="/usr/local/share/dicts/ispell/utf8/english-utf8.aff",'
               'StopFile="/usr/local/share/dicts/ispell/utf8/english-utf8.stop"',
               dict_lexize
        FROM pg_ts_dict WHERE dict_name = 'ispell_template');
apod=# select lexize('en_ispell','stars');
 lexize
--------
 {star}

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83
Oleg Bartunov wrote:
> seems something is broken in ispell dictionary (CVS HEAD).
>
> event=# CREATE TEXT SEARCH DICTIONARY en_ispell (
>     TEMPLATE = ispell,
>     DictFile = english,
>     AffFile = english,
>     StopWords = english
> );
> CREATE TEXT SEARCH DICTIONARY
> event=# select ts_lexize('en_ispell','stars');
>  ts_lexize
> -----------
>
>
> But ispell does know 'stars'

Works for me, with the affix file from "iamerican" debian package, and a
dictionary containing just "star/S". Which ispell files are you using?
Can you tar them up and send them over?

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
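To make the "star/S" test case concrete, here is a minimal Python sketch of how a dictionary entry with an affix flag can map 'stars' back to 'star'. It is not the tsearch implementation: the rule table is a simplified assumption modeled on the plural rules of the classic ispell English affix file, and the names SUFFIX_RULES, DICTIONARY, expand and lexize are hypothetical.

```python
import re

# Assumed, simplified suffix rules for flag "S":
# (condition on the stem ending, text to strip, text to append).
SUFFIX_RULES = {
    "S": [
        (r"[^aeiou]y$", "y", "ies"),  # e.g. imply -> implies
        (r"[aeiou]y$", "", "s"),      # e.g. convey -> conveys
        (r"[sxzh]$", "", "es"),       # e.g. fix -> fixes
        (r"[^sxzhy]$", "", "s"),      # e.g. star -> stars
    ],
}

# A one-entry dictionary, corresponding to a .dict line "star/S".
DICTIONARY = {"star": {"S"}}


def expand(stem, flag):
    """Apply the first matching rule of `flag` to `stem`; None if no rule fits."""
    for condition, strip, append in SUFFIX_RULES.get(flag, []):
        if re.search(condition, stem):
            base = stem[: len(stem) - len(strip)] if strip else stem
            return base + append
    return None


def lexize(word):
    """Return the stems that are equal to `word` or inflect to it."""
    word = word.lower()
    stems = []
    for stem, flags in DICTIONARY.items():
        if word == stem or any(expand(stem, f) == word for f in flags):
            stems.append(stem)
    return stems


print(lexize("stars"))  # ['star']
print(lexize("moon"))   # []
```

The sketch expands stems forward for brevity; a real lexizer would more plausibly strip candidate affixes from the input word and then look the result up in the dictionary.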
"Heikki Linnakangas" <heikki@enterprisedb.com> writes:
> Oleg Bartunov wrote:
>> seems something is broken in ispell dictionary (CVS HEAD).

> Works for me, with the affix file from "iamerican" debian package, and a
> dictionary containing just "star/S". Which ispell files are you using?

Is anyone working on providing basic regression tests for the different
dictionary types? Seems like the main stumbling block is providing
usable configuration files. I don't know enough about ispell to
understand what its config files look like. (There's a problem of
missing documentation here, too...)

regards, tom lane
Hmm,

after starting a new session, the ispell dictionary works. I don't
remember exactly whether this behaviour is what we wanted.

Oleg

On Sun, 9 Sep 2007, Heikki Linnakangas wrote:

> Works for me, with the affix file from "iamerican" debian package, and a
> dictionary containing just "star/S". Which ispell files are you using?
> Can you tar them up and send them over?
> Is anyone working on providing basic regression tests for the different
> dictionary types? Seems like the main stumbling block is providing

I'll write some tests for the dictionaries, but they will use a synthetic
dictionary. The original ispell files are rather big, so I'll make a
simple, small one.

--
Teodor Sigaev
E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/
Tom Lane wrote:
> I don't know enough about ispell to
> understand what its config files look like. (There's a problem of
> missing documentation here, too...)

Yeah :(. The file format that ispell accepts is kind of ad hoc. It
accepts hunspell and ispell and myspell variants, but only a subset of
the full grammar (some stuff is not relevant for tsearch). A description
of what exactly it's supposed to accept would be nice.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
> Is anyone working on providing basic regression tests for the different
> dictionary types? Seems like the main stumbling block is providing

I made some small test files (http://www.sigaev.ru/misc/ispell_samples.tgz).
So, what is the better practice for building them in? Make them
installable by the regular procedure into share/tsearch_data, or install
them only for the regression db?

--
Teodor Sigaev
E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/
Teodor Sigaev <teodor@sigaev.ru> writes:
>> Is anyone working on providing basic regression tests for the different
>> dictionary types? Seems like the main stumbling block is providing

> I make some small tests (http://www.sigaev.ru/misc/ispell_samples.tgz). So,
> what is better practice to builtin it? Make it installable with regular
> procedure into share/tsearch_data or install they only for regression db?

You have to install them as part of the regular installation;
pg_regress can't do it in the "make installcheck" case because it
may not have write permissions on $SHAREDIR.

regards, tom lane
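The installation constraint can be illustrated with a small Python sketch. It assumes the file-name convention described in the PostgreSQL text search documentation (DictFile, AffFile and StopWords resolve to .dict, .affix and .stop files under $SHAREDIR/tsearch_data); the helper names and the share directory path in the usage lines are hypothetical.

```python
import os

# File extensions assumed for each ispell dictionary parameter.
EXTENSIONS = {"DictFile": "dict", "AffFile": "affix", "StopWords": "stop"}


def config_file_path(sharedir, param, basename):
    """Build the path a parameter such as AffFile = english would refer to."""
    filename = f"{basename}.{EXTENSIONS[param]}"
    return os.path.join(sharedir, "tsearch_data", filename)


def can_install(sharedir):
    """True if the current user may write into sharedir/tsearch_data."""
    return os.access(os.path.join(sharedir, "tsearch_data"), os.W_OK)


# Hypothetical share directory; a real one is reported by `pg_config --sharedir`.
sharedir = "/usr/local/pgsql/share"
for param in EXTENSIONS:
    print(config_file_path(sharedir, param, "english"))
print("writable by this user:", can_install(sharedir))
```

When can_install() is false for the user running the tests, sample files cannot be dropped in at test time, which is the situation described for "make installcheck".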