Thread: Re: [OpenFTS-general] AW: tsearch2, ispell, utf-8 and german special characters
From: "Markus Wollny"
Date:
Hi!

> -----Original Message-----
> From: openfts-general-admin@lists.sourceforge.net
> [mailto:openfts-general-admin@lists.sourceforge.net] On behalf of Markus Wollny
> Sent: Wednesday, 21 July 2004 17:04
> To: Oleg Bartunov
> Cc: pgsql-general@postgresql.org; openfts-general@lists.sourceforge.net
> Subject: [OpenFTS-general] AW: [GENERAL] tsearch2, ispell, utf-8 and german special characters

> The issue with the unrecognized stop-word 'ein' which is
> converted by to_tsvector to 'eint' remains however. Now
> here's as much detail as I can provide:
>
> Ispell is Version 3.1.20 10/10/95, patch 1.

I've just upgraded Ispell to the latest version (International Ispell Version 3.2.06 08/01/01), but that didn't help; by now I think it might have something to do with a German language peculiarity or with something in the German dictionary. In german.med, there is an entry

eint/EGPVWX

So the ts_vector output looks a bit like a wrong guess. Doesn't it evaluate the stop-word list first, before doing the lookup in the Ispell dictionary?

Kind regards

Markus Wollny
On Wed, 21 Jul 2004, Markus Wollny wrote:

> Hi!
>
> > -----Original Message-----
> > From: openfts-general-admin@lists.sourceforge.net
> > [mailto:openfts-general-admin@lists.sourceforge.net] On behalf of Markus Wollny
> > Sent: Wednesday, 21 July 2004 17:04
> > To: Oleg Bartunov
> > Cc: pgsql-general@postgresql.org; openfts-general@lists.sourceforge.net
> > Subject: [OpenFTS-general] AW: [GENERAL] tsearch2, ispell, utf-8 and german special characters
>
> > The issue with the unrecognized stop-word 'ein' which is
> > converted by to_tsvector to 'eint' remains however. Now
> > here's as much detail as I can provide:
> >
> > Ispell is Version 3.1.20 10/10/95, patch 1.
>
> I've just upgraded Ispell to the latest version (International Ispell
> Version 3.2.06 08/01/01), but that didn't help; by now I think it might
> have something to do with a German language peculiarity or with
> something in the German dictionary. In german.med, there is an entry

ispell itself is not used in tsearch2, only its dict and aff files!

> eint/EGPVWX
>
> So the ts_vector output looks a bit like a wrong guess. Doesn't it
> evaluate the stop-word list first, before doing the lookup in the
> Ispell dictionary?

Yes. There is a very useful function for debugging that I always recommend: ts_debug. See my notes (http://www.sai.msu.su/~megera/oddmuse/index.cgi/Tsearch_V2_Notes) for examples.

> Kind regards
>
> Markus Wollny

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83
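
For readers following along, the ts_debug suggestion above might be tried roughly as sketched here. This assumes the tsearch2 contrib module of that era is installed; the configuration name 'default_german' and the dictionary name 'de_ispell' are hypothetical placeholders that do not appear in this thread and should be replaced with whatever is actually set up locally.

```sql
-- Assumption: 'default_german' is a placeholder for the local tsearch2
-- configuration that uses the German ispell dictionary.
SELECT set_curcfg('default_german');

-- Show, token by token, which dictionaries are consulted for the current
-- configuration and what each token is turned into.
SELECT * FROM ts_debug('ein');

-- Ask one dictionary directly what it returns for the word.
-- Assumption: 'de_ispell' is a placeholder dictionary name.
SELECT lexize('de_ispell', 'ein');

-- The call from the original report, with the configuration named explicitly.
SELECT to_tsvector('default_german', 'ein');
```

Comparing the lexize result with the ts_debug row for the same word should show whether 'eint' comes from the ispell dictionary itself or from a later step. In PostgreSQL 8.3 and later, where text search is built in, the corresponding functions are ts_debug(config, text) and ts_lexize(dict, text).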