Thread: Tsearch2 / PG 8.2 Which stemmer files?

Tsearch2 / PG 8.2 Which stemmer files?

From
Hannes Dorbath
Date:
Which stemmer files is one supposed to use with 8.2 Tsearch2?

Trying to compile the output from Gendict with:

stem_UTF_8_german.c
stem_UTF_8_german.h

from:

http://snowball.tartarus.org/dist/libstemmer_c.tgz

gives:

http://hannes.imos.net/make.txt


Thanks!


--
Regards,
Hannes Dorbath

Re: Tsearch2 / PG 8.2 Which stemmer files?

From
Hannes Dorbath
Date:
On 07.12.2006 12:42, Hannes Dorbath wrote:
> Which stemmer files is one supposed to use with 8.2 Tsearch2?

Found an answer myself. Seems I need:

http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/tsearch_snowball_82.gz


--
Regards,
Hannes Dorbath

Re: Tsearch2 / PG 8.2 Which stemmer files?

From
Oleg Bartunov
Date:
Hannes,

Please download the patch tsearch_snowball_82.gz from
http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/
which updates the API to Snowball.

Oleg
On Thu, 7 Dec 2006, Hannes Dorbath wrote:

> Which stemmer files is one supposed to use with 8.2 Tsearch2?
>
> Trying to compile the output from Gendict with:
>
> stem_UTF_8_german.c
> stem_UTF_8_german.h
>
> from:
>
> http://snowball.tartarus.org/dist/libstemmer_c.tgz
>
> gives:
>
> http://hannes.imos.net/make.txt
>
>
> Thanks!
>
>
>

     Regards,
         Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83

Re: Tsearch2 / PG 8.2 Which stemmer files?

From
Hannes Dorbath
Date:
Thank you Oleg.

I am having some more trouble migrating from 8.1.5 with TSearch2 + GIN/UTF-8
to PG 8.2.

First I tried to use my existing dict and affix files, which triggered the
oldFormat condition, so I started again from scratch. What I can't get
working again is compound word support for German.

What I did:

1. Download the OpenOffice dictionary from http://j3e.de/hunspell/de_DE.zip
2. Extract de_DE.dic
3. Run compound.pl on de_DE.dic
4. Put the modified de_DE.dic back in the zip and run my2ispell on the files in it
5. Convert both resulting files to UTF-8
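
The conversion in step 5 could be sketched roughly like this in Python. This
is only an illustration: it assumes the files coming out of my2ispell are
ISO-8859-1 (check this first), and convert_to_utf8 is a hypothetical helper
name, not part of Tsearch2 or my2ispell.

```python
from pathlib import Path


def convert_to_utf8(src, dst, source_encoding="iso-8859-1"):
    """Re-encode one dictionary/affix file to UTF-8.

    source_encoding is an assumption; adjust it if the files use
    a different charset. Returns the number of characters converted.
    """
    # Work on raw bytes so that line endings are preserved exactly.
    raw = Path(src).read_bytes()
    text = raw.decode(source_encoding)
    Path(dst).write_bytes(text.encode("utf-8"))
    return len(text)


# Example call (file names are placeholders):
#   convert_to_utf8("german.dict", "german.utf8.dict")
#   convert_to_utf8("german.aff", "german.utf8.aff")
```

Writing to a separate output file keeps the originals around, which helps
when comparing the results against the old 8.1 files.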

Do I need to hack compound.pl to do something different now that the affix
format has changed?

I'd really appreciate any hint.

Thanks!


On 07.12.2006 14:52, Oleg Bartunov wrote:
> please download patch tsearch_snowball_82.gz
> http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/
> which updates API to snowball.

--
Regards,
Hannes Dorbath