Re: [PROPOSAL] Improvements of Hunspell dictionaries support - Mailing list pgsql-hackers

From Emre Hasegeli
Subject Re: [PROPOSAL] Improvements of Hunspell dictionaries support
Date
Msg-id CAE2gYzwom3=11U9G8ZxMT5PLkZrwb12BWzxh4dB3HUd89FOSrg@mail.gmail.com
In response to Re: [PROPOSAL] Improvements of Hunspell dictionaries support  (Artur Zakirov <a.zakirov@postgrespro.ru>)
Responses Re: [PROPOSAL] Improvements of Hunspell dictionaries support  (Artur Zakirov <a.zakirov@postgrespro.ru>)
List pgsql-hackers
Thank you for working on this.

I tried the patch with a Turkish dictionary [1] I found on the
Internet.  It worked for some words, but not for others:

> hasegeli=# create text search dictionary hunspell_tr (template = ispell, dictfile = tr, afffile = tr);
> CREATE TEXT SEARCH DICTIONARY
>
> hasegeli=# select ts_lexize('hunspell_tr', 'tilki'); -- The root "fox"
> ts_lexize
> -----------
> {tilki}
> (1 row)
>
> hasegeli=# select ts_lexize('hunspell_tr', 'tilkinin'); -- Genitive form, affix 3290
> ts_lexize
> -----------
> {tilki}
> (1 row)
>
> hasegeli=# select ts_lexize('hunspell_tr', 'tilkiler'); -- Plural form, affix 4371
> ts_lexize
> -----------
> {tilki}
> (1 row)
>
> hasegeli=# select ts_lexize('hunspell_tr', 'tilkiyi'); -- Accusative form, affix 2646
> ts_lexize
> -----------
>
> (1 row)

It seems to have something to do with the order of the affixes.  The
accusative form works if I move affix 2646 to the beginning of the list.
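
The four checks above could also be run as one query, which makes it
easier to re-test after reordering the affix file.  This is only a
sketch, assuming the hunspell_tr dictionary created above already
exists; the column aliases are arbitrary:

```sql
-- Assumes the hunspell_tr dictionary from the session above exists.
-- Returns one row per word form with the lexemes the dictionary produced.
SELECT word, ts_lexize('hunspell_tr', word) AS lexemes
FROM unnest(ARRAY['tilki', 'tilkinin', 'tilkiler', 'tilkiyi']) AS word;
```

A NULL in the lexemes column marks a form the dictionary did not
recognize, which is what the accusative form shows above.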

[1] https://tr-spell.googlecode.com/files/dict_aff_5000_suffix_1130000_words.zip


