Re: [PROPOSAL] Improvements of Hunspell dictionaries support - Mailing list pgsql-hackers

From Artur Zakirov
Subject Re: [PROPOSAL] Improvements of Hunspell dictionaries support
Date
Msg-id 563C73F7.5000202@postgrespro.ru
In response to [PROPOSAL] Improvements of Hunspell dictionaries support  (Artur Zakirov <a.zakirov@postgrespro.ru>)
List pgsql-hackers
Hello again!

Patches
=======

I have implemented support for the FLAG long, FLAG num and AF parameters.
The patch is attached to this e-mail (hunspell-dict.patch).

This patch makes it possible to use the Hunspell dictionaries listed in
the previous e-mail: ar, br_fr, ca, ca_valencia, en_ca, en_gb, en_us, fr,
gl_es, hu_hu, is, ne_np, nl_nl, si_lk.

Most of the changes are in spell.c, in the affix file parsing code.
The dictionary structures changed as follows:
- the useFlagAliases and flagMode fields have been added to the IspellDict
struct;
- the flagval array size has been increased from 256 to 65000;
- the flag field of the AFFIX struct has also been widened.
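
To illustrate why the flag storage has to grow, here is a minimal sketch
of decoding one flag under each flag mode. It is not the code from the
patch: the names FlagMode, FLAG_MAX and next_flag are hypothetical, and
packing a "long" flag as (first byte << 8) | second byte is an assumption
made for this example. AF alias lookup is not covered.

```c
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical names for illustration; not the identifiers used in the patch. */
typedef enum
{
    FLAG_MODE_CHAR,             /* default: one character per flag */
    FLAG_MODE_LONG,             /* FLAG long: two characters per flag */
    FLAG_MODE_NUM               /* FLAG num: comma-separated decimal numbers */
} FlagMode;

#define FLAG_MAX 65000          /* matches the enlarged flagval array size */

/*
 * Decode the next flag from *s according to mode and advance *s past it.
 * Returns the flag value, or -1 at end of string or on malformed input
 * (in which case *s is left unchanged).
 */
static int
next_flag(const char **s, FlagMode mode)
{
    const unsigned char *p = (const unsigned char *) *s;
    int         val;

    if (*p == '\0')
        return -1;

    switch (mode)
    {
        case FLAG_MODE_CHAR:
            val = p[0];
            p += 1;
            break;

        case FLAG_MODE_LONG:
            if (p[1] == '\0')
                return -1;      /* a long flag needs two characters */
            /* assumed packing: first byte high, second byte low */
            val = (p[0] << 8) | p[1];
            p += 2;
            break;

        case FLAG_MODE_NUM:
            {
                char       *end;
                long        n;

                if (!isdigit(p[0]))
                    return -1;
                n = strtol((const char *) p, &end, 10);
                if (n <= 0 || n > FLAG_MAX)
                    return -1;
                val = (int) n;
                p = (const unsigned char *) end;
                if (*p == ',')
                    p++;        /* skip the separator between numbers */
                break;
            }

        default:
            return -1;
    }

    if (val > FLAG_MAX)
        return -1;              /* would not fit into the flag table */

    *s = (const char *) p;
    return val;
}
```

With this sketch, the flag string "101,65000" in FLAG num mode yields 101
and then 65000, values that cannot be indexed in a 256-element table.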

I have also implemented a patch that handles the error from the e-mail
http://www.postgresql.org/message-id/562E1073.8030805@postgrespro.ru
This patch simply ignores that error.

Tests
=====

Extension test dictionaries for loading into PostgreSQL and for
normalizing with the ts_lexize function can be downloaded from
https://dl.dropboxusercontent.com/u/15423817/HunspellDictTest.tar.gz

It would be nice if somebody could run additional tests on dictionaries
of well-known languages, because I do not know many of them.

Other Improvements
==================

There are also some parameters for compound words, but I am not sure
that we want to use these parameters.

--
Artur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
