On Fri, Apr 03, 2020 at 12:33:00PM +0900, Artur Zakirov wrote:
>Hello,
>
>On 4/2/2020 7:11 PM, PG Bug reporting form wrote:
>>postgres=# CREATE TEXT SEARCH DICTIONARY finnish_ispell ( TEMPLATE = ispell,
>>DictFile = fi_fi, AffFile = fi_fi, Stopwords = finnish);
>>ERROR: syntax error
>>CONTEXT: line 83 of configuration file
>>"/usr/pgsql-12/share/tsearch_data/fi_fi.affix": " I >
>>ALI\-
>>"
>
>Thank you for the email.
>
>It seems that here the backslash is used to escape the following
>character according to the comment for the following flag:
>
>>flag *E:
>> . > YLI # ylijohtaja
>> I > YLI\- # yli-inhimillinen
>
>Escaping a character is valid in the ispell format (see
>https://manpages.debian.org/testing/ispell/ispell.5.en.html):
>
>>Any character with special meaning to the parser can be changed to an uninterpreted token by backslashing it
>
>I also looked for a Hunspell Finnish dictionary, but I didn't find
>one. I found only a postgres extension:
>https://github.com/Houston-Inc/dict_voikko
>
>
>I think it is possible to fix the postgres parser. But I'm not sure
>whether we should do that.
>
I'm not sure if it's a valid ispell format (it might be, but I'm not
very good at reading the ispell manpage). But if it is, we should fix
the code to be able to read it.

>At first sight it is necessary to fix parse_affentry().
>
Right, that seems like the place to fix. It seems we don't expect an
escaped '-' while in the PAE_INREPL state. I wonder if there are other
things we fail to support ...
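
To illustrate the kind of change I mean, here is a rough standalone
sketch. It is not the actual parse_affentry() code: parse_repl() and the
state names are made up for this example, and it assumes plain ASCII
letters and a simplified "MASK > REPL # comment" rule. The only point is
that a backslash makes the next character literal while reading the
replacement.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical, simplified states; only loosely modelled on PAE_*. */
typedef enum { WAIT_REPL, IN_REPL, DONE } repl_state;

/*
 * Copy the replacement part of an ispell-style affix rule
 * ("MASK > REPL # comment") into out. A backslash makes the next
 * character literal. Returns 0 on success, -1 on a syntax error.
 */
int
parse_repl(const char *line, char *out, size_t outsize)
{
    const char *p = line;
    size_t      n = 0;
    repl_state  state = WAIT_REPL;

    if (outsize == 0)
        return -1;

    /* Skip the mask: everything up to and including '>'. */
    while (*p && *p != '>')
        p++;
    if (*p != '>')
        return -1;
    p++;

    for (; *p && state != DONE; p++)
    {
        unsigned char c = (unsigned char) *p;

        if (c == '#')
            state = DONE;           /* comment starts, rule is over */
        else if (isspace(c))
        {
            if (state == IN_REPL)
                state = DONE;       /* whitespace ends the replacement */
        }
        else
        {
            if (c == '\\')
            {
                /* Escaped character: take the next byte as-is. */
                p++;
                c = (unsigned char) *p;
                if (c == '\0')
                    return -1;      /* dangling backslash */
            }
            else if (!isalpha(c))
                return -1;          /* unescaped non-letter: syntax error */

            if (n + 1 >= outsize)
                return -1;          /* output buffer too small */
            out[n++] = (char) c;
            state = IN_REPL;
        }
    }

    out[n] = '\0';
    return (n > 0) ? 0 : -1;
}
```

With this sketch, the line from the report (" I > YLI\-") yields the
replacement "YLI-", while an unescaped "YLI-" is still rejected as a
syntax error.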

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services