Re: BUG #16337: Finnish Ispell dictionary cannot be created - Mailing list pgsql-bugs

From Artur Zakirov
Subject Re: BUG #16337: Finnish Ispell dictionary cannot be created
Date
Msg-id CAKNkYnxeHJJDkw3_s908oMgiv4pn0ODkqGXUxME0FvMDxhu0=g@mail.gmail.com
In response to Re: BUG #16337: Finnish Ispell dictionary cannot be created  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Responses Re: BUG #16337: Finnish Ispell dictionary cannot be created
List pgsql-bugs
On Fri, Apr 3, 2020 at 5:55 PM Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
> I'm not sure if it's a valid ispell format (it might be, but I'm not
> very good in reading the ispell manpage). But if it is, we should fix
> the code to be able to read it.

I've attached a simple patch that fixes the PAE_INREPL state.

I don't fully understand the ispell manpage either. I've looked at the
ispell source code; it uses yacc for parsing. I'm not good at yacc, but
it seems that the escape symbol is handled in all fields. The patch,
however, fixes only the PAE_INREPL state.
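
To illustrate what escape handling in a single field means, here is a
minimal standalone sketch. It is not the code from the attached patch
and not taken from spell.c: the function name copy_escaped_field is
hypothetical, and it assumes that a field ends at unescaped whitespace
and that a backslash makes the next character literal.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (illustration only): copy one affix field from src
 * into dst, treating a backslash as an escape for the following character.
 * Assumption: an unescaped whitespace character ends the field.
 *
 * Returns true on success, false if dst is too small to hold the field.
 */
bool
copy_escaped_field(const char *src, char *dst, size_t dstlen)
{
    size_t  in = 0;
    size_t  out = 0;
    bool    escaped = false;

    assert(src != NULL && dst != NULL && dstlen > 0);

    while (src[in] != '\0')
    {
        char    c = src[in];

        if (escaped)
        {
            /* The previous character was a backslash: keep c literally. */
            escaped = false;
        }
        else if (c == '\\')
        {
            /* Remember the escape and drop the backslash itself. */
            escaped = true;
            in++;
            continue;
        }
        else if (strchr(" \t\r\n", c) != NULL)
        {
            /* Unescaped whitespace terminates the field. */
            break;
        }

        /* Leave room for the character and the terminating NUL. */
        if (out + 1 >= dstlen)
            return false;

        dst[out++] = c;
        in++;
    }

    dst[out] = '\0';
    return true;
}
```

With this sketch, the field written as YLI\- in the .aff file is copied
as YLI-, so the hyphen ends up in the affix text instead of being
interpreted by the parser.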

I also ran some tests with the ispell utility. For simplicity, I
modified the .aff file as follows:

flag *E:
    .           >     YLI
    .           >     YLI\-

And I got the following results:

word: ylijohdon
ok (derives from root JOHDON)

word: yli-johdon
ok (derives from root JOHDON)

word: yly-johdon
how about: yli-johdon

So hyphen escaping works. Here are the results for PostgreSQL with the
patch and the modified .aff file:

=# select ts_lexize('finnish_ispell', 'yli-johdon');
     ts_lexize
-------------------
 {johdon,johdossa}
=# select ts_lexize('finnish_ispell', 'ylijohdon');
     ts_lexize
-------------------
 {johdon,johdossa}

-- 
Artur

