Re: BUG #15347: Unaccent for greek characters does not work - Mailing list pgsql-bugs

From Thomas Munro
Subject Re: BUG #15347: Unaccent for greek characters does not work
Date
Msg-id CAEepm=0RUhOuvQs2LQnFYzR4GWHtn6wUT9UaKi+vC0erKW4=dw@mail.gmail.com
In response to Re: BUG #15347: Unaccent for greek characters does not work  (Tasos Maschalidis <TaS.O.S@hotmail.com>)
Responses Re: BUG #15347: Unaccent for greek characters does not work
Re: BUG #15347: Unaccent for greek characters does not work
List pgsql-bugs
On Fri, Aug 24, 2018 at 12:22 AM, Tasos Maschalidis <TaS.O.S@hotmail.com> wrote:
> return (codepoint.id >= ord('a') and codepoint.id <= ord('z')) or \
>            (codepoint.id >= ord('A') and codepoint.id <= ord('Z')) or \
>            (codepoint.id >= ord('α') and codepoint.id <= ord('ω')) or \
>            (codepoint.id >= ord('Α') and codepoint.id <= ord('Ω'))

Thank you.  Here it is in the form of a patch that I propose to commit
to PostgreSQL 12.  It adds 221 lines to unaccent.rules.  They look
sane to my untrained eye.  Do you agree?

Example of use:

postgres=# select unaccent('Θέμα: Re: BUG #15347: Unaccent for greek ...');
                   unaccent
----------------------------------------------
 Θεμα: Re: BUG #15347: Unaccent for greek ...
(1 row)
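
For readers who want to see why widening the letter ranges matters, here is a
minimal Python sketch of the idea. It is not the code from the patch and not
how the unaccent module works at run time (unaccent looks characters up in its
rules file). The helper names is_plain_letter and strip_marks are
hypothetical, chosen for illustration; the range check mirrors the quoted
snippet, and the stripping step assumes that a rule is only wanted when a
character decomposes into a plain base letter followed by combining marks.

```python
import unicodedata


def is_plain_letter(ch):
    """Range check mirroring the quoted snippet, applied to one character."""
    cp = ord(ch)
    return (ord('a') <= cp <= ord('z')) or \
           (ord('A') <= cp <= ord('Z')) or \
           (ord('α') <= cp <= ord('ω')) or \
           (ord('Α') <= cp <= ord('Ω'))


def strip_marks(text):
    """Replace each accented letter with its plain base letter.

    Assumption for this sketch: a character is rewritten only if its
    canonical decomposition (NFD) is a plain letter followed solely by
    combining marks. Everything else is passed through unchanged.
    """
    out = []
    for ch in text:
        decomposed = unicodedata.normalize('NFD', ch)
        base, marks = decomposed[0], decomposed[1:]
        if marks and is_plain_letter(base) and \
                all(unicodedata.combining(m) for m in marks):
            out.append(base)
        else:
            out.append(ch)
    return ''.join(out)


print(strip_marks('Θέμα: Re: BUG #15347'))  # Θεμα: Re: BUG #15347
```

Without the two Greek ranges in is_plain_letter, the base letter ε would not
count as plain, so έ would be left untouched in this sketch.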

I wondered if the documentation might need a change, but it already
says something broad enough: "A more complete example, which is
directly useful for most European languages, can be found in
unaccent.rules, ...".

--
Thomas Munro
http://www.enterprisedb.com
