Re: BUG #15347: Unaccent for greek characters does not work - Mailing list pgsql-bugs

From Michael Paquier
Subject Re: BUG #15347: Unaccent for greek characters does not work
Msg-id 20180828032040.GE29157@paquier.xyz
In response to Re: BUG #15347: Unaccent for greek characters does not work  (Thomas Munro <thomas.munro@enterprisedb.com>)
Responses Re: BUG #15347: Unaccent for greek characters does not work  (Thomas Munro <thomas.munro@enterprisedb.com>)
List pgsql-bugs
On Tue, Aug 28, 2018 at 10:50:38AM +1200, Thomas Munro wrote:
> Fair criticism, here's a version with comments.

Thanks, that's much better in my opinion.  On the subject of fancy
things, I discovered today the Python module unicodedata, with which
0x03b1 can for example be replaced by ord("\N{GREEK SMALL LETTER ALPHA}"),
leading to perhaps more readable code.
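
For illustration, a minimal sketch of that equivalence. The variable names
are hypothetical and nothing here is taken from the patch itself. Note that
the \N{...} escape is part of Python's string literal syntax, while
unicodedata.name() performs the reverse lookup:

```python
import unicodedata

# The \N{...} escape resolves a Unicode character name inside a string literal.
alpha = "\N{GREEK SMALL LETTER ALPHA}"
alpha_codepoint = ord(alpha)

# unicodedata.name() goes the other way: from a character to its name.
alpha_name = unicodedata.name("\u03b1")

# Both spellings refer to the same code point as the bare hex constant.
same_as_hex = alpha_codepoint == 0x03B1
```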

Joking aside, I would have preferred that you use the Unicode code
points directly, as those are easier to look up in UnicodeData.txt, say
'\u03B1' for small alpha.  If you want to keep the hex codes, it would
be a better reference to copy/paste the character names directly from
UnicodeData.txt, as those are easier to search for in the future,
perhaps together with their code points:
- GREEK SMALL LETTER ALPHA
- GREEK SMALL LETTER OMEGA
- GREEK CAPITAL LETTER ALPHA
- GREEK CAPITAL LETTER OMEGA
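
As a rough sketch, the four names above can be paired with their code
points using unicodedata.lookup().  The list and helper names below are
hypothetical and are not taken from generate_unaccent_rules.py:

```python
import unicodedata

# Character names copied verbatim from the list above.
GREEK_BOUNDARY_NAMES = [
    "GREEK SMALL LETTER ALPHA",
    "GREEK SMALL LETTER OMEGA",
    "GREEK CAPITAL LETTER ALPHA",
    "GREEK CAPITAL LETTER OMEGA",
]

def describe(name):
    """Return 'U+XXXX NAME' for a Unicode character name."""
    ch = unicodedata.lookup(name)  # raises KeyError if the name is unknown
    return "U+%04X %s" % (ord(ch), name)

boundaries = [describe(name) for name in GREEK_BOUNDARY_NAMES]

for line in boundaries:
    print(line)
```

Printing both the code point and the name keeps each entry searchable in
UnicodeData.txt either way.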

Running generate_unaccent_rules.py, I get the same result for
unaccent.rules as you do.
--
Michael
