Re: BUG #13440: unaccent does not remove all diacritics - Mailing list pgsql-bugs

From Léonard Benedetti
Subject Re: BUG #13440: unaccent does not remove all diacritics
Date
Msg-id 56C4E11D.5050809@mlpo.fr
In response to Re: BUG #13440: unaccent does not remove all diacritics  (Teodor Sigaev <teodor@sigaev.ru>)
Responses Re: BUG #13440: unaccent does not remove all diacritics
List pgsql-bugs
On 12/02/2016 17:44, Teodor Sigaev wrote:
> I'm inclined to commit this patch because it suggests a more regular
> way to update the unaccent rules. That is nice.
>
> But I have some remarks:
> 1 Is it possible not to restrict the generator script to Python 2?
> Python 2, it seems, will go away in the near future, and it will not be
> convenient to install Python 2 for a single task.

Yes, I agree, that makes sense; the script was originally written for
Python 2, but Python 2 is legacy. Moreover, adapting the script to
Python 3 seems trivial.
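
To give an idea of what a Python 3 version could look like, here is a
minimal sketch. It is not the script from the patch: the function names
strip_marks and write_rules and the tab-separated output format are
assumptions made for illustration only. It relies on two Python 3
features, print() as a function and str being Unicode by default.

```python
import io
import unicodedata


def strip_marks(char):
    """Decompose a character (NFD) and drop its combining marks."""
    decomposed = unicodedata.normalize("NFD", char)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def write_rules(chars, out):
    """Write one 'source<TAB>target' line per character that changes.

    The tab-separated layout is an assumption for this sketch.
    """
    for char in chars:
        plain = strip_marks(char)
        if plain != char:
            print(char, plain, sep="\t", file=out)


buffer = io.StringIO()
write_rules("éÀz", buffer)  # "z" has no diacritic, so no line is written for it
print(buffer.getvalue(), end="")
```

When writing to a real file, open(path, "w", encoding="utf-8") would
make the output encoding explicit instead of depending on the locale.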

> 2 As is easy to see, nowhere in the pgsql sources is UTF-8 encoding
> used, just ASCII. I don't see a reason to make an exception for this
> script.

First of all, the majority of the pgsql code is C, a language in which
the default source encoding is not the same everywhere (it may depend on
the locale settings or the compiler), so it is logical to use ASCII there.

On the other hand, UTF-8 as the source encoding is *a feature of
Python 3* (to quote the documentation: “The default encoding for Python
source code is UTF-8”), so there is no possible ambiguity and it will
not be a problem. That said, some non-ASCII characters could be removed
from the script's source code without any loss (I am thinking in
particular of "“" and "”"). For some comments, however, removing them
would be unfortunate (e.g. “# RegEx to parse rules (e.g. “Đ → D ; […]”)” or “# ℃
°C”).
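
As a rough illustration of why this is harmless in Python 3, the sketch
below puts non-ASCII characters directly in comments, string literals
and a regular expression, with no encoding declaration. The pattern
RULE_RE and the helper parse_rule are hypothetical; they are not the
regular expression used in the patch, only a guess at the shape of a
rule line such as “Đ → D ;”.

```python
import re

# Hypothetical pattern for rule lines such as "Đ → D ;"
# (an assumption for illustration, not the patch's actual regex).
RULE_RE = re.compile(r"^(?P<src>\S+)\s*→\s*(?P<dst>[^;]*?)\s*;")


def parse_rule(line):
    """Return (source, replacement) for a rule line, or None if no match."""
    match = RULE_RE.match(line.strip())
    if match is None:
        return None
    return match.group("src"), match.group("dst")


print(parse_rule("Đ → D ;"))
print(parse_rule("℃ → °C ;"))
```

Since Python 3 reads the file as UTF-8 by default, the “→” in the
pattern and the characters in the comments need no special handling.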

>
> Thank you.
>

Thus, I propose to adapt the code to Python 3 (for the reasons above,
the encoding of the script does not seem to be a problem). I will try to
do it shortly.

Thank you for your feedback.

Léonard Benedetti



