Re: BUG #15548: Unaccent does not remove combining diacritical characters - Mailing list pgsql-bugs

From Hugh Ranalli
Subject Re: BUG #15548: Unaccent does not remove combining diacritical characters
Date
Msg-id CAAhbUMMzPERSe3KfKKQfR4COJCZSrss1G7KRyUraYJyvrVyOUg@mail.gmail.com
In response to Re: BUG #15548: Unaccent does not remove combining diacritical characters  (Thomas Munro <thomas.munro@enterprisedb.com>)
Responses Re: BUG #15548: Unaccent does not remove combining diacritical characters
List pgsql-bugs
On Mon, 17 Dec 2018 at 23:05, Thomas Munro <thomas.munro@enterprisedb.com> wrote:
+ʹ    '
+ʺ    "
+ʻ    '
+ʼ    '
+ʽ    '
+˂    <
+˃    >
+˄    ^
+ˆ    ^
+ˈ    '
+ˋ    `
+ː    :
+˖    +
+˗    -
+˜    ~
These aren't the combining codepoints. They're new substitutions defined in r34 of the Latin-ASCII transliteration file. I had wondered about those, too, and did some testing.
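
For anyone who wants to check this distinction, here is a minimal sketch using only Python's standard `unicodedata` module. The names `QUOTED`, `is_spacing_modifier`, and `is_combining_diacritical` are made up for illustration and are not part of generate_unaccent_rules.py. The block ranges are the Unicode ones: Spacing Modifier Letters at U+02B0..U+02FF and Combining Diacritical Marks at U+0300..U+036F.

```python
import unicodedata

# Left-hand characters copied from the quoted diff above.
QUOTED = "ʹʺʻʼʽ˂˃˄ˆˈˋː˖˗˜"


def is_spacing_modifier(ch: str) -> bool:
    """True if ch is in the Spacing Modifier Letters block (U+02B0..U+02FF)."""
    return 0x02B0 <= ord(ch) <= 0x02FF


def is_combining_diacritical(ch: str) -> bool:
    """True if ch is in the Combining Diacritical Marks block (U+0300..U+036F)."""
    return 0x0300 <= ord(ch) <= 0x036F


if __name__ == "__main__":
    for ch in QUOTED:
        # A canonical combining class of 0 means the character does not
        # attach to a preceding base character.
        print(
            f"U+{ord(ch):04X} {unicodedata.name(ch)}: "
            f"category={unicodedata.category(ch)}, "
            f"combining_class={unicodedata.combining(ch)}, "
            f"spacing_modifier={is_spacing_modifier(ch)}, "
            f"combining_diacritical={is_combining_diacritical(ch)}"
        )
```

Every character in the quoted hunk should report a combining class of 0 and fall in the spacing-modifier range, whereas a true combining mark such as U+0301 (COMBINING ACUTE ACCENT) has a non-zero combining class.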

I don't think this is quite right. 

However, you are correct that something isn't right. While testing why I was getting different output, I had reverted to the generate_unaccent_rules.py from BEFORE my changes, and then applied my update for the transliteration file format to that reverted version. The patch for generate_unaccent_rules.py should still be good, but the generated rules file didn't include the combining diacriticals. While regenerating it, I want to double-check some of the additions before re-submitting.

On Mon, 17 Dec 2018 at 23:57, Michael Paquier <michael@paquier.xyz> wrote:
Could you also add some tests in contrib/unaccent/sql/unaccent.sql at
the same time?  That would be nice to check easily the extent of the
patches proposed on this thread.

That makes sense. I'm happy to do that. Let me look at that file and see how extensive the other changes (encoding and removal of special characters) would be.

Hugh
