Re: BUG #15548: Unaccent does not remove combining diacritical characters - Mailing list pgsql-bugs

From Michael Paquier
Subject Re: BUG #15548: Unaccent does not remove combining diacritical characters
Date
Msg-id 20181218060735.GL1532@paquier.xyz
In response to Re: BUG #15548: Unaccent does not remove combining diacritical characters  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: BUG #15548: Unaccent does not remove combining diacritical characters
List pgsql-bugs
On Tue, Dec 18, 2018 at 12:36:02AM -0500, Tom Lane wrote:
> tl;dr: I think we should convert unaccent.sql and unaccent.out
> to UTF8 encoding.  Then, adding more test cases for this patch
> will be easy.

Do you think that we could also remove the non-ASCII characters from the
tests?  It would be easy enough to use E'\xNN' (UTF-8 hex) or such in the
input, and show the output as bytea.  That's harder to read, but the last
time this code was touched (5e8d670) we discussed not using UTF-8 in the
Python script so that folks with simple terminals can still touch the code.
The characters used could be documented as comments in the tests.
--
Michael
