Re: BUG #18362: unaccent rules and Old Greek text - Mailing list pgsql-bugs

From	Michael Paquier
Subject	Re: BUG #18362: unaccent rules and Old Greek text
Date	May 23 09:21:02
Msg-id	Zk7gTggGrBFnFwGl@paquier.xyz
In response to	Re: BUG #18362: unaccent rules and Old Greek text (Robert Haas <robertmhaas@gmail.com>)
Responses	Re: BUG #18362: unaccent rules and Old Greek text
List	pgsql-bugs

On Wed, May 22, 2024 at 12:47:37PM -0400, Robert Haas wrote:
> On Sat, May 18, 2024 at 5:37 AM Thomas Munro <thomas.munro@gmail.com> wrote:
>> And in the tests I now see that Michael had already figured that out!
>> I've included a kludge to remove that.  Someone should file a ticket with CLDR.

That was some time ago.  I was not sure back then how to handle that
with upstream data, so thanks for the bug report and the pointers.
I'll try to remember that.

> I think you should update the comment that says "a mistake?" to
> instead link to the CLDR issue that Peter filed. Other than that, I'm
> not sure this needs any other changes. I can't actually testify to the
> correctness of the Python code, but the results look sane so hey, why
> not?

+1 for the comment refresh in the test, keeping the test.

+                if src == "ℌ":
+                    # a mistake?
+                    continue

Perhaps this should use the codepoint rather than the non-ASCII
character in the script.
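
As a rough sketch of what I mean (not the actual patch), the check
could compare against an escape for U+210C, which is what "ℌ" is.
The names SKIPPED_SRC and should_skip() are made up for illustration,
and I am assuming src is a plain Python string as in the hunk above:

```python
import unicodedata

# U+210C BLACK-LETTER CAPITAL H, written as an escape so that the
# script itself stays ASCII-only.  SKIPPED_SRC is a hypothetical name.
SKIPPED_SRC = "\u210c"


def should_skip(src):
    """Return True if src is the one source character to skip."""
    return src == SKIPPED_SRC


if __name__ == "__main__":
    # Show which character the escape stands for.
    print("U+%04X %s" % (ord(SKIPPED_SRC), unicodedata.name(SKIPPED_SRC)))
```

The comment next to the constant is then the place to reference the
CLDR issue, and the file does not depend on how an editor or terminal
renders the character.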

Another thing would be to add some tests that cover the new characters
that get a conversion.  Just a few of them in the new ranges, checking
the recursive case with is_letter_with_marks(), would be fine.
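
To illustrate what I mean by the recursive case, here is a standalone
sketch using Python's unicodedata module.  It is not the code from
generate_unaccent_rules.py; base_letter() is a hypothetical helper
that only mimics the idea of stripping marks one decomposition level
at a time.  Polytonic Greek characters tend to need several levels:

```python
import unicodedata


def base_letter(ch):
    """Recursively strip combining marks using canonical decompositions.

    Hypothetical helper for illustration only: it follows the first
    codepoint of each canonical decomposition as long as all the
    remaining codepoints are combining marks.
    """
    decomp = unicodedata.decomposition(ch)
    # No decomposition, or a compatibility one such as "<font> 0048".
    if not decomp or decomp.startswith("<"):
        return ch
    parts = [chr(int(p, 16)) for p in decomp.split()]
    first, rest = parts[0], parts[1:]
    # Recurse only when everything after the first codepoint is a mark.
    if all(unicodedata.category(c).startswith("M") for c in rest):
        return base_letter(first)
    return ch


if __name__ == "__main__":
    # U+1F85: alpha with dasia, oxia and ypogegrammeni.
    for ch in ("\u00e9", "\u1f85"):
        print("U+%04X -> U+%04X" % (ord(ch), ord(base_letter(ch))))
```

A character like U+1F85 only reaches plain alpha (U+03B1) after more
than one step, so a couple of inputs of that kind in the regression
tests would exercise the recursion.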
 --
Michael
