Re: BUG #18362: unaccent rules and Old Greek text - Mailing list pgsql-bugs

From Peter Eisentraut
Subject Re: BUG #18362: unaccent rules and Old Greek text
Date
Msg-id e38dd877-3e76-47bd-8fa5-f079637c5616@eisentraut.org
In response to Re: BUG #18362: unaccent rules and Old Greek text  (Thomas Munro <thomas.munro@gmail.com>)
List pgsql-bugs
On 18.05.24 11:36, Thomas Munro wrote:
>>> WARNING:  duplicate source strings, first one will be used
>>>
>>> so it will need some adjustments in how the rules are produced.
>>
>> OK. Does anyone want to look into that?
> 
> I think the problem is that the new "simple redirection" rule from the
> Unicode database produces some values that are also present in
> Latin-ASCII.xml, and these are all tolerated as long as the "from" and
> "to" strings both match, because we uniquify them as pairs.  But there
> is one pair where the "to" string is different, resulting in this
> clash:
> 
> ℌ      x
> ℌ      H
> 
> I think the first line might actually be a bug in CLDR data.  I dunno,
> but this just doesn't look right:
> 
> ℌ → x ; # 210C;BLACK-LETTER CAPITAL H (compat)
> 
> And in the tests I now see that Michael had already figured that out!
> I've included a kludge to remove that.  Someone should file a ticket with CLDR.

Done: https://unicode-org.atlassian.net/browse/CLDR-17656
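
For readers of the archive, the clash described in the quoted text can be illustrated with a minimal sketch. It is not the actual rule-generation script. The function name find_conflicts and the sample rule lists are hypothetical; the only rules taken from this thread are the two ℌ entries. It assumes rules are plain (source, target) string pairs, deduplicates exact pairs, and then reports any source string that still maps to more than one target.

```python
from collections import defaultdict


def find_conflicts(rules):
    """Return {source: [targets...]} for sources mapped to more than one target.

    Exact duplicate (source, target) pairs are collapsed first, so a rule
    that appears in both inputs with the same target is not a conflict.
    """
    targets = defaultdict(list)
    # dict.fromkeys() deduplicates pairs while keeping first-seen order.
    for src, dst in dict.fromkeys(rules):
        targets[src].append(dst)
    return {src: dsts for src, dsts in targets.items() if len(dsts) > 1}


# Hypothetical sample data. Only the U+210C entries come from the thread;
# the U+00E9 entries are made up to show a harmless exact duplicate.
cldr_rules = [("\u210c", "x"), ("\u00e9", "e")]
unicode_rules = [("\u210c", "H"), ("\u00e9", "e")]

conflicts = find_conflicts(cldr_rules + unicode_rules)
print(conflicts)  # {'ℌ': ['x', 'H']}
```

A check along these lines would flag the ℌ entry before the rules file is written, rather than relying on the "duplicate source strings" warning at load time.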


