Re: Pre-proposal: unicode normalized text - Mailing list pgsql-hackers

From Peter Eisentraut
Subject Re: Pre-proposal: unicode normalized text
Msg-id 205060b0-9e63-4025-93f8-c60ebae42aa7@eisentraut.org
In response to Re: Pre-proposal: unicode normalized text  (Jeff Davis <pgsql@j-davis.com>)
List pgsql-hackers
On 06.10.23 19:22, Jeff Davis wrote:
> On Fri, 2023-10-06 at 09:58 +0200, Peter Eisentraut wrote:
>> If you want to be rigid about it, you also need to consider whether
>> the Unicode version used by the ICU library in use matches the one
>> used by the in-core tables.
> What problem are you concerned about here? I thought about it and I
> didn't see an obvious issue.
> 
> If the ICU unicode version is ahead of the Postgres unicode version,
> and no unassigned code points are used according to the Postgres
> version, then there's no problem.
> 
> And in the other direction, there might be some code points that are
> assigned according to the postgres unicode version but unassigned
> according to the ICU version. But that would be tracked by the
> collation version as you pointed out earlier, so upgrading ICU would be
> like any other ICU upgrade (with the same risks). Right?

It might be alright in this particular combination of circumstances.
But in general, if we rely on these tables for correctness (e.g., to
check that a string is normalized before passing it to a function that
requires it to be normalized), we would need to consider a possible
Unicode version mismatch between the tables and ICU.  The correct fix
would then probably be to not use our own tables but to use some ICU
function to achieve the desired task.
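
To make the concern concrete, here is a rough sketch in Python rather than
in PostgreSQL C code.  It is only an illustration under stated assumptions:
Python's unicodedata module stands in for the in-core tables, the
consumer_version argument is a hypothetical stand-in for the Unicode
version reported by the ICU library in use, and the helper names are made
up.  It encodes the rule discussed above: a normalization check against
one set of tables can be relied on by a consumer only if the consumer's
Unicode version is not older than the tables and the string contains no
code points that are unassigned according to the tables.

```python
import unicodedata


def version_tuple(version: str) -> tuple:
    """Turn a version string such as '15.1.0' into a comparable tuple."""
    return tuple(int(part) for part in version.split("."))


def has_unassigned(s: str) -> bool:
    """True if any code point is unassigned (category Cn) in these tables."""
    return any(unicodedata.category(ch) == "Cn" for ch in s)


def normalized_and_trustworthy(s: str, consumer_version: str,
                               form: str = "NFC") -> bool:
    """Check normalization and whether the consumer can rely on the result.

    consumer_version is a hypothetical input: the Unicode version used by
    the library that will later receive the string (ICU in this thread).
    """
    table_version = unicodedata.unidata_version  # stand-in for in-core tables
    if version_tuple(consumer_version) < version_tuple(table_version):
        # Consumer is behind the tables: some code points may be assigned
        # here but unassigned there, so do not vouch for the result.
        return False
    if has_unassigned(s):
        # A newer consumer might assign these code points and give them
        # normalization behavior our tables know nothing about.
        return False
    return unicodedata.is_normalized(form, s)
```

For example, normalized_and_trustworthy("\u00e9", unicodedata.unidata_version)
is true, while the same call with the decomposed "e\u0301" is false.
Calling the consumer's own normalization check avoids the version
comparison altogether, which is the point of using an ICU function.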



