Re: Pre-proposal: unicode normalized text - Mailing list pgsql-hackers

From Jeff Davis
Subject Re: Pre-proposal: unicode normalized text
Date
Msg-id 5661a3b1cd8cf046d6b761c1bcf4eb82cb58397d.camel@j-davis.com
In response to Re: Pre-proposal: unicode normalized text  (Peter Eisentraut <peter@eisentraut.org>)
List pgsql-hackers
On Wed, 2023-10-11 at 08:56 +0200, Peter Eisentraut wrote:
> On 11.10.23 03:08, Jeff Davis wrote:
> >    * unicode_is_valid(text): returns true if all codepoints are
> > assigned, false otherwise
>
> We need to be careful about precise terminology.  "Valid" has a
> defined meaning for Unicode.  A byte sequence can be valid or not as
> UTF-8.  But a string containing unassigned code points is not
> not-"valid" as Unicode.

Agreed. Perhaps "unicode_assigned()" is better?
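
To illustrate the distinction, here is a rough Python sketch (not
Postgres code; the helper names is_valid_utf8 and all_assigned are made
up for illustration). It assumes "assigned" means "general category is
not Cn", which also excludes noncharacters, and the answer depends on
the Unicode version of the Python build:

```python
import unicodedata


def is_valid_utf8(data: bytes) -> bool:
    """True if the byte sequence decodes as well-formed UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def all_assigned(text: str) -> bool:
    """True if no code point has general category Cn (unassigned).

    Assumption: "assigned" is approximated as "category != Cn".
    """
    return all(unicodedata.category(ch) != "Cn" for ch in text)


# U+0378 is a reserved (unassigned) code point in the Greek block.
sample = "abc\u0378"
encoded = sample.encode("utf-8")
print(is_valid_utf8(encoded), all_assigned(sample))
```

The encoded string passes the UTF-8 check while failing the assignment
check, which is why "valid" is the wrong word for the latter.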

> >    * unicode_version(): version of Unicode Postgres is built with
> >    * icu_unicode_version(): version of Unicode ICU is built with
>
> This seems easy enough, but it's not clear what users would actually
> do with that.

Just there to make it visible. If it affects the semantics (which it
does currently for normalization) it seems wise to have some way to
access the version.
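
As an analogy only (Python's unicodedata module, not the proposed
Postgres functions), a library can expose the version of the Unicode
data it normalizes with, so a caller can record it next to the result.
The helper name normalize_with_version is hypothetical:

```python
import unicodedata


def normalize_with_version(text: str, form: str = "NFC") -> tuple[str, str]:
    """Return the normalized text together with the Unicode data version.

    The version string comes from the Python build, not from Postgres or ICU.
    """
    return unicodedata.normalize(form, text), unicodedata.unidata_version


# "e" followed by COMBINING ACUTE ACCENT composes to U+00E9 under NFC.
result, version = normalize_with_version("e\u0301")
print(repr(result), version)
```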

Regards,
    Jeff Davis



