Re: Pre-proposal: unicode normalized text - Mailing list pgsql-hackers

From	Jeff Davis
Subject	Re: Pre-proposal: unicode normalized text
Date	October 11, 2023 07:37:46
Msg-id	5661a3b1cd8cf046d6b761c1bcf4eb82cb58397d.camel@j-davis.com
In response to	Re: Pre-proposal: unicode normalized text (Peter Eisentraut <peter@eisentraut.org>)
List	pgsql-hackers

On Wed, 2023-10-11 at 08:56 +0200, Peter Eisentraut wrote:
> On 11.10.23 03:08, Jeff Davis wrote:
> >    * unicode_is_valid(text): returns true if all codepoints are
> > assigned, false otherwise
>
> We need to be careful about precise terminology.  "Valid" has a
> defined meaning for Unicode.  A byte sequence can be valid or not as
> UTF-8.  But a string containing unassigned code points is not
> not-"valid" as Unicode.

Agreed. Perhaps "unicode_assigned()" is better?
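
To make the distinction concrete, here is a minimal Python sketch (not
PostgreSQL code). It uses Python's unicodedata module as a stand-in for
Postgres's built-in Unicode tables; the helper names is_valid_utf8 and
all_assigned are hypothetical, chosen only for illustration. Which code
points count as assigned depends on the Unicode version the tables were
built from.

```python
import unicodedata


def is_valid_utf8(data: bytes) -> bool:
    """Byte-level check: is this a well-formed UTF-8 sequence?"""
    try:
        data.decode("utf-8")  # strict decoding rejects ill-formed input
    except UnicodeDecodeError:
        return False
    return True


def all_assigned(text: str) -> bool:
    """Code-point-level check: does every code point have an assignment?

    General category "Cn" marks code points with no assigned character
    (reserved code points and noncharacters) in the Unicode version that
    this Python build ships with.
    """
    return all(unicodedata.category(ch) != "Cn" for ch in text)


# U+0378 is assumed here to be unassigned; a future Unicode version
# could change that, which is exactly why the version matters.
sample = "\u0378"

print(is_valid_utf8(sample.encode("utf-8")))  # well-formed as UTF-8
print(all_assigned(sample))                   # but not an assigned code point
print(is_valid_utf8(b"\xff"))                 # ill-formed bytes
```

The two checks answer different questions: the first is about the byte
encoding, the second about the character repertoire, which is why a
name like "unicode_assigned()" describes the second more precisely than
"valid".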

> >    * unicode_version(): version of unicode Postgres is built with
> >    * icu_unicode_version(): version of Unicode ICU is built with
>
> This seems easy enough, but it's not clear what users would actually
> do with that.

Just there to make it visible. If it affects the semantics (which it
does currently for normalization) it seems wise to have some way to
access the version.
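
As a rough illustration of what "making it visible" could look like,
here is a small Python sketch. It is an analogy only: Python's
unicodedata.unidata_version plays the role of the proposed
unicode_version(), and normalize_with_version is a hypothetical helper
that reports which Unicode version produced a normalized result.

```python
import unicodedata


def unicode_version() -> str:
    """Hypothetical analogue of the proposed unicode_version():
    the Unicode version of the tables this Python build uses."""
    return unicodedata.unidata_version


def normalize_with_version(text: str, form: str = "NFC") -> tuple[str, str]:
    """Normalize text and report the Unicode version that was used.

    Recording the version next to the result lets a caller notice when
    data was normalized under different Unicode versions.
    """
    return unicodedata.normalize(form, text), unicode_version()


# "e" followed by U+0301 (combining acute accent) composes to U+00E9 in NFC.
normalized, version = normalize_with_version("e\u0301")
print(repr(normalized), version)
```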

Regards,
    Jeff Davis
