On Wed, 2023-10-11 at 08:56 +0200, Peter Eisentraut wrote:
> On 11.10.23 03:08, Jeff Davis wrote:
> > * unicode_is_valid(text): returns true if all codepoints are
> > assigned, false otherwise
>
> We need to be careful about precise terminology. "Valid" has a defined
> meaning for Unicode. A byte sequence can be valid or not as UTF-8. But
> a string containing unassigned code points is not not-"valid" as
> Unicode.

Agreed. Perhaps "unicode_assigned()" is better?
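
To make the distinction concrete, here is a minimal sketch in Python
rather than Postgres C. It is only an illustration, not the proposed
implementation: the helper names is_valid_utf8 and all_assigned are
hypothetical, and it treats general category Cn as "unassigned", which
also covers noncharacters. The answer depends on the Unicode version the
Python build ships with.

```python
import unicodedata


def is_valid_utf8(data: bytes) -> bool:
    """Hypothetical helper: True if the byte sequence is well-formed UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def all_assigned(text: str) -> bool:
    """Hypothetical helper: True if no code point has general category Cn.

    Cn means "unassigned" in the Unicode tables bundled with this Python
    build (noncharacters are reported as Cn as well).
    """
    return all(unicodedata.category(ch) != "Cn" for ch in text)
```

For example, U+0378 is currently unassigned, so "a\u0378" encodes to
perfectly valid UTF-8 while all_assigned() reports False for it.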

> > * unicode_version(): version of Unicode Postgres is built with
> > * icu_unicode_version(): version of Unicode ICU is built with
>
> This seems easy enough, but it's not clear what users would actually do
> with that.

It's just there to make the version visible. If it affects the semantics
(which it currently does for normalization), it seems wise to have some
way to access the version.
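
As an analogy only (Python, not Postgres): the standard library exposes
the Unicode version its character tables were built from as
unicodedata.unidata_version, so callers can at least see which version
governs the results. The helper unicode_version_tuple below is
hypothetical and just makes the string comparable.

```python
import unicodedata


def unicode_version_tuple() -> tuple:
    """Hypothetical helper: parse the bundled Unicode version into ints."""
    return tuple(int(part) for part in unicodedata.unidata_version.split("."))


# Print the Unicode version this Python build's tables come from.
print(unicodedata.unidata_version)
```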

Regards,
Jeff Davis