Re: Add standard collation UNICODE - Mailing list pgsql-hackers

From	Jeff Davis
Subject	Re: Add standard collation UNICODE
Date	March 4, 2023 21:29:54
Msg-id	a5fa78021c8684ef1fc4983a4c5102849c4d4c60.camel@j-davis.com
In response to	Add standard collation UNICODE (Peter Eisentraut <peter.eisentraut@enterprisedb.com>)
Responses	Re: Add standard collation UNICODE Re: Add standard collation UNICODE Re: Add standard collation UNICODE
List	pgsql-hackers

On Wed, 2023-03-01 at 11:09 +0100, Peter Eisentraut wrote:

> When collation support was added to PostgreSQL, we added UCS_BASIC,
> since that could easily be mapped to the C locale.

Sorting by codepoint should be encoding-independent (i.e. decode to
codepoint first); but the C collation is just strcmp, which is
encoding-dependent. So is UCS_BASIC wrong today?
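(To illustrate the concern, here is a minimal Python sketch, not PostgreSQL code. It assumes that sorting encoded byte strings is a fair stand-in for strcmp, and it uses ISO-8859-15 as one example of a non-UTF-8 encoding; the helper names are made up for this example. Bytewise order of UTF-8 matches code point order, but bytewise order in ISO-8859-15 does not, because the euro sign is stored as 0xA4 there.)

```python
# Sketch: compare bytewise order (roughly what strcmp sees) with
# code point order. Helper names are hypothetical, for illustration only.

def byte_order(strings, encoding):
    """Sort the way a bytewise comparison of the encoded strings would."""
    return sorted(strings, key=lambda s: s.encode(encoding))

def codepoint_order(strings):
    """Sort by Unicode code point, independent of any encoding."""
    return sorted(strings, key=lambda s: [ord(c) for c in s])

# U+20AC EURO SIGN and U+00FF LATIN SMALL LETTER Y WITH DIAERESIS
samples = ["\u20ac", "\u00ff"]

by_codepoint = codepoint_order(samples)        # U+00FF sorts before U+20AC
by_utf8 = byte_order(samples, "utf-8")         # C3 BF sorts before E2 82 AC
by_latin9 = byte_order(samples, "iso8859_15")  # A4 (euro) sorts before FF

print(by_utf8 == by_codepoint)    # same order as code points
print(by_latin9 == by_codepoint)  # different order
```

(So under these assumptions a strcmp-based comparison gives code point order only when the database encoding is UTF-8.)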

(Aside: I wonder whether we should differentiate between the libc
provider, which uses strcoll(), and the provider of non-localized
comparisons that just use strcmp(). That would be a better reflection
of what the code actually does.)

> With ICU support, we can provide the UNICODE collation, since it's
> just the root locale.

+1

> I suppose one hesitation was that ICU was not a standard feature,
> so this would create variations in the default catalog contents,
> or something like that.

It looks like the way you've handled this is by inserting the collation
with collprovider=icu even if built without ICU support. I think that's
a new case, so we need to make sure it throws reasonable user-facing
errors.

I do like your approach though because, if someone is using a standard
collation, I think "not built with ICU" (feature not supported) is a
better error than "collation doesn't exist". It also effectively
reserves the name "unicode".

--
Jeff Davis
PostgreSQL Contributor Team - AWS
