Re: Collation again here - Mailing list pgsql-general

From Adrian Klaver
Subject Re: Collation again here
Msg-id f93ff424-dd25-464c-a0af-8187817ff1f5@aklaver.com
In response to Re: Collation again here  (Rihad <grihad@gmail.com>)
List pgsql-general
On 1/8/26 05:18, Rihad wrote:
> On 1/8/26 4:48 PM, Dominique Devienne wrote:

> 
> Looking into the pg_collation system table, that collation has 
> collprovider="c". First I thought "c" meant libc, but this article states 
> that "c" means PG Internal provider, and libc would have been "l".
> 
> https://medium.com/@adarsh2801/understanding-collations-in-postgresql-648e4fa333e1
> 
>  1. PostgreSQL Internal Provider (‘c’): Introduced in Postgres 15.
>     This built-in collation support is System/OS agnostic.
>  2. System Library Provider (‘l’): Uses GNU C library and hence is
>     OS locale dependent.
>  3. ICU, International Components for Unicode (‘i’): Uses ICU
>     library for unicode-aware collation.

This is what the docs are for:

https://www.postgresql.org/docs/current/catalog-pg-collation.html

"collprovider char

Provider of the collation: d = database default, b = builtin, c = libc, 
i = icu
"

And

https://www.postgresql.org/docs/current/locale.html#LOCALE-PROVIDERS

"
23.1.4. Locale Providers

A locale provider specifies which library defines the locale behavior 
for collations and character classifications.

The commands and tools that select the locale settings, as described 
above, each have an option to select the locale provider. Here is an 
example to initialize a database cluster using the ICU provider:

initdb --locale-provider=icu --icu-locale=en

See the description of the respective commands and programs for details. 
Note that you can mix locale providers at different granularities, for 
example use libc by default for the cluster but have one database that 
uses the icu provider, and then have collation objects using either 
provider within those databases.

Regardless of the locale provider, the operating system is still used to 
provide some locale-aware behavior, such as messages (see lc_messages).

The available locale providers are listed below:

builtin

     The builtin provider uses built-in operations. Only the C, C.UTF-8, 
and PG_UNICODE_FAST locales are supported for this provider.

     The C locale behavior is identical to the C locale in the libc 
provider. When using this locale, the behavior may depend on the 
database encoding.

     The C.UTF-8 locale is available only when the database encoding 
is UTF-8, and the behavior is based on Unicode. The collation uses the 
code point values only. The regular expression character classes are 
based on the "POSIX Compatible" semantics, and the case mapping is the 
"simple" variant.

     The PG_UNICODE_FAST locale is available only when the database 
encoding is UTF-8, and the behavior is based on Unicode. The collation 
uses the code point values only. The regular expression character 
classes are based on the "Standard" semantics, and the case mapping is 
the "full" variant.

icu

     The icu provider uses the external ICU library. PostgreSQL must 
have been configured with support.

     ICU provides collation and character classification behavior that 
is independent of the operating system and database encoding, which is 
preferable if you expect to transition to other platforms without any 
change in results. LC_COLLATE and LC_CTYPE can be set independently of 
the ICU locale.

     Note: For the ICU provider, results may depend on the version of the 
ICU library used, as it is updated to reflect changes in natural language 
over time.

libc

     The libc provider uses the operating system's C library. The 
collation and character classification behavior is controlled by the 
settings LC_COLLATE and LC_CTYPE, so they cannot be set independently.

     Note: The same locale name may have different behavior on different 
platforms when using the libc provider.

"

Rather than some made-up gibberish.
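
To check which providers the collations in a cluster actually use, the documented codes can be decoded mechanically. What follows is only a rough Python sketch: the dictionary mirrors the collprovider values quoted from the pg_collation docs above, while the names COLLPROVIDER_NAMES, PROVIDER_QUERY and describe_provider are made up for illustration, and the query is just one possible way to summarize the catalog (run it in psql or any client).

```python
# Provider codes as listed in the pg_collation catalog documentation
# quoted above: d, b, c, i. Names below are illustrative, not an official API.
COLLPROVIDER_NAMES = {
    "d": "database default",
    "b": "builtin",
    "c": "libc",
    "i": "icu",
}

# One possible query to count collations per provider code.
PROVIDER_QUERY = """
SELECT collprovider, count(*)
FROM pg_collation
GROUP BY collprovider
ORDER BY collprovider;
"""


def describe_provider(code: str) -> str:
    """Return the documented provider name for a collprovider code."""
    return COLLPROVIDER_NAMES.get(code, f"unknown ({code!r})")


if __name__ == "__main__":
    # "l" is the code claimed by the article; it is not a documented value.
    for code in ("c", "i", "l"):
        print(code, "->", describe_provider(code))
```

With the codes from the docs, "c" resolves to libc and "i" to icu, while the article's "l" falls through to the unknown branch because it is not among the documented values.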

> 
> 
> We only have "i" & "c" in pg_collation. And we aren't using any of "i" 
> it seems. All this locale/encoding/collate stuff is too much for me to 
> handle, sorry)
> 
> So if we are using the internal (builtin) "c" provider how come the PG 
> 18.1 run on FreeBSD 13.5 version shows warnings that the system version 
> is 34.0?
> 
> The article must be wrong I guess.
> 
> Then upgrading 13.5 to 14.3 is our only option.
> 


-- 
Adrian Klaver
adrian.klaver@aklaver.com


