Laurenz Albe wrote:
> > So if you target Postgres 17+, C.UTF-8 from the builtin provider is
> > a better choice for UTF-8 databases than "C".
>
> Yes, "builtin" and the "C" collation is the best default value.

But my point was that, no, it's not: for UTF-8 databases, "C.UTF-8" is
the better default.

Here is a concrete example with Postgres 18:

postgres=# create database dbc
  template='template0'
  locale_provider='builtin'
  builtin_locale='C';
CREATE DATABASE
postgres=# \c dbc
You are now connected to database "dbc" as user "postgres".
dbc=# select upper('été');
 upper
-------
 éTé
(1 row)

This is not the correct uppercasing: with the "C" locale, only ASCII
letters are case-mapped, so "é" is left unchanged. The "C.UTF-8" locale,
on the other hand, produces the correct result:

postgres=# create database dbcutf8
  template='template0'
  locale_provider='builtin'
  builtin_locale='C.UTF-8';
CREATE DATABASE
postgres=# \c dbcutf8
You are now connected to database "dbcutf8" as user "postgres".
dbcutf8=# select upper('été');
 upper
-------
 ÉTÉ
(1 row)
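
The same comparison can probably be made inside a single database by
attaching a collation to each expression. This is only a minimal sketch:
it assumes a UTF-8 database on Postgres 17 or later, where the predefined
"pg_c_utf8" collation (builtin provider, C.UTF-8 locale) should be
available. The column aliases are made up for illustration.

```sql
-- Assumption: UTF-8 database, Postgres 17+, so that the predefined
-- pg_c_utf8 collation exists.
select upper('été' collate "C")       as upper_c,       -- ASCII-only case mapping
       upper('été' collate pg_c_utf8) as upper_c_utf8;  -- Unicode case mapping
```

The first column is expected to match the "dbc" result above and the
second the "dbcutf8" result.
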
Best regards,
--
Daniel Vérité
https://postgresql.verite.pro/