Jeff Davis wrote:
> If we special case locale=C, but do nothing for locale=fr_FR, then I'm
> not sure we've solved the problem. Andrew Gierth raised the issue here,
> which he called "maximally confusing":
>
> https://postgr.es/m/874jp9f5jo.fsf@news-spur.riddles.org.uk
>
> That's why I feel that we need to make locale apply to whatever the
> provider is, not just when it happens to be C.
While I agree that the LOCALE option in CREATE DATABASE is
counter-intuitive, I find it questionable that blending ICU
and libc locales into it helps that much with the user experience.
Trying the latest v6-* patches applied on top of 722541ead1
(before the pgindent run), here are a few examples where I
don't think it goes well.
The OS is Ubuntu 22.04 (glibc 2.35, ICU 70.1).
initdb:
Using default ICU locale "fr".
Using language tag "fr" for ICU locale "fr".
The database cluster will be initialized with this locale configuration:
provider: icu
ICU locale: fr
LC_COLLATE: fr_FR.UTF-8
LC_CTYPE: fr_FR.UTF-8
LC_MESSAGES: fr_FR.UTF-8
LC_MONETARY: fr_FR.UTF-8
LC_NUMERIC: fr_FR.UTF-8
LC_TIME: fr_FR.UTF-8
The default database encoding has accordingly been set to "UTF8".
#1
postgres=# create database test1 locale='fr_FR.UTF-8';
NOTICE: using standard form "fr-FR" for ICU locale "fr_FR.UTF-8"
ERROR: new ICU locale (fr-FR) is incompatible with the ICU locale of the
template database (fr)
HINT: Use the same ICU locale as in the template database, or use template0
as template.
That looks like a fairly generic case that doesn't work seamlessly.
#2
postgres=# create database test2 locale='C.UTF-8' template='template0';
NOTICE: using standard form "en-US-u-va-posix" for ICU locale "C.UTF-8"
CREATE DATABASE
en-US-u-va-posix does not sort like C.UTF-8 in glibc 2.35, so
this interpretation is arguably not what a user would expect.
I would expect the ICU warning or error (icu_validation_level) to kick
in instead of that silent conversion.
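
One way to see the difference in case #2 is to rank the same strings under
both orderings. This is only a sketch: the collation name posix_icu and the
sample strings are made up for illustration, and the built-in "C" collation
is used as a stand-in for glibc's C.UTF-8 on the assumption that both compare
UTF-8 text in code point order. I am not asserting which rows will differ.

```sql
-- Hypothetical collation name; the locale string is the one from the NOTICE.
CREATE COLLATION posix_icu (provider = icu, locale = 'en-US-u-va-posix');

-- Arbitrary sample strings mixing ASCII and non-ASCII letters.
-- Rows where the two ranks differ are where the orderings diverge.
SELECT s,
       rank() OVER (ORDER BY s COLLATE "C")       AS c_rank,
       rank() OVER (ORDER BY s COLLATE posix_icu) AS icu_rank
FROM (VALUES ('a'), ('B'), ('Z'), ('z'), ('é'), ('É')) AS t(s)
ORDER BY c_rank;
```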
#3
$ grep french /etc/locale.alias
french fr_FR.ISO-8859-1
postgres=# create database test3 locale='french' template='template0'
encoding='LATIN1';
WARNING: ICU locale "french" has unknown language "french"
HINT: To disable ICU locale validation, set parameter icu_validation_level
to DISABLED.
CREATE DATABASE
In practice we're probably getting the "und" ICU locale whereas "fr" would
be appropriate.
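
To check what was actually stored for the databases created above, a catalog
query along these lines should do. It assumes the pg_database column names of
the development tree these patches apply to (daticulocale in particular, which
may be named differently in other versions).

```sql
-- Show the provider and the locale settings recorded for the test databases.
-- datlocprovider is 'i' for ICU and 'c' for libc.
SELECT datname, datlocprovider, daticulocale, datcollate, datctype
FROM pg_database
WHERE datname IN ('test2', 'test3')
ORDER BY datname;
```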
I assume that we would find more cases like that if testing on many
operating systems.
Best regards,
--
Daniel Vérité
https://postgresql.verite.pro/
Twitter: @DanielVerite