Thread: LOCALE C.UTF-8 on EDB Windows v17 server

LOCALE C.UTF-8 on EDB Windows v17 server

From
Dominique Devienne
Date:
Hi.

For a long time, I was just doing a plain CREATE DATABASE, using
whatever local template1 was using.

Then when v17 came out, I wanted to force UTF-8 for the encoding,
*AND* use the new C.UTF-8 builtin locale, to not depend on libc on
Linux. (which forced me to use template0 explicitly). And since it was
builtin, I just assumed Windows would be fine too. But it's not
apparently not...

I'm using the official v17.5 EDB installer, for the Windows server.

The command I'm using (from a libpq trace) is:

create database "dd_v168" encoding 'UTF8' locale 'C.UTF-8'
locale_provider 'builtin' template template0

On Windows, I'm getting

2025-06-04 14:07:41.227419 B 155 ErrorResponse S "ERROR" V "ERROR" C
"42809" M "invalid LC_COLLATE locale name: "C.UTF-8"" H "If the locale
name is specific to ICU, use ICU_LOCALE." F "dbcommands.c" L "1057" R
"createdb" \x00

Given that v17.5 Windows returns

postgres=# select * from pg_collation where collname ilike '%utf%';
 oid | collname  | collnamespace | collowner | collprovider |
collisdeterministic | collencoding | collcollate | collctype |
colllocale | collicurules | collversion

-----+-----------+---------------+-----------+--------------+---------------------+--------------+-------------+-----------+------------+--------------+-------------
 811 | pg_c_utf8 |            11 |        10 | b            | t
           |            6 |             |           | C.UTF-8    |
         | 1
(1 row)

While Linux returns:

ddevienne=> select * from pg_collation where collname ilike '%utf%';
  oid  |    collname     | collnamespace | collowner | collprovider |
collisdeterministic | collencoding |   collcollate   |    collctype
| colllocale | collicurules | collversion

-------+-----------------+---------------+-----------+--------------+---------------------+--------------+-----------------+-----------------+------------+--------------+-------------
   811 | pg_c_utf8       |            11 |        10 | b            |
t                   |            6 |                 |
| C.UTF-8    |              | 1
 12351 | C.utf8          |            11 |        10 | c            |
t                   |            6 | C.utf8          | C.utf8
|            |              |
 12354 | en_AU.utf8      | ...
(17 rows)

I tried instead

create database "dd_v168" encoding 'UTF8' locale 'pg_c_utf8'
locale_provider 'builtin' template template0

but that's still KO with

2025-06-04 14:10:58.748508 B 157 ErrorResponse S "ERROR" V "ERROR" C
"42809" M "invalid LC_COLLATE locale name: "pg_c_utf8"" H "If the
locale name is specific to ICU, use ICU_LOCALE." F "dbcommands.c" L
"1057" R "createdb" \x00

First, given the above, I'm no longer sure what I was doing on Linux
was actually using the new built-in collation.

Second, I'm not confused between locales and collations!

What the command supposed to work portably with v17 on both Linux and
Windows server, that uses the new built-in collations (and locale???)

I'd appreciate some clarifications please. --DD



Re: LOCALE C.UTF-8 on EDB Windows v17 server

From
Laurenz Albe
Date:
On Wed, 2025-06-04 at 14:23 +0200, Dominique Devienne wrote:
> The command I'm using (from a libpq trace) is:
>
> create database "dd_v168" encoding 'UTF8' locale 'C.UTF-8'
> locale_provider 'builtin' template template0
>
> On Windows, I'm getting
>
> 2025-06-04 14:07:41.227419 B 155 ErrorResponse S "ERROR" V "ERROR" C
> "42809" M "invalid LC_COLLATE locale name: "C.UTF-8"" H "If the locale
> name is specific to ICU, use ICU_LOCALE." F "dbcommands.c" L "1057" R
> "createdb" \x00

Pilot error.  If you use "LOCALE_PROVIDER builtin", you have to specify
BUILTIN LOCALE too:

CREATE DATABASE b
   TEMPLATE template0
   LOCALE_PROVIDER builtin
   BUILTIN_LOCALE 'C.UTF-8'
   /* used for aspects other than collation and character type */
   LOCALE 'C';

Yours,
Laurenz Albe



Re: LOCALE C.UTF-8 on EDB Windows v17 server

From
Jeff Davis
Date:
On Wed, 2025-06-04 at 14:23 +0200, Dominique Devienne wrote:
> create database "dd_v168" encoding 'UTF8' locale 'C.UTF-8'
> locale_provider 'builtin' template template0

I believe what you want is:

  create database "dd_v168" encoding 'UTF8' builtin_locale 'C.UTF-8'
  locale_provider 'builtin' template template0

notice "builtin_locale" instead of "locale".

CREATE DATABASE options are confusing due to some historical reasons.
LOCALE gets expanded (effectively) into LC_COLLATE and LC_CTYPE, but
the ICU and builtin providers don't use those for collation. Therefore,
we have ICU_LOCALE and BUILTIN_LOCALE to specify the locales for those
providers. Sorry -- I know that's confusing, but I don't know a good
way to fix it without breaking previous commands.

> On Windows, I'm getting

I tried on windows and linux and got the same result.

> I tried instead
>
> create database "dd_v168" encoding 'UTF8' locale 'pg_c_utf8'
> locale_provider 'builtin' template template0

The previous name was correct. "pg_c_utf8" is the identifier for the
collation object in pg_collation.

> Second, I'm not confused between locales and collations!

Locales are the names as the provider understands them; the collation
is an object in pg_collation that points to that locale (along with
metadata). For instance, there's a collation named "en_US" in
pg_collation, which points to the libc provider and locale name
'en_US.utf8'. There's also a collation named "en-US-x-icu" in
pg_collation which points to the ICU provider and locale name 'en-US'.

In other words, the collation objects are a layer of indirection to
what the provider actually understands.

Regards,
    Jeff Davis




Re: LOCALE C.UTF-8 on EDB Windows v17 server

From
Dominique Devienne
Date:
On Thu, Jun 5, 2025 at 3:01 AM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
> On Wed, 2025-06-04 at 14:23 +0200, Dominique Devienne wrote:
> > The command I'm using (from a libpq trace) is:
> >
> > create database "dd_v168" encoding 'UTF8' locale 'C.UTF-8'
> > locale_provider 'builtin' template template0
> >
> > On Windows, I'm getting
> >
> > 2025-06-04 14:07:41.227419 B 155 ErrorResponse S "ERROR" V "ERROR" C
> > "42809" M "invalid LC_COLLATE locale name: "C.UTF-8"" H "If the locale
> > name is specific to ICU, use ICU_LOCALE." F "dbcommands.c" L "1057" R
> > "createdb" \x00
>
> Pilot error.  If you use "LOCALE_PROVIDER builtin", you have to specify
> BUILTIN LOCALE too:
>
> CREATE DATABASE b
>    TEMPLATE template0
>    LOCALE_PROVIDER builtin
>    BUILTIN_LOCALE 'C.UTF-8'
>    /* used for aspects other than collation and character type */
>    LOCALE 'C';

Thanks Laurenz. Indeed, Using LOCALE vs BUILTIN_LOCALE matters.

On Linux, no error unlike on Windows (still inconsistent there IMHO),
but the result is slightly different for datcollate and datctype (C vs en_US),
while the same for datlocprovider and datlocale, what I looked at.

Thus I kinda persist that there *is* a portability issue here.

Also, note what the doc says:

If locale_provider is builtin, then locale or builtin_locale must be
specified and set to either C or C.UTF-8.

It clearly says "locale or builtin_locale", emphasis on the OR.

So two issues here.
1) the doc is wrong or misleading on this.
2) the same command works on Linux, but not Windows.

FWIW. --DD

C:\Users\ddevienne>psql service=pau17
psql (17.4, server 17.5)
ddevienne=> select version();
                                                 version
---------------------------------------------------------------------------------------------------------
 PostgreSQL 17.5 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.5.0
20210514 (Red Hat 8.5.0-26), 64-bit
(1 row)

ddevienne=> create database "dd_v168" encoding 'UTF8' locale 'C.UTF-8'
ddevienne-> locale_provider 'builtin' template template0;
CREATE DATABASE
ddevienne=> select datlocprovider, datlocale, datcollate, datctype
from pg_database where datname = 'dd_v168';
 datlocprovider | datlocale | datcollate | datctype
----------------+-----------+------------+----------
 b              | C.UTF-8   | C.UTF-8    | C.UTF-8
(1 row)

ddevienne=> create database "dd_v168b" encoding 'UTF8' builtin_locale 'C.UTF-8'
ddevienne-> locale_provider 'builtin' template template0;
CREATE DATABASE
ddevienne=> select datlocprovider, datlocale, datcollate, datctype
from pg_database where datname = 'dd_v168b';
 datlocprovider | datlocale | datcollate  |  datctype
----------------+-----------+-------------+-------------
 b              | C.UTF-8   | en_US.UTF-8 | en_US.UTF-8
(1 row)



Re: LOCALE C.UTF-8 on EDB Windows v17 server

From
Dominique Devienne
Date:
On Thu, Jun 5, 2025 at 4:52 AM Jeff Davis <pgsql@j-davis.com> wrote:
> > On Windows, I'm getting
> I tried on windows and linux and got the same result.

Thanks Jeff, for your informative email. The command work,
just like it does on Linux. YET... The results are NOT the same!

C:\Users\ddevienne>psql service=my17
psql (17.4, server 17.5)

ddevienne=> select version();
                                 version
-------------------------------------------------------------------------
 PostgreSQL 17.5 on x86_64-windows, compiled by msvc-19.43.34808, 64-bit
(1 row)

ddevienne=> create database "dd_v168" encoding 'UTF8' locale 'C.UTF-8'
ddevienne-> locale_provider 'builtin' template template0;
ERROR:  invalid LC_COLLATE locale name: "C.UTF-8"
HINT:  If the locale name is specific to ICU, use ICU_LOCALE.

ddevienne=> create database "dd_v168" encoding 'UTF8' builtin_locale 'C.UTF-8'
ddevienne-> locale_provider 'builtin' template template0;
CREATE DATABASE
ddevienne=> select datlocprovider, datlocale, datcollate, datctype
from pg_database where datname = 'dd_v168';
 datlocprovider | datlocale | datcollate | datctype
----------------+-----------+------------+----------
 b              | C.UTF-8   | C          | C
(1 row)

Contrast that with the results on Linux, I just sent in response to
Laurenz's message, and that I'll repeat below:

ddevienne=> select version();
                                                 version
---------------------------------------------------------------------------------------------------------
 PostgreSQL 17.5 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.5.0
20210514 (Red Hat 8.5.0-26), 64-bit
(1 row)
ddevienne=> create database "dd_v168b" encoding 'UTF8' builtin_locale 'C.UTF-8'
ddevienne-> locale_provider 'builtin' template template0;
CREATE DATABASE
ddevienne=> select datlocprovider, datlocale, datcollate, datctype
from pg_database where datname = 'dd_v168b';
 datlocprovider | datlocale | datcollate  |  datctype
----------------+-----------+-------------+-------------
 b              | C.UTF-8   | en_US.UTF-8 | en_US.UTF-8
(1 row)

Which means the same commands sometimes work, sometimes not, across platforms.
But when they work, they don't even yield the SAME results. Which
means it's not portable IMHO...

So... It is possible to have the SAME command on Windows and Linux,
which yields the SAME datcollate and datctype values???
So far, such a command eludes me, I'm afraid. --DD



Re: LOCALE C.UTF-8 on EDB Windows v17 server

From
Dominique Devienne
Date:
On Thu, Jun 5, 2025 at 11:07 AM Dominique Devienne <ddevienne@gmail.com> wrote:
> So... It is possible to have the SAME command on Windows and Linux,
> which yields the SAME datcollate and datctype values???
> So far, such a command eludes me, I'm afraid. --DD

So I tried to be explicit about lc_collate and lc_ctype too.
OK on Linux, KO on Windows...

Windows:
ddevienne=> create database "dd_v168b" encoding 'UTF8' builtin_locale
'C.UTF-8' lc_collate 'C.UTF-8' lc_ctype 'C.UTF-8'
ddevienne-> locale_provider 'builtin' template template0;
ERROR:  invalid LC_COLLATE locale name: "C.UTF-8"
HINT:  If the locale name is specific to ICU, use ICU_LOCALE.

Linux:
ddevienne=> create database "dd_v168c" encoding 'UTF8' builtin_locale
'C.UTF-8' lc_collate 'C.UTF-8' lc_ctype 'C.UTF-8'
ddevienne-> locale_provider 'builtin' template template0;
CREATE DATABASE
ddevienne=> select datlocprovider, datlocale, datcollate, datctype
from pg_database where datname = 'dd_v168c';
 datlocprovider | datlocale | datcollate | datctype
----------------+-----------+------------+----------
 b              | C.UTF-8   | C.UTF-8    | C.UTF-8
(1 row)

AFAIK, C and C.UTF-8 are NOT the same thing, for collations.
It is indeed super confusing Jeff. I'm lost.

How can I get a

datlocprovider | datlocale | datcollate | datctype
----------------+-----------+------------+----------
 b              | C.UTF-8   | C.UTF-8    | C.UTF-8

database on Windows *AND* Linux?
If not possible using the same SQL (but why...), using what SQL?



Re: LOCALE C.UTF-8 on EDB Windows v17 server

From
Laurenz Albe
Date:
On Thu, 2025-06-05 at 10:53 +0200, Dominique Devienne wrote:
> Thanks Laurenz. Indeed, Using LOCALE vs BUILTIN_LOCALE matters.
>
> On Linux, no error unlike on Windows (still inconsistent there IMHO),
> but the result is slightly different for datcollate and datctype (C vs en_US),
> while the same for datlocprovider and datlocale, what I looked at.
>
> Thus I kinda persist that there *is* a portability issue here.

Perhaps, if omitting BUILTIN_LOCALE actually fails on Windows
(I cannot test it, no Windows nearby).

> Also, note what the doc says:
>
> If locale_provider is builtin, then locale or builtin_locale must be
> specified and set to either C or C.UTF-8.
>
> It clearly says "locale or builtin_locale", emphasis on the OR.

You are right, and that's how it works on Linux.
BUILTIN_LOCALE is not required.

> So two issues here.
> 1) the doc is wrong or misleading on this.

Perhaps the problem is in the implementation, not the documentation.

> 2) the same command works on Linux, but not Windows.

Unfortunately I am not in a position to get to the bottom of that.
In principle, it is acceptable for commands to fail on Windows and work
elsewhere, if operating system things like collations are involved.
But I agree that the "builtin" locale provider should work the same everywhere.

Yours,
Laurenz Albe



Re: LOCALE C.UTF-8 on EDB Windows v17 server

From
Dominique Devienne
Date:
On Thu, Jun 5, 2025 at 1:40 PM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
> On Thu, 2025-06-05 at 10:53 +0200, Dominique Devienne wrote:
> > It clearly says "locale or builtin_locale", emphasis on the OR.
>
> You are right, and that's how it works on Linux.
> BUILTIN_LOCALE is not required.

Still, required or not, they yield different results
(for datcollate and datctype), which is hardly expected.



Re: LOCALE C.UTF-8 on EDB Windows v17 server

From
"Daniel Verite"
Date:
    Dominique Devienne wrote:

> On Linux, no error unlike on Windows (still inconsistent there IMHO),
> but the result is slightly different for datcollate and datctype (C vs
> en_US),
> while the same for datlocprovider and datlocale, what I looked at.
>
> Thus I kinda persist that there *is* a portability issue here.

"datcollate" and "datctype" refer to operating system locale names.

 locale 'C.UTF-8' or lc_collate 'C.UTF-8' lc_ctype 'C.UTF-8'
cannot work on Windows because Windows does not have a locale
named C.UTF-8, whereas a Linux system does (well at least recent
Linuxes. Some old Linuxes don't).

What you are seeing is the effect of OS locales not being portable
across systems. That's confusing but not a Postgres bug.


Best regards,
--
Daniel Vérité
https://postgresql.verite.pro/



Re: LOCALE C.UTF-8 on EDB Windows v17 server

From
Dominique Devienne
Date:
On Thu, Jun 5, 2025 at 2:40 PM Daniel Verite <daniel@manitou-mail.org> wrote:
>         Dominique Devienne wrote:
> > On Linux, no error unlike on Windows (still inconsistent there IMHO),
> > but the result is slightly different for datcollate and datctype (C vs
> > en_US),
> > while the same for datlocprovider and datlocale, what I looked at.
> >
> > Thus I kinda persist that there *is* a portability issue here.
>
> "datcollate" and "datctype" refer to operating system locale names.
>
>  locale 'C.UTF-8' or lc_collate 'C.UTF-8' lc_ctype 'C.UTF-8'
> cannot work on Windows because Windows does not have a locale
> named C.UTF-8, whereas a Linux system does (well at least recent
> Linuxes. Some old Linuxes don't).

But isn't the point of the new-in-v17 builtin provider is to be system
independent???

> What you are seeing is the effect of OS locales not being portable
> across systems. That's confusing but not a Postgres bug.

Thus builtin SHOULD be portable IMHO. --DD



Re: LOCALE C.UTF-8 on EDB Windows v17 server

From
"Daniel Verite"
Date:
    Dominique Devienne wrote:

> >  locale 'C.UTF-8' or lc_collate 'C.UTF-8' lc_ctype 'C.UTF-8'
> > cannot work on Windows because Windows does not have a locale
> > named C.UTF-8, whereas a Linux system does (well at least recent
> > Linuxes. Some old Linuxes don't).
>
> But isn't the point of the new-in-v17 builtin provider is to be system
> independent???

Yes, definitely.

But suppose your database has an extension that calls local-dependent
code, such as strxfrm() [1] for instance.

The linked MSVC doc says:

 "The transformation is made using the locale's LC_COLLATE category
 setting. For more information on LC_COLLATE, see setlocale. strxfrm
 uses the current locale for its locale-dependent behavior"

But what will be the value in LC_COLLATE when this extension code
is running in a database using the builtin provider?
It's the value found in pg_database.datcollate that was specified
when creating the database with the lc_collate or locale option.

The builtin provider routines are used for code inside Postgres
core, but code outside its perimeter can still call libc functions
that depend on lc_collate and lc_ctype.


[1]
https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/strxfrm-wcsxfrm-strxfrm-l-wcsxfrm-l?view=msvc-170

Best regards,
--
Daniel Vérité
https://postgresql.verite.pro/



Re: LOCALE C.UTF-8 on EDB Windows v17 server

From
Dominique Devienne
Date:
On Thu, Jun 5, 2025 at 5:01 PM Daniel Verite <daniel@manitou-mail.org> wrote:
>         Dominique Devienne wrote:
> > >  locale 'C.UTF-8' or lc_collate 'C.UTF-8' lc_ctype 'C.UTF-8'
> > > cannot work on Windows because Windows does not have a locale
> > > named C.UTF-8, whereas a Linux system does (well at least recent
> > > Linuxes. Some old Linuxes don't).
> >
> > But isn't the point of the new-in-v17 builtin provider is to be system
> > independent???
>
> Yes, definitely.
>
> But suppose your database has an extension that calls local-dependent
> code, such as strxfrm() [1] for instance.
>
> The linked MSVC doc says:
>
>  "The transformation is made using the locale's LC_COLLATE category
>  setting. For more information on LC_COLLATE, see setlocale. strxfrm
>  uses the current locale for its locale-dependent behavior"
>
> But what will be the value in LC_COLLATE when this extension code
> is running in a database using the builtin provider?
> It's the value found in pg_database.datcollate that was specified
> when creating the database with the lc_collate or locale option.
>
> The builtin provider routines are used for code inside Postgres
> core, but code outside its perimeter can still call libc functions
> that depend on lc_collate and lc_ctype.

So you're saying datcollate and datctype from pg_database are
irrelevant to PostgreSQL itself, and only extensions might be affects?

Which implies that I shouldn't worry about those differences?
Am I reading you right? --DD



Re: LOCALE C.UTF-8 on EDB Windows v17 server

From
"Daniel Verite"
Date:
    Dominique Devienne wrote:

> So you're saying datcollate and datctype from pg_database are
> irrelevant to PostgreSQL itself, and only extensions might be affects?

Almost. An exception that still exists in v18, as far as I can see [1],
is the default full text search parser still using libc functions like
iswdigit(), iswpunct(), iswspace()... that depend on LC_CTYPE.

So you could see differences between OSes in tsvector contents
in a database with the builtin provider.
Unless using LC_CTYPE=C. But then the parsing is suboptimal, since the
parser does not recognize Unicode fancy punctuation signs or spaces as
such.
Personally I would still care to set LC_CTYPE to a reasonable UTF-8 locale
with v17 or v18.

[1]
https://doxygen.postgresql.org/wparser__def_8c.html#a420ea398a8a11db92412a2af7bf45e40

Best regards,
--
Daniel Vérité
https://postgresql.verite.pro/



Re: LOCALE C.UTF-8 on EDB Windows v17 server

From
Jeff Davis
Date:
On Thu, 2025-06-05 at 11:34 +0200, Dominique Devienne wrote:
> On Thu, Jun 5, 2025 at 11:07 AM Dominique Devienne
> <ddevienne@gmail.com> wrote:
> > So... It is possible to have the SAME command on Windows and Linux,
> > which yields the SAME datcollate and datctype values???
> > So far, such a command eludes me, I'm afraid. --DD
>
> So I tried to be explicit about lc_collate and lc_ctype too.
> OK on Linux, KO on Windows...

LC_COLLATE and LC_CTYPE (datcollate and datctype) are platform-
dependent (handled by libc) so it won't be possible for those to have
the same meaning across different systems. The only locale that's
guaranteed to be available, AFAIK, is "C". Even some other unix-like
operating systems don't support "C.UTF-8".

The good news is that LC_COLLATE and LC_CTYPE don't have much effect if
you are using a different provider like "builtin" or ICU. They affect
the server's LC_COLLATE and LC_CTYPE environment, which does have a few
effects, but in general we are trying to avoid cases where those matter
a lot.

To maintain consistency across platforms, use LC_COLLATE=C and
LC_CTYPE=C when creating a database; along with the builtin provider
and UTF-8 as you are already doing.

Regards,
    Jeff Davis




Re: LOCALE C.UTF-8 on EDB Windows v17 server

From
Jeff Davis
Date:
On Thu, 2025-06-05 at 15:07 +0200, Dominique Devienne wrote:
> But isn't the point of the new-in-v17 builtin provider is to be
> system
> independent???

Yes, a major part of the builtin provider is complete consistency
across platforms for the entire collation system -- anything affected
by the database default collation or a COLLATE clause, including
comparisons, casing behavior, pattern matching, etc. New major
versions of Postgres may update Unicode, but those updates will never
affect comparisons in the builtin C.UTF-8 locale; and will only affect
other behaviors (like casing) subject to the (rather strict) Unicode
stability policy[1].

Regarding datcollate and datctype: those affect the LC_COLLATE and
LC_CTYPE environment variables, and Postgres does a setlocale() upon a
new database connection. That only affects libc functions like
strcoll(), so it won't affect the builtin provider or ICU which don't
use strcoll().

You're right to ask why those matter at all, then. It's hard for me to
guarantee that datcollate/datctype won't affect some other part of the
system or an extension (I see that Daniel offered some more details).
I'd like to force LC_COLLATE=C and LC_CTYPE=C, and then there'd be no
question, but I won't promise when that will happen. I'd suggest just
forcing those to "C" in your database.

Regards,
    Jeff Davis

[1] https://www.unicode.org/policies/stability_policy.html



Re: LOCALE C.UTF-8 on EDB Windows v17 server

From
Jeff Davis
Date:
On Thu, 2025-06-05 at 10:53 +0200, Dominique Devienne wrote:
> If locale_provider is builtin, then locale or builtin_locale must be
> specified and set to either C or C.UTF-8.
>
> It clearly says "locale or builtin_locale", emphasis on the OR.
>
> So two issues here.
> 1) the doc is wrong or misleading on this.

The code in dbcommands.c:createdb():

    if (localeEl && localeEl->arg)
    {
        dbcollate = defGetString(localeEl);
        dbctype = defGetString(localeEl);
        dblocale = defGetString(localeEl);
    }
    if (builtinlocaleEl && builtinlocaleEl->arg)
        dblocale = defGetString(builtinlocaleEl);
    if (collateEl && collateEl->arg)
        dbcollate = defGetString(collateEl);
    if (ctypeEl && ctypeEl->arg)
        dbctype = defGetString(ctypeEl);
    if (iculocaleEl && iculocaleEl->arg)
        dblocale = defGetString(iculocaleEl);

So LC_COLLATE, LC_CTYPE, and BUILTIN_LOCALE all fall back to LOCALE if
they aren't set. On windows, it fails because LC_COLLATE and LC_CTYPE
fall back to LOCALE, which is "C.UTF-8", which doesn't exist.

(I know the CREATE DATABASE command is confusing, but it grew
historically and we needed to support previously-working commands.)

If you have specific doc suggestions to clarify this, please let me
know.

> 2) the same command works on Linux, but not Windows.

As long as we accept the libc provider, or allow the user to set
LC_COLLATE/LC_CTYPE, then there will be some commands that succeed on
some platforms and fail on others. Even with ICU, there may be versions
that accept a locale and versions that don't.

I'd like to make libc less "special" so that users who don't want to
use it aren't confronted with errors about things that don't matter to
them. I welcome suggestions that can move us closer to that goal
without breaking previously-working commands.

Regards,
    Jeff Davis