Re: Windows default locale vs initdb - Mailing list pgsql-hackers
From | Pavel Stehule |
---|---|
Subject | Re: Windows default locale vs initdb |
Date | |
Msg-id | CAFj8pRAe0P5amXJomBpGma8b4s0iaV=RKmHkSNwg42QLvmJKPg@mail.gmail.com |
In response to | Re: Windows default locale vs initdb (Andrew Dunstan <andrew@dunslane.net>) |
List | pgsql-hackers |
On Mon, 19 Apr 2021 at 12:52, Andrew Dunstan <andrew@dunslane.net> wrote:
On Mon, Apr 19, 2021 at 4:53 AM Pavel Stehule <pavel.stehule@gmail.com> wrote:
On Mon, 19 Apr 2021 at 7:43, Thomas Munro <thomas.munro@gmail.com> wrote:
Hi,
Moving this topic into its own thread from the one about collation
versions, because it concerns pre-existing problems, and that thread
is long.
Currently initdb sets up template databases with old-style Windows
locale names reported by the OS, and they seem to have caused us quite
a few problems over the years:
db29620d "Work around Windows locale name with non-ASCII character."
aa1d2fc5 "Another attempt at fixing Windows Norwegian locale."
db477b69 "Deal with yet another issue related to "Norwegian (Bokmål)"..."
9f12a3b9 "Tolerate version lookup failure for old style Windows locale..."
... and probably more, and also various threads about, for example,
"German_German.1252" vs "German_Switzerland.1252" which seem to get
confused or badly canonicalised or rejected somewhere in the mix.
I hadn't focused on any of that before, being a non-Windows-user, but
the entire contents of win32setlocale.c supports the theory that
Windows' manual meant what it said when it said[1]:
"We do not recommend this form for locale strings embedded in
code or serialized to storage, because these strings are more likely
to be changed by an operating system update than the locale name
form."
I suppose that was the only form available at the time the code was
written, so there was no choice. The question we asked ourselves
multiple times in the other thread was how we're supposed to get to
the modern BCP 47 form when creating the template databases. It looks
like one possibility, since Vista, is to call
GetUserDefaultLocaleName()[2], which doesn't appear to have been
discussed before on this list. That doesn't allow you to ask for the
default for each individual category, but I don't know if that is even
a concept for Windows user settings. It may be that some of the other
nearby functions give a better answer for some reason. But one thing
is clear from a test that someone kindly ran for me: it reports
standardised strings like "en-NZ", not strings like "English_New
Zealand.1252".
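To make the contrast between the two name forms concrete, here is a minimal Python sketch; it is not PostgreSQL code. The helper names classify_locale_name and get_user_default_locale_name are hypothetical, and the regular expressions are rough heuristics assumed for illustration only (they cover just the simple language[-Script][-REGION] subset of BCP 47). The only real API used is the Win32 function GetUserDefaultLocaleName(), called through ctypes; on non-Windows systems the wrapper simply returns None.

```python
import ctypes
import re
import sys

# Windows defines LOCALE_NAME_MAX_LENGTH as 85 characters, terminator included.
LOCALE_NAME_MAX_LENGTH = 85

# Heuristic patterns, assumed for illustration (not taken from PostgreSQL):
# a small BCP 47 subset such as "en-NZ" or "sr-Latn-RS"
BCP47_RE = re.compile(r"^[A-Za-z]{2,3}(-[A-Za-z]{4})?(-([A-Za-z]{2}|[0-9]{3}))?$")
# POSIX-style names such as "cs_CZ.UTF8"
POSIX_RE = re.compile(r"^[a-z]{2,3}_[A-Z]{2}(\.[\w-]+)?$")
# old-style Windows names such as "English_New Zealand.1252"
OLD_STYLE_RE = re.compile(r"^[^_.\-]+_[^_.]+\.\w+$")


def classify_locale_name(name):
    """Guess which naming convention a locale string follows."""
    if BCP47_RE.match(name):
        return "bcp47"
    if POSIX_RE.match(name):
        return "posix"
    if OLD_STYLE_RE.match(name):
        return "windows-legacy"
    return "unknown"


def get_user_default_locale_name():
    """Return the user's default locale name on Windows, or None elsewhere."""
    if sys.platform != "win32":
        return None
    buf = ctypes.create_unicode_buffer(LOCALE_NAME_MAX_LENGTH)
    # The call returns the number of characters written (terminator included),
    # or 0 on failure.
    n = ctypes.windll.kernel32.GetUserDefaultLocaleName(buf, LOCALE_NAME_MAX_LENGTH)
    return buf.value if n > 0 else None


if __name__ == "__main__":
    samples = (
        "en-NZ",
        "English_New Zealand.1252",
        "German_Switzerland.1252",
        "cs_CZ.UTF8",
    )
    for name in samples:
        print(name, "->", classify_locale_name(name))
    print("user default:", get_user_default_locale_name())
```

A classifier like this could at most flag names that deserve a closer look; it does not canonicalise anything, and real old-style names may contain characters these patterns do not anticipate.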
No patch, but I wondered if any Windows hackers have any feedback on
relative sanity of trying to fix all these problems this way.
Last weekend I talked with a user about an interesting (and messy) issue. They needed to create a new database with a Czech collation on Azure SAS, but there was no entry for the Czech language in pg_collation. The reply from Microsoft support was to use CREATE DATABASE xxx TEMPLATE 'template0' ENCODING 'utf8' LOCALE 'cs_CZ.UTF8', and that worked.
My understanding from Microsoft staff at conferences is that Azure's PostgreSQL SAS runs on Linux, not Windows.
I had different information, but something was still wrong, because there were no Czech locales in pg_collation.
cheers
andrew