Re: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607 - Mailing list pgsql-bugs

From Thomas Munro
Subject Re: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607
Date
Msg-id CA+hUKGJDv3pCnQXY13uYW1oe9CvFw8Y2j8mBMh2HGk_X0qYEcw@mail.gmail.com
In response to RE: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607  ("Haifang Wang (Centific Technologies Inc)" <v-haiwang@microsoft.com>)
Responses RE: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607
List pgsql-bugs
On Tue, May 21, 2024 at 8:17 AM Haifang Wang (Centific Technologies
Inc) <v-haiwang@microsoft.com> wrote:
> What questions do you have? Could you please list them clearly so that Vishwa could help to answer?

I already did, twice, but perhaps Vishwa or others can't see the whole
thread, so here is the whole thread in our project's email archive:


https://www.postgresql.org/message-id/flat/PH8PR21MB3902F334A3174C54058F792CE5182%40PH8PR21MB3902.namprd21.prod.outlook.com

But let me ask the questions again, with some motivation/reason I want
to know in parentheses:

1.  What is the oldest Windows release that can understand the "new"
BCP47 locale names, like "tr-TR" or "tr-TR.1452"?  (Some PostgreSQL
versions, for example PostgreSQL 12, are expected to run on old
versions of Windows from long before Windows 10, so we might have to
consider this.  However, if we go with Tom's idea that we do nothing
by default but just allow users to supply their own optional mapping
file, then this question becomes unimportant: users can figure out for
themselves whether it works, and presumably only Windows 10+ got the
update that renamed Turkey to Türkiye.  [And in reality, I hope/expect
that no one really does run old out-of-support OSes, because that's
crazy, but I'm not allowed to assume that...])

2.  If we translate to BCP47 locale names like "tr-TR" automatically,
should we put the ".1452" on the end?  What does it mean exactly?
What does it mean if you don't put it there?  (I could guess that if
you leave it off, the encoding used by "char"-based functions is the
"ACP", the system's active code page.  What I really want to know is,
can it be different from the "ACP", and if it is, which functions does
it affect?  For example if
the ACP is 1521 and I call _tolower_l() giving it a locale_t that I
opened with "en-US.UTF-8", what happens?  I am sure this is a simple
question but we are not Windows programmers, you are the first person
to show up offering to investigate, and I personally found the docs a
bit light on the topic.)

3.  Do the new BCP47 locale names give *exactly* the same results for
strcoll() and tolower() etc, as the old "Turkish*" style names?  (In
other words, is it *exactly the same code and driving data*, just
using different labels?  Or is it a new locale implementation that
could differ arbitrarily in behaviour?  If it's just a new naming
scheme, then life will be much, much simpler for our users.  If not,
then indexes might be corrupted if we tell people to switch to the new
BCP47 names, so we'd better know about that in order to adjust our
advice to users.)
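
To make question 3 concrete, here is a rough sketch of the kind of
empirical check one could run while waiting for an authoritative answer.
It is only an illustration: the helper names (sign,
collation_differences) are made up for this example, it goes through
Python's locale.strcoll() rather than calling the C runtime directly, and
the sample words are arbitrary.  A check like this can only find
differences between two locale names; an empty result does not prove
that the two names share the same code and data.

```python
import locale
from itertools import combinations


def sign(n):
    """Reduce a strcoll() result to -1, 0 or 1 so results are comparable."""
    return (n > 0) - (n < 0)


def collation_differences(name_a, name_b, words):
    """Return the word pairs that strcoll() orders differently under two locale names.

    Hypothetical helper for illustration only; not part of PostgreSQL.
    """
    pairs = list(combinations(words, 2))
    saved = locale.setlocale(locale.LC_COLLATE)  # remember the current setting
    results = []
    try:
        for name in (name_a, name_b):
            # Raises locale.Error if the OS does not recognise the name.
            locale.setlocale(locale.LC_COLLATE, name)
            results.append([sign(locale.strcoll(x, y)) for x, y in pairs])
    finally:
        locale.setlocale(locale.LC_COLLATE, saved)  # always restore
    return [p for p, a, b in zip(pairs, results[0], results[1]) if a != b]


if __name__ == "__main__":
    sample = ["I", "i", "ı", "İ", "z", "Z"]
    # "C" is used twice here only so the script runs anywhere; on Windows one
    # would substitute an old-style "Turkish*" name and the BCP47 name "tr-TR".
    print(collation_differences("C", "C", sample))
```

The same approach would extend to tolower() and friends by comparing
per-character results instead of pairwise orderings, but sampling can
never replace a definitive statement about whether the implementation
behind the two naming schemes is shared.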


