Re: Chinese initdb on Windows - Mailing list pgsql-hackers

From Dave Page
Subject Re: Chinese initdb on Windows
Msg-id AANLkTimFVUAZidQzPGk3ozQ-oHr57Ag+JHya144MUoVW@mail.gmail.com
In response to Chinese initdb on Windows  (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>)
List pgsql-hackers
On Mon, Mar 21, 2011 at 7:29 PM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
> On Windows, if you have the OS locale set to "Chinese (Simplified, PRC)",
> initdb fails:
>
> X:\>C:\pgsql-install\bin\initdb.exe -D data2
> The files belonging to this database system will be owned by user "Heikki".
> This user must also own the server process.
>
> The database cluster will be initialized with locale Chinese (Simplified)_People's Republic of China.936.
> initdb: locale Chinese (Simplified)_People's Republic of China.936 requires unsupported encoding GBK
> Encoding GBK is not allowed as a server-side encoding.
> Rerun initdb with a different locale selection.
>
> The easy workaround for that is to specify --encoding=UTF-8, as UTF-8 can be
> used with any locale on Windows. How about doing that automatically in
> initdb? Now that we have the smarts in psql to detect current encoding from
> the environment and set client_encoding accordingly, it Just Works. Attached
> is a patch for that.
>
>
> Once you get past that, however, there's another issue:
>
>> ...
>>
>> creating directory data2 ... ok
>> creating subdirectories ... ok
>> selecting default max_connections ... 100
>> selecting default shared_buffers ... 32MB
>> creating configuration files ... ok
>> creating template1 database in data2/base/1 ... ok
>> initializing pg_authid ... FATAL:  database locale is incompatible with operating system
>> DETAIL:  The database was initialized with LC_COLLATE "Chinese (Simplified)_Peoples Republic of China.936",  which is not recognized by setlocale().
>> HINT:  Recreate the database with another locale or install the missing locale.
>> child process exited with exit code 1
>
> The problem is probably the apostrophe in the locale name, although it seems
> to be missing from the above error message. setlocale() has a known problem
> with locale names that have dots in the country name, and it looks like it
> has similar issues with apostrophes.
>
> Fortunately, Windows has aliases for those problematic locales that don't
> have dots or apostrophes in their names. We did some testing at
> EnterpriseDB of various locales on various versions of Windows, and came up
> with the following mappings:
>
> "*_Hong Kong S.A.R.*" -> "*_HKG.*"
> "*_U.A.E.*" -> "*_ARE.*"
> "*_People's Republic of China.*" -> "*_China.*"
> "China_Macau S.A.R..950" -> "ZHM"
>
> The first three mappings map the full country name to an abbreviation that
> is also accepted by Windows' setlocale(). See
> http://msdn.microsoft.com/en-us/library/cdax410z%28v=vs.71%29.aspx. ARE is
> not on that list, but seems to work.
>
> Macau is trickier. ZHM is not an abbreviation of the country, but of the
> whole locale, so we can't replace just the country part. So, unlike the
> other mappings, this will not work for "Finnish_Macau S.A.R..950".
> Nevertheless, it works for the common case.
>
> Any objections to the 2nd attached patch, which adds the mapping of those
> locale names on Windows?
>
> I'm thinking it's not too late to do this in 9.1.
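
The mapping proposed in the quoted mail can be sketched as a small string
rewrite. This is only an illustration in Python, not the attached patch: the
table entries are copied from the mail as written (including
"China_Macau S.A.R..950"), while the function name map_windows_locale and the
matching strategy (replace the country part between "_" and the code page) are
assumptions made for the example.

```python
# Hypothetical sketch of the locale-name mapping described in the quoted mail.
# Table entries come from the mail; names and matching logic are assumptions.

# Country-part aliases: only the text between "_" and ".<codepage>" is replaced.
COUNTRY_ALIASES = [
    ("Hong Kong S.A.R.", "HKG"),
    ("U.A.E.", "ARE"),
    ("People's Republic of China", "China"),
]

# Whole-locale aliases: ZHM abbreviates the entire locale, not just the country,
# so it can only be matched against the complete name.
WHOLE_LOCALE_ALIASES = {
    "China_Macau S.A.R..950": "ZHM",
}


def map_windows_locale(name: str) -> str:
    """Return an alias without dots or apostrophes in the country part, if known."""
    if name in WHOLE_LOCALE_ALIASES:
        return WHOLE_LOCALE_ALIASES[name]
    for country, alias in COUNTRY_ALIASES:
        needle = "_" + country + "."
        idx = name.find(needle)
        if idx != -1:
            # Keep the language prefix and the code page suffix untouched.
            return name[:idx] + "_" + alias + "." + name[idx + len(needle):]
    return name  # no known alias; pass the name through unchanged


if __name__ == "__main__":
    print(map_windows_locale("Chinese (Simplified)_People's Republic of China.936"))
```

With this sketch, "Chinese (Simplified)_People's Republic of China.936" becomes
"Chinese (Simplified)_China.936", and "Finnish_Macau S.A.R..950" is returned
unchanged, which matches the limitation noted in the mail.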

I've heard complaints a number of times from Chinese users who I
believe this would help.
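
For readers following along, the automatic UTF-8 fallback from the first part
of the quoted mail could look roughly like the sketch below. It is an
assumption-laden illustration, not initdb's code: choose_server_encoding and
CLIENT_ONLY_ENCODINGS are hypothetical names, and the set lists only GBK
because that is the one encoding the mail shows being rejected.

```python
# Hypothetical sketch of the encoding fallback proposed in the quoted mail.
# Only GBK is listed because it is the encoding shown in the error output;
# the real set of client-only encodings is larger.
CLIENT_ONLY_ENCODINGS = {"GBK"}


def choose_server_encoding(locale_encoding, on_windows, explicit=None):
    """Pick the server encoding for a new cluster.

    explicit mirrors a user-supplied --encoding option and always wins.
    """
    if explicit is not None:
        return explicit
    if locale_encoding.upper() not in CLIENT_ONLY_ENCODINGS:
        # The locale's own encoding is usable on the server side.
        return locale_encoding
    if on_windows:
        # Per the mail, UTF-8 can be used with any locale on Windows.
        return "UTF-8"
    raise ValueError(
        "Encoding %s is not allowed as a server-side encoding." % locale_encoding
    )


if __name__ == "__main__":
    print(choose_server_encoding("GBK", on_windows=True))
```

Outside Windows the sketch still raises an error, since the mail only claims
that UTF-8 works with any locale on Windows.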

--
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

