Thread: Fix for 8.3 MSVC locale (Was [HACKERS] NLS on MSVC strikes back!)

Fix for 8.3 MSVC locale (Was [HACKERS] NLS on MSVC strikes back!)

From
"Gevik Babakhani"
Date:
Hereby a patch that fixes NLS support on PG 8.3 compiled with MSVC.

The problem:

NLS support does not work on PG 8.3 compiled with MSVC.
I encountered this bug when I was trying to show localized months and days
using TO_CHAR.

The main reason for this problem is that Gettext on Windows does not
respond to the LC_MESSAGES environment variable. Changing this variable should
trigger Gettext to load a different message catalog, which unfortunately
does not work on Windows. Instead, Gettext uses the locale of the currently
executing thread to determine which message catalog to load.

This is all discussed in:
http://archives.postgresql.org/pgsql-hackers/2008-02/msg00485.php

How it is fixed:

LC_MESSAGES is changed in pg_locale.c::pg_perm_setlocale(int
category, const char *locale).
To force Gettext to load a message catalog, we have to call
WIN32API::SetThreadLocale(unsigned long locale_id).

(You probably see it coming)
Our input parameter for locale in pg_perm_setlocale is a string, and there is
no unified way in Windows to translate a locale string to a locale_id for
SetThreadLocale to use. Therefore we use a static table of values to
look up the locale name:

pg_win32_locale_database::{0x041b,{"sk","sk-SK","sk_SK","Slovak_Slovakia"}}

Given any of "sk", "sk-SK", "sk_SK" or "Slovak_Slovakia", 0x041b is returned
as the input parameter for SetThreadLocale.
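
A minimal sketch of the table lookup described above. This is not the code from the attached patch: the struct layout and the function names are hypothetical, and the "nl" row (locale id 0x0413) is added only for illustration. Only the "sk" row comes from this message. The SetThreadLocale call is compiled on Windows only; elsewhere the helper just reports whether the name is known.

```c
#include <stddef.h>
#include <string.h>

#ifdef _WIN32
#include <windows.h>
#endif

#define MAX_ALIASES 4

/* One row of the static table: a Windows locale id and its accepted names. */
typedef struct
{
    unsigned long locale_id;
    const char *aliases[MAX_ALIASES];   /* unused slots are NULL */
} win32_locale_entry;

static const win32_locale_entry locale_table[] = {
    {0x041b, {"sk", "sk-SK", "sk_SK", "Slovak_Slovakia"}},
    {0x0413, {"nl", "nl-NL", "nl_NL"}}  /* illustrative extra row */
};

/* Return the locale id for a name, or 0 if the name is not in the table.
 * The comparison is case-sensitive, so "NL_NL" does not match "nl_NL". */
unsigned long
lookup_win32_locale_id(const char *name)
{
    size_t i, j;

    if (name == NULL)
        return 0;
    for (i = 0; i < sizeof(locale_table) / sizeof(locale_table[0]); i++)
        for (j = 0; j < MAX_ALIASES && locale_table[i].aliases[j] != NULL; j++)
            if (strcmp(locale_table[i].aliases[j], name) == 0)
                return locale_table[i].locale_id;
    return 0;
}

/* Return 1 if the thread locale was (or could be) set, 0 otherwise. */
int
apply_thread_locale(const char *name)
{
    unsigned long id = lookup_win32_locale_id(name);

    if (id == 0)
        return 0;               /* unknown name: leave the thread locale alone */
#ifdef _WIN32
    return SetThreadLocale((LCID) id) ? 1 : 0;
#else
    return 1;                   /* nothing to do outside Windows */
#endif
}
```

Returning 0 for an unknown name lets the caller decide whether to fall back to the previous behaviour or report an error.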

(We are not quite done yet)
Internally, Gettext uses a hack to force itself to reload. This hidden
feature is also explained in the Gettext docs. By incrementing a Gettext
internal variable (_nl_msg_cat_cntr) we force Gettext to reload on the next
LC_MESSAGES->MSGID query.
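
The reload trick can be illustrated with a small self-contained model. This is a sketch of the mechanism only, not Gettext code: every name below is a hypothetical stand-in, and in the real library the counter is the `_nl_msg_cat_cntr` variable, which the caller declares as `extern int` and increments. The model shows why changing the locale alone is not enough: the cached catalog is only re-resolved when the counter has moved.

```c
#include <string.h>

/* Stand-in for Gettext's internal counter (hypothetical name). */
static int sketch_msg_cat_cntr = 0;

/* Hypothetical cache of the most recently resolved catalog. */
static int cached_generation = -1;
static char cached_catalog[32] = "";
static int catalog_loads = 0;

/* The locale the next lookup should use. */
static const char *current_locale = "en";

/* Return the catalog name, re-resolving it only if the counter changed. */
const char *
sketch_current_catalog(void)
{
    if (cached_generation != sketch_msg_cat_cntr)
    {
        strncpy(cached_catalog, current_locale, sizeof(cached_catalog) - 1);
        cached_catalog[sizeof(cached_catalog) - 1] = '\0';
        cached_generation = sketch_msg_cat_cntr;
        catalog_loads++;
    }
    return cached_catalog;
}

/* Change the locale; bump the counter only when bump_counter is nonzero. */
void
sketch_set_messages_locale(const char *locale, int bump_counter)
{
    current_locale = locale;
    if (bump_counter)
        ++sketch_msg_cat_cntr;
}

/* Number of times the catalog was actually (re)loaded. */
int
sketch_catalog_loads(void)
{
    return catalog_loads;
}
```

Calling `sketch_set_messages_locale("nl", 0)` leaves the stale "en" catalog in place; the same call with a nonzero second argument makes the next lookup pick up "nl".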

Tests:

- Tested on Win XP MSVC (VC++ 8.0)
- Just a routine "make check" test on my Linux box
- MINGW is not tested. (I do not have the installation)

Fix notes:

- Gettext behaves oddly when the environment variable LANGUAGE is set before
starting PG.
- The locale names in the static table are case-sensitive: 'nl_NL' !=
'NL_NL'.
- At this moment I have only included locale names we actually support in the
PG installation.
- Where are the JP locale .po and .mo files? These are not in the sources!
- Even though there are 20+ locales in the PG installation, that does not mean
they are complete. Try TMMonth, TMDay on "nl_NL". You will get English names :)
(My fault, hee hee, I haven't completed the nl.po translations yet.)

TODO:

- Provide/complete the day and month names for all supported locales.
- Create docs for supported locale names on Windows. (Values of the static
table)

Regards,
Gevik Babakhani
------------------------------------------------
PostgreSQL NL       http://www.postgresql.nl
TrueSoftware BV     http://www.truesoftware.nl
------------------------------------------------


Attachment

Re: Fix for 8.3 MSVC locale (Was [HACKERS] NLS on MSVC strikes back!)

From
Magnus Hagander
Date:
Gevik Babakhani wrote:
> Hereby a patch that fixes NLS support on PG 8.3 compiled with MSVC.

Haven't looked into the details of the patch yet, will do so. But the
first thing I notice - you say this is only for MSVC, right? But the
patch will also change the behaviour for the mingw build. Since you say
you haven't tested on it, does the documentation imply that it would
work on mingw, or is this likely to break that build? Perhaps it should
be made conditional on MSVC only, and not on WIN32?

//Magnus



Re: Fix for 8.3 MSVC locale (Was [HACKERS] NLS on MSVC strikes back!)

From
"Hiroshi Saito"
Date:
Hi.

Sorry, I don't understand the point of this patch.
However, I checked the actual behavior.

I used "initdb -E UTF-8 --no-locale" and applied Gevik-san's patch.
(So, ja is not included.)
http://winpg.jp/~saito/pg_work/LC_MESSAGE_CHECK/Gevik_afterpatch.png
Did you see a change before and after the fix?

P.S)
I was thinking about the improvement of other relations.

Regards,
Hiroshi Saito


Re: Fix for 8.3 MSVC locale (Was [HACKERS] NLS on MSVC strikes back!)

From
"Gevik Babakhani"
Date:
> Haven't looked into the details of the patch yet, will do so.
> But the first thing I notice - you say this is only for MSVC,
> right? But the patch will also change the behaviour for the
> mingw build. Since you say you haven't tested on it, does the
> documentation imply that it would work on mingw, or is this
> likely to break that build? Perhaps it should be made
> conditional on MSVC only, and not on WIN32?
>

Thinking about the above... It is a good idea to have something like this be
compiled only for MSVC.
I guess we have to add an additional preprocessor directive from mkvcbuild
to the solution (what about MSVC_COMPILER?)

I am installing mingw to test the patch there. Chances are it will break
because mingw does __declspec(dllimport) differently than msvc

Regards,
Gevik.



Re: Fix for 8.3 MSVC locale (Was [HACKERS] NLS on MSVC strikes back!)

From
Magnus Hagander
Date:
Gevik Babakhani wrote:
>> Haven't looked into the details of the patch yet, will do so.
>> But the first thing I notice - you say this is only for MSVC,
>> right? But the patch will also change the behaviour for the
>> mingw build. Since you say you haven't tested on it, does the
>> documentation imply that it would work on mingw, or is this
>> likely to break that build? Perhaps it should be made
>> conditional on MSVC only, and not on WIN32?
>>
>
> Thinking about the above... It is a good idea to have something like this be
> compiled only for MSVC.
> I guess we have to add an additional preprocessor directive from mkvcbuild
> to the solution (what about MSVC_COMPILER?)

We have a directive called WIN32_ONLY_COMPILER that's used for this.
It'll pick up MSVC and Borland C++ which normally behave at least almost
the same.
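
As a rough illustration of guarding code with such a directive: the sketch below assumes the macro is derived from the compilers' predefined macros `_MSC_VER` and `__BORLANDC__`, which is an assumption about how a header could define it, not a quote from the PostgreSQL sources. The function name is hypothetical.

```c
/* Assumed definition: set the macro for MSVC and Borland C++ only. */
#if defined(_MSC_VER) || defined(__BORLANDC__)
#ifndef WIN32_ONLY_COMPILER
#define WIN32_ONLY_COMPILER
#endif
#endif

/* Report whether the thread-locale workaround is compiled into this build. */
int
uses_thread_locale_workaround(void)
{
#ifdef WIN32_ONLY_COMPILER
    return 1;                   /* MSVC or Borland: workaround enabled */
#else
    return 0;                   /* mingw, gcc, etc.: behaviour unchanged */
#endif
}
```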


> I am installing mingw to test the patch there. Chances are it will break
> because mingw does __declspec(dllimport) differently than msvc

Thanks.

//Magnus

Re: Fix for 8.3 MSVC locale (Was [HACKERS] NLS on MSVC strikes back!)

From
"Gevik Babakhani"
Date:
Thank you. Is there any reason why JP locale files are not in normal
installation?

> -----Original Message-----
> From: Hiroshi Saito [mailto:z-saito@guitar.ocn.ne.jp]
> Sent: Thursday, February 14, 2008 8:00 PM
> To: Magnus Hagander; Gevik Babakhani
> Cc: pgsql-patches@postgresql.org
> Subject: Re: [PATCHES] Fix for 8.3 MSVC locale (Was [HACKERS]
> NLS on MSVC strikes back!)
>
> Hi.
>
> Sorry, I don't understand the point of this patch.
> However,  reality was confirmed.
>
> I use "initdb -E UTF-8 --no-locale"
> Gevik-san patch apply it. (So, ja is not contained.)
> http://winpg.jp/~saito/pg_work/LC_MESSAGE_CHECK/Gevik_afterpatch.png
> Were you changeful before and after the correction?
>
> P.S)
> I was thinking about the improvement of other relations.
>
> Regards,
> Hiroshi Saito
>


Re: Fix for 8.3 MSVC locale (Was [HACKERS] NLS on MSVC strikes back!)

From
"Gevik Babakhani"
Date:
> We have a directive called WIN32_ONLY_COMPILER that's used for this.
> It'll pick up MSVC and Borland C++ which normally behave at
> least almost the same.
>
>
> > I am installing mingw to test the patch there. Chances are
> it will break
> > because mingw does __declspec(dllimport) differently than msvc
>
> Thanks.
>
> //Magnus

Humm... I was expecting it to break... but it compiled just fine :)

Here are the steps I took..

On a Win 2003 VM (VMWare):
1. Installed MinGW-5.1.3.exe
2. Installed MSYS-1.0.10.exe
3. Installed msysDTK-1.0.1.exe
4. Installed gettext-0.14.4.exe into C:\MinGW
5. Downloaded the source tarball
6. ./configure --prefix=/home/gevik/build  --without-zlib --enable-nls
7. make check: everything was OK and 114 tests passed :)
8. make install, initdb, createdb, etc.
9. Set LC_MESSAGES and tested. See attachment :)

The patch works both for MSVC and MINGW.

Regards,
Gevik.



Attachment

Re: Fix for 8.3 MSVC locale (Was [HACKERS] NLS on MSVC strikes back!)

From
Alvaro Herrera
Date:
Gevik Babakhani wrote:

> gevik=# set lc_messages to 'Spanish_Spain';
> SET
> gevik=# select to_char((current_date + s.a),'TMDay TMMonth YYYY') as dates from generate_series(0,6) as s(a);
>          dates
> -----------------------
>  Jueves Febrero 2008
>  Viernes Febrero 2008
>  Sbado Febrero 2008
>  Domingo Febrero 2008
>  Lunes Febrero 2008
>  Martes Febrero 2008
>  Mircoles Febrero 2008
> (7 rows)

Hmm, interestingly you lost the diacritics here.  The output is mangled
for Saturday and Wednesday, which should read "Sábado" and "Miércoles"
respectively.

It is not good that the system allows you to output invalidly encoded
data.  What happens if you try setting lc_messages to
Spanish_Spain.65001 instead?

Ideally, it should be an error to set lc_messages to a value that's not
compatible with the current encoding.  Do we do that currently elsewhere?

(Perhaps this is not a problem with your patch, but rather a problem
that's worth fixing separately.)

--
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

Re: Fix for 8.3 MSVC locale (Was [HACKERS] NLS on MSVC strikes back!)

From
Tom Lane
Date:
Alvaro Herrera <alvherre@commandprompt.com> writes:
> Ideally, it should be an error to set lc_messages to a value that's not
> compatible with the current encoding.  Do we do that currently elsewhere?

We don't currently enforce that, and I'm not sure it's possible to do so
on non-Windows machines.  AFAIR the POSIX API doesn't even associate a
character set with anything but LC_CTYPE.

Does it make any sense to wire in the assumption that locale names have
the form "something.encoding-id"?  If we did that, we could enforce that
the encoding-id part matches LC_CTYPE, or maybe even just alter the
presented values for other LC_foo variables to match.
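
A minimal sketch of what such a check could look like, assuming locale names really do have the form "something.encoding-id". The function names are hypothetical and the comparison is a plain string match, so spellings such as "UTF-8" and "utf8" would be treated as different; a real check would need to normalize them.

```c
#include <stddef.h>
#include <string.h>

/* Return the encoding-id part after the last '.', or NULL if there is none. */
const char *
locale_encoding_part(const char *locale)
{
    const char *dot;

    if (locale == NULL)
        return NULL;
    dot = strrchr(locale, '.');
    if (dot == NULL || dot[1] == '\0')
        return NULL;
    return dot + 1;
}

/*
 * Compare the encoding-id of another LC_foo value against LC_CTYPE.
 * Returns 1 if both have an encoding-id and they are identical,
 * 0 if both have one and they differ, and -1 if either has none.
 */
int
locale_encodings_match(const char *lc_ctype, const char *other)
{
    const char *a = locale_encoding_part(lc_ctype);
    const char *b = locale_encoding_part(other);

    if (a == NULL || b == NULL)
        return -1;
    return strcmp(a, b) == 0 ? 1 : 0;
}
```

The -1 result covers names such as "C" or "Spanish_Spain" that carry no encoding-id, where the check cannot decide either way.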

            regards, tom lane

Re: Fix for 8.3 MSVC locale (Was [HACKERS] NLS on MSVC strikes back!)

From
"Gevik Babakhani"
Date:
> Hmm, interestingly you lost the diacritics here.  The output
> is mangled for Saturday and Wednesday, which should read
> "Sábado" and "Miércoles"
> respectively.
>
> It is not good that the system allows you to output invalidly
> encoded data.  What happens if you try setting lc_messages to
> Spanish_Spain.65001 instead?

This is because of the code page the DOS box was using when I ran the test.
I ran the same test in PGAdmin and the correct values are presented.

> Ideally, it should be an error to set lc_messages to a value
> that's not compatible with the current encoding.  Do we do
> that currently elsewhere?

Not that I know of.



Re: Fix for 8.3 MSVC locale (Was [HACKERS] NLS on MSVC strikes back!)

From
"Hiroshi Saito"
Date:
Hi.

Um, my screenshot shows the problem.....
..
set LC_MESSAGES='de_DE.UTF-8';
.. it is not Japanese; however, the resulting message is Japanese.

From: "Gevik Babakhani" <pgdev@xs4all.nl>

> Thank you. Is there any reason why JP locale files are not in normal
> installation?


Re: Fix for 8.3 MSVC locale (Was [HACKERS] NLS on MSVC strikes back!)

From
Bruce Momjian
Date:
Added to Win32 TODO:

        o Fix MSVC NLS support, like for to_char()

          http://archives.postgresql.org/pgsql-hackers/2008-02/msg00485.php
          http://archives.postgresql.org/pgsql-patches/2008-02/msg00038.php


--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://postgres.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +