Thread: msvc++ build of 8.2.4 and encodings

msvc++ build of 8.2.4 and encodings

From
Charlie Savage
Date:
Hope this is the right place for this post...

I've been trying out the MSVC++ build scripts for PostgreSQL 8.2.4 on my 
development laptop (running Windows XP Pro).

I noticed the sort orders of queries changed.  Investigating more, 
encodings don't seem to be working as expected.
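
To illustrate why a different lc_collate shows up as a different sort order: a C collation compares strings byte by byte, while a language locale applies its own ordering rules. The following is a minimal Python sketch with made-up sample values. The case-insensitive key is only a rough stand-in for language-aware collation (an assumption for illustration), not what Windows or PostgreSQL actually does internally.

```python
# Sample values are made up for illustration.
words = ["banana", "Cherry", "apple"]


def c_locale_order(values):
    """Sort the way a C collation does: by raw byte values."""
    return sorted(values, key=lambda s: s.encode("utf-8"))


def approx_language_order(values):
    """Rough stand-in for a language-aware collation: compare case-insensitively,
    falling back to the original string to break ties."""
    return sorted(values, key=lambda s: (s.lower(), s))


print(c_locale_order(words))
print(approx_language_order(words))
```

Under byte ordering, uppercase ASCII letters sort before all lowercase ones, which is the kind of difference that becomes visible in ORDER BY results.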

Using a MSVC++ build:
> CREATE DATABASE test1 WITH ENCODING = 'utf8';
> show all

"lc_collate";"English_United States.1252"
"lc_ctype";"English_United States.1252"
"lc_messages";"C"
"lc_monetary";"C"
"lc_numeric";"C"
"lc_time";"C"

Using a MSYS build:
> CREATE DATABASE test1 WITH ENCODING = 'utf8';
> show all

"lc_collate";"en_US.UTF-8"
"lc_ctype";"en_US.UTF-8"
"lc_messages";"C"
"lc_monetary";"C"
"lc_numeric";"C"
"lc_time";"C"

In both cases, the database clusters were created like this:

initdb ---locale=c --encoding=utf8;

Note that I successfully built all the various encoding projects for the 
MSVC++ build and have installed them.

I'd be happy to debug this a bit more if that would be helpful.

Thanks,

Charlie

Re: msvc++ build of 8.2.4 and encodings

From
Charlie Savage
Date:
> Using a MSYS build:
> 
>  > CREATE DATABASE test1 WITH ENCODING = 'utf8';
> 
>  > show all
> 
> "lc_collate";"en_US.UTF-8"
> "lc_ctype";"en_US.UTF-8"
> "lc_messages";"C"
> "lc_monetary";"C"
> "lc_numeric";"C"
> "lc_time";"C"

Sorry, the above output is from Linux (Fedora Core 6).  With an MSYS 
build on my XP laptop it's:

"lc_collate";"C"
"lc_ctype";"C"
"lc_messages";"C"
"lc_monetary";"C"
"lc_numeric";"C"
"lc_time";"C"

Still different than the MSVC++ build.

Thanks,

Charlie

Re: msvc++ build of 8.2.4 and encodings

From
Andrew Dunstan
Date:

Charlie Savage wrote:
> Hope this is the right place for this post...
>
> I've been trying out the MSVC++ build scripts for PostgreSQL 8.2.4 on 
> my development laptop (running Windows XP Pro).
>
> I noticed the sort orders of queries changed.  Investigating more, 
> encodings don't seem to be working as expected.
>
> Using a MSVC++ build:
>
> > CREATE DATABASE test1 WITH ENCODING = 'utf8';
>
> > show all
>
> "lc_collate";"English_United States.1252"
> "lc_ctype";"English_United States.1252"
> "lc_messages";"C"
> "lc_monetary";"C"
> "lc_numeric";"C"
> "lc_time";"C"
>
> Using a MSYS build:
>
> > CREATE DATABASE test1 WITH ENCODING = 'utf8';
>
> > show all
>
> "lc_collate";"en_US.UTF-8"
> "lc_ctype";"en_US.UTF-8"
> "lc_messages";"C"
> "lc_monetary";"C"
> "lc_numeric";"C"
> "lc_time";"C"
>
> In both cases, the database clusters were created like this:
>
> initdb ---locale=c --encoding=utf8;
>
>

That seems most unlikely - without the superfluous dash it should set 
both lc_collate and lc_ctype to C.

Please try the following in both cases:

initdb --no-locale --encoding=utf8 data
pg_controldata data | grep LC_

If it doesn't show this:

LC_COLLATE:                           C
LC_CTYPE:                             C

then that's a bug. Or if after that you connect to the instance and 
"show lc_collate" or "show lc_ctype" don't likewise show C then that's a 
bug.
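
On a Windows box without grep, the same check can be scripted. This is a minimal Python sketch, assuming pg_controldata is on the PATH and prints "LC_COLLATE:" and "LC_CTYPE:" lines in the format shown above. The function names are hypothetical and not part of PostgreSQL.

```python
import subprocess


def parse_lc_settings(controldata_output):
    """Return a dict of the LC_* entries found in pg_controldata text output."""
    settings = {}
    for line in controldata_output.splitlines():
        # Each line looks like "NAME:   value"; split on the first colon only.
        name, sep, value = line.partition(":")
        if sep and name.strip().startswith("LC_"):
            settings[name.strip()] = value.strip()
    return settings


def cluster_lc_settings(data_dir):
    """Run pg_controldata against data_dir and parse its LC_* lines.

    Assumes pg_controldata is on the PATH; raises if the command fails.
    """
    result = subprocess.run(
        ["pg_controldata", data_dir],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_lc_settings(result.stdout)


# Sample text copied from the expected output above.
sample = (
    "LC_COLLATE:                           C\n"
    "LC_CTYPE:                             C\n"
)
print(parse_lc_settings(sample))
```

Calling cluster_lc_settings("data") on each cluster and comparing the two dictionaries would show whether the builds disagree at the initdb level.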

Are you by any chance loading a library that calls setlocale() ?

cheers

andrew



Re: msvc++ build of 8.2.4 and encodings

From
Charlie Savage
Date:
Hi Andrew,

Thanks for the reply.

>> In both cases, the database clusters were created like this:
>>
>> initdb ---locale=c --encoding=utf8;
>>
>>
> 
> That seems most unlikely - without the superfluous dash it should set 
> both lc_collate and lc_ctype to C.

Ah, sorry, that was a typo.  If you actually try it:

C:\WINDOWS\system32>initdb ---locale=C --encoding=utf8 c:\data_msvcc3
initdb: illegal option -- -locale=C

> 
> Please try the following in both cases:
> 
> initdb --no-locale --encoding=utf8 data
> pg_controldata data | grep LC_
> 
> If it doesn't show this:
> 
> LC_COLLATE:                           C
> LC_CTYPE:                             C
> 
> then that's a bug. 

With MSYS build:

initdb --no-locale --encoding=utf8 c:\data_msys

C:\WINDOWS\system32>pg_controldata c:\data_msys | grep LC_
LC_COLLATE:                           C
LC_CTYPE:                             C


[connect to postgres database]
show lc_collate                       C
show lc_ctype                         C
> create database test with encoding='utf8'

[switch to postgres database]
show lc_collate                       C
show lc_ctype                         C


With VC++ build:

initdb --no-locale --encoding=utf8 c:\data_msvcc

C:\WINDOWS\system32>pg_controldata c:\data_msvcc | grep LC_
LC_COLLATE:                           C
LC_CTYPE:                             C

show lc_collate                       C
show lc_ctype                         C
> create database test with encoding='utf8'

[switch to postgres database]
show lc_collate                       C
show lc_ctype                         C


Ok, so this works.

And if I use --locale=C for initdb it gives the same answers.

> Are you by any chance loading a library that calls setlocale() ?

Hmm.   It's PostgreSQL 8.2.4 + tsearch2 + tree + PostGIS.  PostGIS in 
turn loads proj4 and geos.  I grepped through the source code of those 
three libraries and did not find any calls to setlocale.  So I don't 
think so.
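
For anyone wanting to repeat that search without grep, here is a minimal Python sketch that walks a source tree and reports lines containing a setlocale call. The function name and the list of file suffixes are assumptions for illustration. It is a plain text match, so it also reports matches inside comments and variants such as _wsetlocale, and it cannot see calls made by prebuilt binaries.

```python
import os
import re

# Matches "setlocale(" with optional whitespace; also catches _wsetlocale(.
SETLOCALE_CALL = re.compile(r"setlocale\s*\(")

# Assumed set of C/C++ source suffixes; extend as needed.
SOURCE_SUFFIXES = (".c", ".cc", ".cpp", ".cxx", ".h", ".hpp")


def find_setlocale_calls(root):
    """Return (path, line number, stripped line) for each match under root."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            if not filename.endswith(SOURCE_SUFFIXES):
                continue
            path = os.path.join(dirpath, filename)
            # errors="replace" keeps the scan going on files in odd encodings.
            with open(path, encoding="utf-8", errors="replace") as handle:
                for lineno, line in enumerate(handle, start=1):
                    if SETLOCALE_CALL.search(line):
                        hits.append((path, lineno, line.strip()))
    return hits
```

An empty list from find_setlocale_calls on each library's source directory would match the grep result described above.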

So now I'm confused - if I go back to my other cluster that I originally 
wrote about (also created with the MSVC++ build) and create a database, 
it has a different lc_collate ("English_United States.1252").  Could this 
be from the dump/reload?

Charlie

Re: msvc++ build of 8.2.4 and encodings

From
Magnus Hagander
Date:
On Wed, Aug 29, 2007 at 09:49:03PM -0600, Charlie Savage wrote:
> Hmm.   It's PostgreSQL 8.2.4 + tsearch2 + tree + PostGIS.  PostGIS in 
> turn loads proj4 and geos.  I grepped through the source code of those 
> three libraries and did not find any calls to setlocale.  So I don't 
> think so.
> 
> So now I'm confused - if I go back to my other cluster that I originally 
> wrote about (also created with the MSVC++ build) and create a database, 
> it has a different lc_collate ("English_United States.1252").  Could this 
> be from the dump/reload?

Shouldn't be - it's set at initdb and not at reload. My guess would be that
you somehow missed the locale parameter on that initdb call - I don't
suppose you still have it in your command-line history? :-)

There should be zero difference in what initdb does, and I've never seen
anything like that other than when I missed some option to it.

//Magnus