Re: Unicode database on non-unicode operating system - Mailing list pgsql-general

From Morten Barklund
Subject Re: Unicode database on non-unicode operating system
Date
Msg-id AB6A9C75F1620048B14C9E7D9526F5B136CB92@TBWAMAIL.tbwa.dk
In response to Re: Unicode database on non-unicode operating system  (Peter Eisentraut <peter_e@gmx.net>)
List pgsql-general
Hi Peter,

Thank you once again. That cleared up a lot of confusion for me and my 
co-workers. Our next server will be set up with Unicode and en_DK.utf8 
to ensure consistency.


Regards,
Morten Barklund

-----Original Message-----
From: Peter Eisentraut [mailto:peter_e@gmx.net] 
Sent: Tuesday, July 15, 2008 3:50 PM
To: pgsql-general@postgresql.org
Cc: Morten Barklund
Subject: Re: [GENERAL] Unicode database on non-unicode operating system

On Tuesday, 15 July 2008, Morten Barklund wrote:
> I can see that lc_collate (sorting) and lc_ctype (lower-upper conversion)
> are set to en_DK and I guess that default encoding for en_DK is iso88591 or
> maybe windows1252.

It is ISO-8859-1.  There is no support for Windows charmaps on Linux.

> Thus my server should have been initialized with 
> en_DK.utf8 or?

Yes, or you should have chosen a different encoding (LATIN1 in your case) when 
creating the database.
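
To make the mismatch concrete, here is a rough Python sketch of what can go 
wrong when UTF-8 data is handled under a single-byte ISO-8859-1 locale. It 
only imitates the effect; it is not PostgreSQL's actual code path, and the 
helper names are made up for this illustration.

```python
def as_seen_by_latin1_locale(text):
    """Return the UTF-8 bytes of `text` read as if each byte were one
    ISO-8859-1 character (hypothetical helper, illustration only)."""
    return text.encode("utf-8").decode("latin-1")


def upper_under_latin1_locale(text):
    """Upper-case `text` the way a byte-oriented ISO-8859-1 routine would
    see it, then read the result back as UTF-8 (illustration only)."""
    misread = as_seen_by_latin1_locale(text)
    raw = misread.upper().encode("latin-1", errors="replace")
    return raw.decode("utf-8", errors="replace")


danish = "æøå"

# Each Danish letter is two bytes in UTF-8, so the locale sees six characters.
print(as_seen_by_latin1_locale(danish))   # Ã¦Ã¸Ã¥

# None of those six "characters" has a different upper-case form,
# so the conversion silently does nothing.
print(upper_under_latin1_locale(danish))  # æøå

# With matching encoding and locale, the expected result would be:
print(danish.upper())                     # ÆØÅ
```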

> How do I find out what the default encoding for the locale en_DK is?

$ LC_ALL=en_DK locale charmap
ISO-8859-1

Note that this is not the "default" encoding, it is the *only* encoding 
supported by that locale.

> I can see, that normally one would sub-specify this by either 
> adding .iso88591 or .utf8, but is windows1252 then default?

It might be reasonable to use the .iso88591 or .utf8 suffixes if you want to 
be explicit, but the unsuffixed locale name is usually just an alias for one 
of these.

> I am not able to reinitdb, as many other databases are running, which might
> be affected negatively. This means, that even though my database is created
> WITH ENCODING 'unicode', it is in fact "broken" as the locale does not
> fully support unicode string handling?

Yes.  If you can't reinitdb, then you should recreate the database with 
encoding LATIN1.  This won't allow all Unicode characters, obviously, but at 
least you get proper behavior for the Danish characters that you need.
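
Before switching to LATIN1 it may be worth checking whether the existing 
data fits. A minimal Python sketch, assuming the text has already been read 
out of the database as Unicode strings; fits_latin1 is a hypothetical helper 
name, and Python's "latin-1" codec is ISO-8859-1:

```python
def fits_latin1(text):
    """Return True if every character of `text` exists in ISO-8859-1."""
    try:
        text.encode("latin-1")
        return True
    except UnicodeEncodeError:
        return False


# The Danish letters are all part of ISO-8859-1.
print(fits_latin1("æøåÆØÅ"))  # True

# The euro sign is not in ISO-8859-1, so such a value could not be stored.
print(fits_latin1("€"))       # False
```

Any value for which this returns False would have to be changed or dropped 
before it can go into a LATIN1 database.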


