Thread: Charset Win1250 on Windows and Ubuntu
Hi!
I have software that uses PostgreSQL. This program (and website) was developed and runs on Windows (XP/2003), with the native charset (win1250).
Last week we got a special request to install this software on a Linux server.
Yesterday I installed Ubuntu 9.10 in VirtualBox and tried to move the database to Linux.
The first big problem is that when I tried to create a database with the same parameters as on Windows, pgAdmin showed an error.
The error message is:
"Error: new encoding (Win1250) is incompatible with the encoding of the template database (UTF8)."
OK, I changed to "template0".
Then I got an error saying that Win1250 is not compatible with the collation hu_HU.UTF8.
When I inserted Hungarian characters (to check the sort order), the C and POSIX collations returned the wrong order, as I had expected.
The Windows version of PG and pgAdmin does not support collation, so these two options (collation, character type) are disabled.
But on Linux only the UTF8 version can sort rows in the correct order.
The problem is that the client program is win1250 based, and I would have to rewrite everything to get the same results.
Does anybody have a way, some tricky solution, for this problem?
Thanks for your help:
dd
On Friday 18 December 2009 4:30:46 am Durumdara wrote:
> Hi!
>
> I have a software that uses Postgresql. This program (and website)
> developed and working on Window (XP/2003), with native charset (win1250).
>
> Prior week we got a special request to install this software to a Linux
> server.
>
> Yesterday I installed Ubu9.10 on VirtualBox, and tried to moving the
> database under Linux.
>
> First big problem is that when I tried to create a database with same
> parameters as in Windows, the PGAdmin show an error.
> The errormessage is:
> "Error: new encoding (Win1250) is incompatible with the encoding of the
> template database (UTF8)."
>
> Ok, I changed to "template0".
>
> Then I got error that Win1250 is not good for collation hu_HU.UTF8.
>
> When I tried to insert hungarian chars (to check sort order), the C and
> POSIX return wrong result - as I thought before.
>
> The Windows version of PG and Admin is not supports collation, so these two
> options are disable (collation, character type).

There is a Linux version of PGAdmin available for Ubuntu 9.10.

> But in Linux I have only UTF version that can sort rows in good order.
>
> The problem that the client program is win1250 based, and I must rewrite
> all things to make same results.
>
> Have anybody some way, some tricky solution for this problem?

Use psql and CREATE DATABASE:
http://www.postgresql.org/docs/8.4/interactive/sql-createdatabase.html

> Thanks for your help:
> dd

--
Adrian Klaver
aklaver@comcast.net
On Sat, Dec 19, 2009 at 8:54 PM, Adrian Klaver <aklaver@comcast.net> wrote:
>> The Windows version of PG and Admin is not supports collation, so these two
>> options are disable (collation, character type).
>
> There is a Linux version of PGAdmin available for Ubuntu 9.10.

Doesn't matter - pgAdmin supports collation and ctype on all platforms
when creating databases. If the options are disabled, it's because the
OP is running a server older than 8.4.

--
Dave Page
EnterpriseDB UK: http://www.enterprisedb.com
On Saturday 19 December 2009 1:04:30 pm Dave Page wrote:
> On Sat, Dec 19, 2009 at 8:54 PM, Adrian Klaver <aklaver@comcast.net> wrote:
> >> The Windows version of PG and Admin is not supports collation, so these
> >> two options are disable (collation, character type).
> >
> > There is a Linux version of PGAdmin available for Ubuntu 9.10.
>
> Doesn't matter - pgAdmin supports collation and ctype on all platforms
> when creating databases. If the options are disabled, it's because the
> OP is running a server older than 8.4.

That is what I get for assuming. I figured that since the OP was using Ubuntu 9.10, they were using the default version of Postgres, 8.4.

--
Adrian Klaver
aklaver@comcast.net
Durumdara wrote:
> I have a software that uses Postgresql. This program (and website) developed and working on Window (XP/2003),
> with native charset (win1250).
>
> Prior week we got a special request to install this software to a Linux server.
>
> Yesterday I installed Ubu9.10 on VirtualBox, and tried to moving the database under Linux.
>
> First big problem is that when I tried to create a database with same parameters as in Windows, the PGAdmin
> show an error.
> The errormessage is:
> "Error: new encoding (Win1250) is incompatible with the encoding of the template database (UTF8)."
>
> Ok, I changed to "template0".
>
> Then I got error that Win1250 is not good for collation hu_HU.UTF8.
>
> When I tried to insert hungarian chars (to check sort order), the C and POSIX return wrong result - as I thought before.
>
> The Windows version of PG and Admin is not supports collation, so these two options are disable (collation,
> character type).
>
> But in Linux I have only UTF version that can sort rows in good order.
>
> The problem that the client program is win1250 based, and I must rewrite all things to make same results.
>
> Have anybody some way, some tricky solution for this problem?

If the collation hu_HU.UTF8 is what you want (can sort rows in good order), you should use UTF8 as database encoding.

If you need the data in WIN1250 on the client side, change the client encoding to WIN1250.

So:
- Create the database with UTF8.
- Change the client encoding to WIN1250 (e.g. by setting the environment variable PGCLIENTENCODING).
- Import the dump of the Windows database. It will be converted to UTF-8.
- Make sure that the client program has client encoding WIN1250.

Yours,
Laurenz Albe
Hi!
So if I have Python and PyGreSQL, can I set this value in Python?
The main problem is that I don't want to set this value globally; other applications may want to use a different encoding...
Thanks for your help:
dd
2009/12/19 Albe Laurenz <laurenz.albe@wien.gv.at>
If you need the data in WIN1250 on the client side, change the client encoding to WIN1250.
So:
- Create the database with UTF8.
- Change the client encoding to WIN1250 (e.g. by setting the environment variable PGCLIENTENCODING).
- Import the dump of the Windows database. It will be converted to UTF-8.
- Make sure that the client program has client encoding WIN1250.
Yours,
Laurenz Albe
On Mon, Dec 21, 2009 at 10:26:51AM +0100, Durumdara wrote:
> So if I have Python and pygresql, can I set this value in Python?
> The main problem that I don't want to set this value globally - possible
> another applications want to use another encoding...

Each connection can set the encoding to whatever it likes. Something I find useful is to set up the DB as UTF-8 but then do:

ALTER DATABASE foo SET client_encoding = latin9;

which sets the default for the DB, or

ALTER USER bar SET client_encoding = latin9;

which lets you set the defaults for each user. This means that old scripts can work unchanged but newer scripts can choose UTF-8 if they want it.

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
> Please line up in a tree and maintain the heap invariant while
> boarding. Thank you for flying nlogn airlines.
On 21 Dec 2009, at 10:26, Durumdara wrote:
> So if I have Python and pygresql, can I set this value in Python?
> The main problem that I don't want to set this value globally - possible another applications want to use another encoding....

Sure you can, just execute SET client_encoding TO 'WIN1250' once you've set up your connection. You can even do that between queries if your client encoding requirements change between queries.

Alban Hertroys

--
If you can't see the forest for the trees,
cut the trees and you'll see there is no forest.
Durumdara wrote:
>> - Change the client encoding to WIN1250 (e.g. by
>> setting the environment variable PGCLIENTENCODING).
>
> So if I have Python and pygresql, can I set this value in Python?
> The main problem that I don't want to set this value globally
> - possible another applications want to use another encoding...

There may be special Python functions, but you can use the following SQL statement:

SET client_encoding TO 'WIN1250'

Yours,
Laurenz Albe
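
[Editor's sketch, not part of the original mail] The per-connection approach suggested in the replies could look roughly like this in Python. It assumes a DB-API 2.0 style connection such as the one returned by PyGreSQL's pgdb module; the helper name use_win1250 and the database name in the usage comment are made up for illustration. The commit is there because DB-API drivers usually open a transaction implicitly, and a SET issued in a transaction that is later rolled back is undone.

```python
def use_win1250(conn):
    """Switch client_encoding to WIN1250 for this connection only.

    `conn` is assumed to be a DB-API 2.0 connection (for example from
    PyGreSQL's pgdb module). Other connections and applications keep
    whatever encoding they use.
    """
    cur = conn.cursor()
    try:
        # The statement quoted in the thread; it affects this session only.
        cur.execute("SET client_encoding TO 'WIN1250'")
    finally:
        cur.close()
    # Commit so that a later rollback does not revert the setting.
    conn.commit()
    return conn


# Hypothetical usage (needs a running server, so it is left commented out):
# import pgdb
# conn = use_win1250(pgdb.connect(database="mydb"))
```

Calling the helper right after connecting keeps the setting local to the win1250 based program, which is what the question asked for.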
Hi!

2009/12/21 Albe Laurenz <laurenz.albe@wien.gv.at>
> Durumdara wrote:
> >> - Change the client encoding to WIN1250 (e.g. by
> >> setting the environment variable PGCLIENTENCODING).
> >
> > So if I have Python and pygresql, can I set this value in Python?
> > The main problem that I don't want to set this value globally
> > - possible another applications want to use another encoding...
>
> There may be special Python functions, but you can use the following
> SQL statement: SET client_encoding TO 'WIN1250'

And what happening what DB recognize not win1250 character in SQL?
Is it converted to "?" or an exception dropped?
And if the UTF db contains non win1250 character?
Is it replaced in result with "?" or some exception dropped?

Thanks:
dd
Durumdara wrote:
[client_encoding is switched to WIN1250]
> And what happening what DB recognize not win1250 character in SQL?
> Is it converted to "?" or an exception dropped?
> And if the UTF db contains non win1250 character?
> Is it replaced in result with "?" or some exception dropped?

What you wrote is very confusing/confused; this is probably a language problem.
I'll try to reformulate your questions and answer them; if I got something wrong, please tell me.

Q: What happens if your SQL statement contains a character that is not WIN1250 encoded?
   Is it converted to "?" or do you get an error?
A: You get an error (this is not Oracle). Here is an example for hex 88:
   ERROR: character 0x88 of encoding "WIN1250" has no equivalent in "UTF8"
   Since every known character is representable in UTF-8, that means that this is an invalid byte.

Q: What happens if you select a character in the UTF8 database that cannot be converted to WIN1250?
A: You will also get an error. Here is what you get for selecting a "G clef":
   ERROR: character 0xf09d849e of encoding "UTF8" has no equivalent in "WIN1250"

Yours,
Laurenz Albe
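
[Editor's sketch, not part of the original mail] The two error cases can be reproduced offline with Python's built-in cp1250 codec, which is assumed here to be a close stand-in for WIN1250. It is not PostgreSQL's own conversion code, and the function names are made up for illustration. A check like this could help find problem data before the server rejects it.

```python
def is_valid_win1250(data: bytes) -> bool:
    """Return True if every byte has a character assigned in cp1250."""
    try:
        data.decode("cp1250")
        return True
    except UnicodeDecodeError:
        # For example byte 0x88, the case from the first error message.
        return False


def not_in_win1250(text: str) -> list:
    """Return the characters of `text` that cp1250 cannot represent."""
    bad = []
    for ch in text:
        try:
            ch.encode("cp1250")
        except UnicodeEncodeError:
            # For example the G clef (U+1D11E) from the second error message.
            bad.append(ch)
    return bad


if __name__ == "__main__":
    print(is_valid_win1250(b"\x88"))                 # unassigned byte
    print(not_in_win1250("árvíztűrő tükörfúrógép"))  # Hungarian test phrase
    print(not_in_win1250("G clef: \U0001D11E"))      # outside WIN1250
```

The first function corresponds to the client-to-server direction (invalid WIN1250 bytes in an SQL statement), the second to the server-to-client direction (UTF8 data with no WIN1250 equivalent).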