Re: encoding - Mailing list pgsql-odbc
From | Joel Fradkin |
---|---|
Subject | Re: encoding |
Date | |
Msg-id | 001c01c55732$9e56d800$797ba8c0@jfradkin |
In response to | Re: encoding (Marko Ristola <marko.ristola@kolumbus.fi>) |
Responses | Re: encoding |
List | pgsql-odbc |
I originally tried a Unicode database, but the .NET application I wrote to move the data from MSSQL to Postgres blew up on the French characters.

I am live on Postgres now. Is there a simple way to move from SQL_ASCII to Unicode? That assumes the new 8.0 ODBC driver will present the data correctly if the database is Unicode.

It was the older ODBC driver that gave me the error writing to the Unicode database. I guess the lib connection is OK, since I could cut and paste French chars into the Unicode database, but when I used the program (7.4 ODBC driver) it gave me an error trying to update the database. That is why I switched to SQL_ASCII at the time.

I do plan on implementing a second Postgres server for reporting. I am hoping I can figure out how to use Slony to replicate the first server onto the second (I can start with a restore and just need to keep the data synced up). I am a bit worried about the replication slowing things down even more.

Joel Fradkin

Wazagua, Inc.
2520 Trailmate Dr
Sarasota, Florida 34243
Tel. 941-753-7111 ext 305

jfradkin@wazagua.com
www.wazagua.com
Powered by Wazagua
Providing you with the latest Web-based technology & advanced tools.
C 2004. WAZAGUA, Inc. All rights reserved. WAZAGUA, Inc
This email message is for the use of the intended recipient(s) and may contain confidential and privileged information. Any unauthorized review, use, disclosure or distribution is prohibited. If you are not the intended recipient, please contact the sender by reply email and delete and destroy all copies of the original message, including attachments.

-----Original Message-----
From: Marko Ristola [mailto:marko.ristola@kolumbus.fi]
Sent: Thursday, May 12, 2005 3:28 PM
To: Joel Fradkin
Cc: pgsql-odbc@postgresql.org
Subject: Re: [ODBC] encoding

So the new ODBC driver does a 7-bit ASCII -> Unicode conversion. The Windows Unicode conversion functions seem to turn 8-bit non-ASCII characters into question marks.
That is correct behaviour for charset conversion functions.

I strongly recommend that new database installs use something other than SQL_ASCII, because you also use non-US characters. Latin1 (ISO-8859-1) or similar works fine. UTF-8 is a very good alternative, because everybody is moving to it in the long term. You get more portability with UTF-8, but it is a bit slower than Latin1.

The new driver has improved Unicode support. That is why the 7-bit ASCII -> Unicode conversion is done in the new, failing driver.

About ten years ago Unicode was not used so much, and all programs worked well with 7-bit ASCII settings. Nowadays you need to tell applications which charset you are using. Otherwise you might find a program that does charset conversions, and the characters will turn into question marks, like they did. So the first step is to tell the database which charset you are using :)

So, you still have performance issues to solve. In my opinion, different databases might need somewhat different optimization: if you optimize for MSSQL, it might be slow with PostgreSQL, and perhaps vice versa. This rule applies to many databases, although I don't have experience with MSSQL in this regard.

If both databases use a similar query plan, they might be of similar speed (algorithmically similar). I don't know how the number of CPUs affects this with these databases: algorithmically the work to be done is the same on similar plans, but there are two workers. A query might run up to twice as fast as on one CPU (if the query in question can be parallelized nicely at the software and hardware levels).

It is sometimes a good idea to use more than one database server if the performance is not good enough otherwise, for example using different databases for different tasks to balance the load.
This week there was an interesting thread on the PostgreSQL performance list about a 100-computer WWW server system with many databases and caches to avoid unnecessary database usage.

Good luck to you.

Marko Ristola

Joel Fradkin wrote:

>The database is SQL_ASCII.
>I guess the locale is whatever it defaults to when you install from rpm on
>Red Hat AS4, not sure?
>lc_messages = 'en_US.UTF-8' # locale for system error message strings
>lc_monetary = 'en_US.UTF-8' # locale for monetary formatting
>lc_numeric = 'en_US.UTF-8' # locale for number formatting
>lc_time = 'en_US.UTF-8' # locale for time formatting
>
>The client is a win2k box.
>
>I can see the chars look OK when I view them using pgAdmin.
>.NET was displaying them OK.
>The old ODBC driver was displaying them OK.
>
>Just the new ODBC driver is doing something to them to make them appear as
>question marks.
>
>In any event I switched to the old driver and the site is OK.
>I am very busy with after-conversion repairs, but maybe later I can take a
>closer look at whether there is a better way (I am brain dead at the moment:
>75 hours last week, and this week is looking the same).
>
>Unfortunately I am still having severe issues with speed and may need to use
>my 2 proc SQL server for some reporting.
>
>Joel Fradkin
>
>-----Original Message-----
>From: Marko Ristola [mailto:marko.ristola@kolumbus.fi]
>Sent: Wednesday, May 11, 2005 1:17 PM
>To: Joel Fradkin
>Cc: pgsql-odbc@postgresql.org
>Subject: Re: [ODBC] encoding
>
>Hi
>
>The database's charset must be something other than plain ASCII.
>(The same thing needs to be set in Windows.)
>
>The client charset is defined by environment variables.
>The PostgreSQL server charset is defined at least at database creation.
>
>When the charsets are defined correctly, PostgreSQL knows
>the charsets and can do client charset conversions.
>
>The newest Windows ODBC driver requires correct locale settings.
>Maybe the older PostgreSQL server + ODBC driver don't do any
>conversions, so they just work in the case where
>there is no need for charset conversions.
>
>What is your PostgreSQL server's database locale setting?
>Please see the documentation for "create database"
>and the INITDB command-line tool for charset selection.
>
>I hope this helps. I'm interested in charset alterations in ODBC, but
>I don't know the psqlodbc charset alteration history, or the last version's
>functionality, well enough to give robust answers.
>
>Marko Ristola
>
>Joel Fradkin wrote:
>
>>I just wanted to document a recent issue; it may be that I am not aware of
>>the proper way to use encoding with the 8.0 versions of ODBC.
>>
>>With 7.4 I was getting char codes correctly from the ODBC driver.
>>
>>With version 8 (just downloaded) I had an issue on my Windows 2000 servers
>>displaying question marks instead of the French chars.
>>
>>I was testing on win2003 with 7.4, so I switched the win2k machines and
>>they display correctly (I am using ASP).
>>
>>Joel Fradkin
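
For readers following the thread, the question-mark behaviour described above can be reproduced outside the driver. The Python sketch below is only an illustration under stated assumptions: it is not the psqlODBC code path, the function names are hypothetical, and it assumes the bytes stored in the SQL_ASCII database were really Latin1 (the thread does not say which 8-bit charset the client used).

```python
# Hypothetical illustration; not taken from psqlODBC or from the thread.

def ascii_to_unicode_lossy(raw: bytes) -> str:
    """Treat raw as 7-bit ASCII: any byte >= 0x80 has no ASCII meaning,
    so it is replaced with '?', which is what a strict converter may do."""
    return "".join(chr(b) if b < 0x80 else "?" for b in raw)


def transcode_to_utf8(raw: bytes, source_encoding: str = "latin-1") -> bytes:
    """Re-encode bytes into UTF-8 once their real charset is known.
    source_encoding is an assumption the caller must get right."""
    return raw.decode(source_encoding).encode("utf-8")


# Bytes as a SQL_ASCII database would hold them if the client sent Latin1.
stored = "Québec français".encode("latin-1")

print(ascii_to_unicode_lossy(stored))             # accents are lost
print(transcode_to_utf8(stored).decode("utf-8"))  # accents survive
```

The same idea would apply to a migration: a dump of the SQL_ASCII database could be transcoded from its real charset to UTF-8 and restored into a database created with a Unicode encoding. That is one possible approach, stated here as an assumption rather than something confirmed in this thread.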