Re: [Q] UTF-8 testing with Windows/ODBC 8.3.0400 - Mailing list pgsql-odbc
From | V S P |
---|---|
Subject | Re: [Q] UTF-8 testing with Windows/ODBC 8.3.0400 |
Date | |
Msg-id | 1237363789.13407.1306005793@webmail.messagingengine.com |
In response to | Re: [Q] UTF-8 testing with Windows/ODBC 8.3.0400 (Craig Ringer <craig@postnewspapers.com.au>) |
Responses | Re: [Q] UTF-8 testing with Windows/ODBC 8.3.0400 |
List | pgsql-odbc |
Hi, thank you for the follow-up.

I just had a breakthrough, and I believe I was able to resolve most of the problems.

I finally found a post on the net that says: if you want UTF8 in your ODBC-based client program and Postgres is in UTF8, then use the ASCII (ANSI) driver, not the Unicode one.

As soon as I switched, things worked. (I also kept

    set client_encoding='UTF'

on my ODBC connections at startup.)

To convert the data I read and display it in the debugger, I used:

    wchar_t wData[1000];
    ::MultiByteToWideChar(CP_UTF8, 0, my_normal_std_string.c_str(), -1, wData, 1000);

Before the switch to the ASCII ODBC driver, the above was just showing question marks. So reading UTF8 through ODBC with the ASCII driver works perfectly.

Since I now understood what was going on, I converted most of my strings to wstrings and then enabled the Unicode version of the PG ODBC driver, and that works too! :-)

So now I have a #define where I switch between wstrings and strings (and, of course, a few other things), then I flip the drivers in the ODBC data source and things work (I have tested selects so far).

Three things that I am still not sure about, and maybe you can help:

a) Does the Postgres driver on unixODBC do the same as on Windows, that is, are there Unicode and non-Unicode versions of the driver? (I am interested in the 64-bit Linux and 64-bit FreeBSD ones.)

b) I noticed that between the Unicode version (first trace below) and the ASCII version (second trace below), the value of the SWORD right before the SQLULEN is different (it is 12 with the ASCII driver and -9 with the Unicode driver). What does this mean?
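A note on question b), offered as a hedged sketch and not as an answer from the driver authors: in the SQLDescribeColW trace that follows, the SWORD just before the SQLULEN is the column's SQL data type. In the standard ODBC headers, 12 is SQL_VARCHAR and -9 is SQL_WVARCHAR, so the two drivers appear to describe the same column as a narrow and a wide character type respectively. The constants below are copied in by hand so the snippet builds without ODBC installed; `sqlTypeName` and `isWideCharType` are hypothetical helper names, not part of any ODBC API.

```cpp
#include <string>

// ODBC SQL type codes, copied here as plain constants so the sketch builds
// without the ODBC headers. In a real program they come from <sql.h>,
// <sqlext.h> and <sqlucode.h> as SQL_CHAR, SQL_VARCHAR, SQL_WVARCHAR, etc.
constexpr short kSqlChar         = 1;
constexpr short kSqlVarchar      = 12;
constexpr short kSqlLongVarchar  = -1;
constexpr short kSqlWChar        = -8;
constexpr short kSqlWVarchar     = -9;
constexpr short kSqlWLongVarchar = -10;

// Hypothetical helper: give a readable name to the data type code that
// SQLDescribeCol writes through its DataTypePtr argument.
std::string sqlTypeName(short code) {
    switch (code) {
        case kSqlChar:         return "SQL_CHAR";
        case kSqlVarchar:      return "SQL_VARCHAR";
        case kSqlLongVarchar:  return "SQL_LONGVARCHAR";
        case kSqlWChar:        return "SQL_WCHAR";
        case kSqlWVarchar:     return "SQL_WVARCHAR";
        case kSqlWLongVarchar: return "SQL_WLONGVARCHAR";
        default:               return "other (" + std::to_string(code) + ")";
    }
}

// Hypothetical helper: true when the reported type is one of the wide
// character types, i.e. the data would normally be fetched as SQL_C_WCHAR.
bool isWideCharType(short code) {
    return code == kSqlWChar || code == kSqlWVarchar || code == kSqlWLongVarchar;
}
```

With the two values from the traces, `sqlTypeName(-9)` gives "SQL_WVARCHAR" and `sqlTypeName(12)` gives "SQL_VARCHAR". A client that switches between strings and wstrings with a #define could use a check like `isWideCharType` to confirm that the driver selected in the data source matches the build.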
    disp_otrq_x86d 8a4-b90 EXIT SQLDescribeColW with return code 0 (SQL_SUCCESS)
        HSTMT      013F1BA8
        UWORD      11
        WCHAR *    0x01A28974 [ 9] "cntr_data"
        SWORD      512
        SWORD *    0x01A28BC4 (9)
        SWORD *    0x01A28BB8 (-9)
        SQLULEN *  0x01A28B94 (4096)
        SWORD *    0x01A28BA0 (0)
        SWORD *    0x01A28B88 (1)

    disp_otrq_x86d ab8-498 EXIT SQLDescribeColW with return code 0 (SQL_SUCCESS)
        HSTMT      013F1C38
        UWORD      11
        WCHAR *    0x01A28974 [ 9] "cntr_data"
        SWORD      512
        SWORD *    0x01A28BC4 (9)
        SWORD *    0x01A28BB8 (12)
        SQLULEN *  0x01A28B94 (4096)
        SWORD *    0x01A28BA0 (0)
        SWORD *    0x01A28B88 (1)

Another question: I have about 6 tables with about 20 fields each, and in each table 2 of the fields are 65K long (they are declared as varchar(65000)). Is this OK for the ODBC drivers, and what, if anything, should I be setting on them?

Thank you again for your follow-up,
Vlad

On Wed, 18 Mar 2009 15:46 +0900, "Craig Ringer" <craig@postnewspapers.com.au> wrote:
> V S P wrote:
>
> > My C++ program relies on OTL C++ library to do DB access, and in the
> > Visual Studio debugger I see only question marks '?' for the strings.
>
> How would Visual Studio know that the std::string instances in question
> contain UTF-8 data? std::string is a byte string, not a character string
> - it could contain text in any encoding (or non-text data) and VC++ has
> no way of knowing how to interpret it.
>
> What it probably does is display anything within the ASCII range, and
> otherwise display ?s .
>
> If you expect to be able to work with those strings as real text, you
> probably want to use std::wstring instead, and USE APPROPRIATE ENCODING
> CONVERSION ROUTINES. Note that the width of wchar_t varies from platform
> to platform, so you'll need to convert to/from UTF-16 for a 2 byte
> wchar_t, or to/from UTF-32 for a 4-byte wchar_t.
>
> (I hate working with unicode and encodings in standard C++ *SO* much -
> argh! One of the only areas where I really wish I was using Java. If
> only the QString class from Qt was part of standard C++ ... ).
>
> > I am using std::string to store the bytestream from varchar column and I
> > think it is ok because I do not need to 'manipulate' the content.
>
> True - but VC++ won't be able to understand what's in it, either.
>
> > I cannot figure out what else I might be doing wrong.... as I said, all
> > I need for now it is just to test out that a C++ program via ODBC can
> > get the data.
>
> Your description really isn't adequate to say. It's highly likely that
> you're retrieving the data from the database fine, but your tools don't
> know it's UTF-8 and aren't able to work with it correctly. That's mostly
> a guess with the amount of information you've provided, though.
>
> Perhaps you could post a small, self-contained test program and a SQL
> script to populate a test database? Then post the results of running the
> program against the database, including the hex values of the bytes
> returned by the ODBC interface.
>
> --
> Craig Ringer

--
V S P
toreason@fastmail.fm

--
http://www.fastmail.fm - A no graphics, no pop-ups email service
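The conversion advice in the quoted reply (UTF-16 for a 2-byte wchar_t, UTF-32 for a 4-byte wchar_t) can be sketched in portable C++ roughly as follows. This is a minimal illustration under stated assumptions, not a hardened validator: `decodeUtf8` and `utf8ToWide` are hypothetical names, overlong encodings and encoded surrogates are not rejected, and wchar_t is assumed to hold UTF-16 when it is 2 bytes wide and UTF-32 when it is 4 bytes wide.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Decode the UTF-8 sequence that starts at s[i] and advance i past it.
// Throws std::runtime_error on malformed or truncated input.
char32_t decodeUtf8(const std::string& s, std::size_t& i) {
    const unsigned char lead = static_cast<unsigned char>(s[i++]);
    if (lead < 0x80) return lead;                      // plain ASCII byte

    int extra = 0;      // number of continuation bytes still expected
    char32_t cp = 0;    // code point being assembled
    if ((lead & 0xE0) == 0xC0)      { extra = 1; cp = lead & 0x1F; }
    else if ((lead & 0xF0) == 0xE0) { extra = 2; cp = lead & 0x0F; }
    else if ((lead & 0xF8) == 0xF0) { extra = 3; cp = lead & 0x07; }
    else throw std::runtime_error("invalid UTF-8 lead byte");

    while (extra-- > 0) {
        if (i >= s.size()) throw std::runtime_error("truncated UTF-8 sequence");
        const unsigned char c = static_cast<unsigned char>(s[i++]);
        if ((c & 0xC0) != 0x80)
            throw std::runtime_error("invalid UTF-8 continuation byte");
        cp = (cp << 6) | (c & 0x3F);
    }
    if (cp > 0x10FFFF) throw std::runtime_error("code point out of range");
    return cp;
}

// Convert a UTF-8 byte string to a wide string, choosing UTF-16 or UTF-32
// according to the platform's wchar_t width.
std::wstring utf8ToWide(const std::string& utf8) {
    std::wstring out;
    for (std::size_t i = 0; i < utf8.size();) {
        const char32_t cp = decodeUtf8(utf8, i);
        if (sizeof(wchar_t) >= 4 || cp < 0x10000) {
            out.push_back(static_cast<wchar_t>(cp));
        } else {
            // 2-byte wchar_t: split into a UTF-16 surrogate pair.
            const char32_t v = cp - 0x10000;
            out.push_back(static_cast<wchar_t>(0xD800 + (v >> 10)));
            out.push_back(static_cast<wchar_t>(0xDC00 + (v & 0x3FF)));
        }
    }
    return out;
}
```

On Windows, the `MultiByteToWideChar(CP_UTF8, ...)` call shown in the message does the same job; a hand-written routine like this is mainly useful where the same client code also has to build on Linux or FreeBSD, where wchar_t is typically 4 bytes.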