Re: [Q] UTF-8 testing with Windows/ODBC 8.3.0400 - Mailing list pgsql-odbc

From V S P
Subject Re: [Q] UTF-8 testing with Windows/ODBC 8.3.0400
Msg-id 1237363789.13407.1306005793@webmail.messagingengine.com
In response to Re: [Q] UTF-8 testing with Windows/ODBC 8.3.0400  (Craig Ringer <craig@postnewspapers.com.au>)
Responses Re: [Q] UTF-8 testing with Windows/ODBC 8.3.0400
List pgsql-odbc
Hi, thank you for the follow-up.
Just had a breakthrough...

I believe I was able to resolve most of the problems.

I finally found a post on the net that says:
if you want UTF8 in your ODBC-based client program and Postgres is in
UTF8,
then use the ASCII (ANSI) driver, not the Unicode one.

As soon as I switched, things worked
(I also kept: set client_encoding='UTF8' on my ODBC connections at
startup).

I used:

wchar_t wData[1000];
::MultiByteToWideChar(CP_UTF8, 0, my_normal_std_string.c_str(), -1,
wData, 1000);

to convert the data I read and display it in the debugger.

Before the switch to the ASCII ODBC driver, the above was just showing
question marks.
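
One way to tell whether the question marks are real '?' bytes (0x3F) coming
from the driver, or just the debugger not knowing the encoding, is to dump
the raw bytes of the std::string. This is only a minimal sketch; to_hex is a
made-up helper name, not part of OTL or ODBC.

```cpp
#include <string>

// Hypothetical helper: render every byte of a string as two hex digits,
// separated by spaces, so the raw bytes can be inspected.
std::string to_hex(const std::string& bytes) {
    static const char digits[] = "0123456789abcdef";
    std::string out;
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        unsigned char c = static_cast<unsigned char>(bytes[i]);
        if (i != 0) out += ' ';
        out += digits[c >> 4];      // high nibble
        out += digits[c & 0x0F];    // low nibble
    }
    return out;
}
```

A literal question mark dumps as 3f, while a UTF-8 encoded non-ASCII
character dumps as two or more bytes that are all 0x80 or above.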


So reading UTF8 into an ODBC program using the ASCII driver works
perfectly.
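
MultiByteToWideChar is Windows-only, so for the Linux/FreeBSD side a
portable conversion would be needed. Below is a rough sketch of one,
assuming the input is UTF-8; utf8_to_wstring is a made-up name, and the
function does not reject overlong encodings or encoded surrogates, so it is
not a full validator. It emits UTF-16 surrogate pairs when wchar_t is 2
bytes and UTF-32 code units otherwise.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper: decode UTF-8 bytes into a std::wstring.
// Simplified sketch: overlong forms and surrogate code points are not rejected.
std::wstring utf8_to_wstring(const std::string& in) {
    std::wstring out;
    std::size_t i = 0;
    while (i < in.size()) {
        unsigned char lead = static_cast<unsigned char>(in[i]);
        unsigned long cp;      // decoded code point
        std::size_t extra;     // number of continuation bytes expected
        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else throw std::runtime_error("invalid UTF-8 lead byte");

        if (extra > in.size() - i - 1)
            throw std::runtime_error("truncated UTF-8 sequence");

        for (std::size_t k = 1; k <= extra; ++k) {
            unsigned char c = static_cast<unsigned char>(in[i + k]);
            if ((c & 0xC0) != 0x80)
                throw std::runtime_error("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (c & 0x3F);
        }
        i += extra + 1;

        if (sizeof(wchar_t) == 2 && cp > 0xFFFF) {
            // 2-byte wchar_t: encode as a UTF-16 surrogate pair.
            cp -= 0x10000;
            out += static_cast<wchar_t>(0xD800 + (cp >> 10));
            out += static_cast<wchar_t>(0xDC00 + (cp & 0x3FF));
        } else {
            out += static_cast<wchar_t>(cp);
        }
    }
    return out;
}
```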

Since I now understood what was going on, I converted most of my strings
to wstrings, and then enabled the Unicode version of the PG ODBC driver --
and that works too!

:-) So now I have a #define where I switch between wstrings and strings
and of course a few other things, and then I flip the drivers in the ODBC
data source
and things work (I have tested selects so far).
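
The switch looks roughly like the sketch below. The macro and type names
here (APP_USE_UNICODE_DRIVER, db_string, db_char, DB_TEXT) are made up for
illustration and are not from OTL or the ODBC headers; the real code has a
few more things behind the same #define.

```cpp
#include <string>

// Hypothetical macro: define it when the Unicode data source is in use.
#ifdef APP_USE_UNICODE_DRIVER
typedef std::wstring db_string;   // wide strings for the Unicode driver
typedef wchar_t      db_char;
#define DB_TEXT(s) L##s           // turn a literal into a wide literal
#else
typedef std::string  db_string;   // byte strings (UTF-8) for the ASCII driver
typedef char         db_char;
#define DB_TEXT(s) s
#endif

// Example use: the same source line compiles in either mode.
inline db_string column_name() {
    return DB_TEXT("cntr_data");
}
```

With this, the rest of the program uses db_string and DB_TEXT("...")
everywhere, and only the build flag and the data source change.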


Three things that I am still not sure about, and maybe you can help:

a) Does the Postgres driver on unixODBC do the same as on Windows (that is,
are there Unicode and non-Unicode versions of the driver)?
(I am interested in the 64-bit Linux and 64-bit FreeBSD ones.)

b) I noticed that between the Unicode version (first trace) and the
ASCII version (second trace), the value of the SWORD right before the
SQLULEN is different
(it is 12 on the ASCII and -9 on the Unicode version) -- what does this
mean?

disp_otrq_x86d  8a4-b90 EXIT  SQLDescribeColW  with return code 0 (SQL_SUCCESS)
        HSTMT               013F1BA8
        UWORD                       11
        WCHAR *             0x01A28974 [       9] "cntr_data"
        SWORD                      512
        SWORD *             0x01A28BC4 (9)
        SWORD *             0x01A28BB8 (-9)
        SQLULEN *           0x01A28B94 (4096)
        SWORD *             0x01A28BA0 (0)
        SWORD *             0x01A28B88 (1)


disp_otrq_x86d  ab8-498 EXIT  SQLDescribeColW  with return code 0 (SQL_SUCCESS)
        HSTMT               013F1C38
        UWORD                       11
        WCHAR *             0x01A28974 [       9] "cntr_data"
        SWORD                      512
        SWORD *             0x01A28BC4 (9)
        SWORD *             0x01A28BB8 (12)
        SQLULEN *           0x01A28B94 (4096)
        SWORD *             0x01A28BA0 (0)
        SWORD *             0x01A28B88 (1)
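
For reference, going by the SQLDescribeCol argument order, the SWORD * slot
just before the SQLULEN * is the data type output, and in the standard ODBC
headers 12 is SQL_VARCHAR while -9 is SQL_WVARCHAR. So the two values most
likely just mean "narrow varchar" versus "wide varchar". A small sketch of
that mapping follows; the constants are re-declared under made-up names so
it builds without the ODBC headers, and sql_type_name / is_wide_char_type
are hypothetical helpers.

```cpp
#include <string>

// Values mirror the standard ODBC headers; the k... names are made up here
// so that the sketch compiles without including sql.h / sqlucode.h.
const short kSqlChar         = 1;    // SQL_CHAR
const short kSqlVarchar      = 12;   // SQL_VARCHAR
const short kSqlLongVarchar  = -1;   // SQL_LONGVARCHAR
const short kSqlWChar        = -8;   // SQL_WCHAR
const short kSqlWVarchar     = -9;   // SQL_WVARCHAR
const short kSqlWLongVarchar = -10;  // SQL_WLONGVARCHAR

// Hypothetical helper: name the character types seen in the trace.
const char* sql_type_name(short data_type) {
    switch (data_type) {
        case kSqlChar:         return "SQL_CHAR";
        case kSqlVarchar:      return "SQL_VARCHAR";
        case kSqlLongVarchar:  return "SQL_LONGVARCHAR";
        case kSqlWChar:        return "SQL_WCHAR";
        case kSqlWVarchar:     return "SQL_WVARCHAR";
        case kSqlWLongVarchar: return "SQL_WLONGVARCHAR";
        default:               return "other";
    }
}

// Hypothetical helper: true for the wide (Unicode) character types.
bool is_wide_char_type(short data_type) {
    return data_type == kSqlWChar ||
           data_type == kSqlWVarchar ||
           data_type == kSqlWLongVarchar;
}
```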




Another question: I have about 6 tables with about 20 fields in each
table;
2 of the fields are 65K long (they are declared as varchar(65000)). Is this OK
for the ODBC drivers, and what, if anything, should I be setting on them?



Thank you again for your follow-up,
Vlad





On Wed, 18 Mar 2009 15:46 +0900, "Craig Ringer"
<craig@postnewspapers.com.au> wrote:
> V S P wrote:
>
> > My C++ program relies on OTL C++ library to do DB access, and in the
> > Visual Studio debugger I see only question marks '?' for the strings.
>
> How would Visual Studio know that the std::string instances in question
> contain UTF-8 data? std::string is a byte string, not a character string
> - it could contain text in any encoding (or non-text data) and VC++ has
> no way of knowing how to interpret it.
>
> What it probably does is display anything within the ASCII range, and
> otherwise display ?s .
>
> If you expect to be able to work with those strings as real text, you
> probably want to use std::wstring instead, and USE APPROPRIATE ENCODING
> CONVERSION ROUTINES. Note that the width of wchar_t varies from platform
> to platform, so you'll need to convert to/from UTF-16 for a 2 byte
> wchar_t, or to/from UTF-32 for a 4-byte wchar_t.
>
> (I hate working with unicode and encodings in standard C++ *SO* much -
> argh! One of the only areas where I really wish I was using Java. If
> only the QString class from Qt was part of standard C++ ... ).
>
> > I am using std::string to store the bytestream from varchar column and I
> > think it is ok
> > because I do not need to 'manipulate' the content.
>
> True - but VC++ won't be able to understand what's in it, either.
>
> > I cannot figure out what else I might be doing wrong.... as I said, all
> > I need for now it is just to test out that a C++ program via ODBC can
> > get the data.
>
> Your description really isn't adequate to say. It's highly likely that
> you're retrieving the data from the database fine, but your tools don't
> know it's UTF-8 and aren't able to work with it correctly. That's mostly
> a guess with the amount of information you've provided, though.
>
> Perhaps you could post a small, self-contained test program and a SQL
> script to populate a test database? Then post the results of running the
> program against the database, including the hex values of the bytes
> returned by the ODBC interface.
>
> --
> Craig Ringer
--
  V S P
  toreason@fastmail.fm


