Re: Bug report: odbc driver does not convert wide chars to chars - Mailing list pgsql-odbc

From Jon Raiford
Subject Re: Bug report: odbc driver does not convert wide chars to chars
Date
Msg-id E8335E5D-9B6F-4EAA-A298-50DACE10DBBD@labware.com
In response to Re: Bug report: odbc driver does not convert wide chars to chars  (Jon Raiford <raiford@labware.com>)
List pgsql-odbc

First of all, I should say that I am not one of the pgsql-odbc driver developers so I do not speak for them. I only speak for myself.

 

I suggest you reference the documentation that supports your case, specifically with regard to using WCHAR types with the A / narrow API calls.  Clearly the value is, or has been converted to, 16-bit characters, and the first null is being used as the end-of-string marker.  It isn’t clear from your description whether the issue is in your code or understanding of the spec, in the driver manager, or in the ODBC driver.  Simply saying that the MySQL driver works for your scenario may not be enough evidence.

 

Alternatively, the driver code is available, and you may decide to find and fix the issue and submit a patch. Many of us have done that, but it is definitely not required.

 

Best of luck to you and maybe someone will be able to help you.

 

Jon

 

From: Farid z <farid@zidsoft.com>
Date: Monday, November 22, 2021 at 7:07 PM
To: Jon Raiford <raiford@labware.com>, "pgsql-odbc@postgresql.org" <pgsql-odbc@postgresql.org>
Subject: RE: Bug report: odbc driver does not convert wide chars to chars

 

Client ODBC apps are supposed to use the narrow or wide ODBC API depending on whether the client app is a narrow or wide app on Windows (A or W versions of the Windows/ODBC API).

 

Since my app is a narrow Windows app (it uses the narrow Windows and ODBC APIs), it uses the ANSI/narrow versions of the MySQL and MariaDB drivers, and of every other ODBC driver that ships separate ANSI/narrow and wide versions, so that the driver matches the narrow ODBC API calls the app makes.

 

The ANSI/wide distinction determines which version of the ODBC API is called (narrow or wide, i.e., SQLBindColA or SQLBindColW, etc.) and has nothing to do with what actual data types the app uses. Client ODBC apps can use any ODBC data type, including char and wchar_t data types, and ODBC drivers are required to convert from the bound parameter data type to the target dbms column data type as necessary.

 

When an app (whether it uses the narrow or the wide version of the ODBC API) binds a parameter/column data buffer as wchar_t and the database column data type is UTF-8, the driver has to convert the wchar_t buffer data to UTF-8 when inserting the data into the dbms. This conversion of data types as necessary has nothing to do with whether the ODBC driver is A or W.

 

So it is very confusing to make a narrow Windows client app connect through the W ODBC driver when there is also a narrow version of the driver. This would at a minimum require the ODBC driver manager to convert perfectly valid narrow client data to wide data to match the driver's wide API interface.

 

I tested the PostgreSQL Unicode version of the ODBC driver and it worked in this case. However, the driver performance is significantly reduced by over 50% (ODBC Driver Manager has to convert all the client narrow data buffers to wide buffers as it relays the client data to the ODBC driver).

 

So, this is a bug in the PostgreSQL ANSI ODBC driver. I guess it has a workaround, but one with severely degraded ODBC driver performance.

 

Not sure why the PostgreSQL narrow driver does not do what it is supposed to do rather than requiring apps to use the wide version of the driver to work with wchar_t data. The ODBC spec says client apps can use any valid ODBC data types and the ODBC driver is supposed to convert the data types as necessary.

 

With Windows now supporting UTF-8 narrow apps natively via the UTF-8 code page, it is neither helpful nor necessary to incur severely degraded PostgreSQL ODBC driver performance by forcing native Windows UTF-8/narrow client apps to use the wide version of the PostgreSQL ODBC driver just to work with wchar_t data from other DBMSs (such as the wide char data types of MS SQL Server, Oracle, etc.).

 

I would recommend fixing the PostgreSQL ODBC ANSI (narrow) driver to work with native narrow Windows client apps, so that the ODBC driver manager does not always have to convert all of the narrow client app's data to wide chars unnecessarily.

 

Farid

 

CompareData  Compare and synchronize sql dbms data visually

Strobe  Strobe light for your phone

 

From: Jon Raiford
Sent: Monday, November 22, 2021 3:31 PM
To: pgsql-odbc@postgresql.org
Cc: Farid z
Subject: Re: Bug report: odbc driver does not convert wide chars to chars

 

This does not sound like a bug. I’m sure that you will find that you are using a Unicode version of MySQL and MariaDB ODBC drivers.  Is there a reason you are not using the Unicode Postgres driver?  If you insist on using the ANSI driver, I would suggest not trying to use the Unicode “W” (wide char) functions or data types.  But really, I think your answer is to just use the Unicode driver.  I honestly don’t know why the ANSI driver is still being distributed, but I suppose there are still a few people out there who need it.

 

Jon

 

From: Farid z <farid@zidsoft.com>
Date: Monday, November 22, 2021 at 2:45 PM
To: Jon Raiford <raiford@labware.com>
Subject: RE: Bug report: odbc driver does not convert wide chars to chars

 

Right, I think that’s what’s happening.

 

App is UTF-8 app code-page on Windows.

App is telling the driver the data buffer is SQL_WCHAR, so the driver needs to convert from wide chars to the database column data type rather than treating the data buffer as UTF-8 chars.

 

Driver is supposed to convert from source data buffers data type to dbms column data type as necessary. Converting from wide-char to UTF-8 is not ambiguous.
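
To illustrate why the wide-char to UTF-8 conversion is unambiguous, here is a minimal sketch in portable C. The function name `utf16_to_utf8` and its error convention are hypothetical and chosen only for this illustration; this is not code from psqlodbc or from any other driver. It assumes the wide buffer holds 0-terminated UTF-16 code units, which is how wchar_t data is laid out on Windows.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: convert a 0-terminated UTF-16 string to UTF-8.
   Returns the number of bytes written (excluding the terminator), or
   (size_t)-1 on an unpaired surrogate or if `cap` bytes are not enough. */
static size_t utf16_to_utf8(const uint16_t *in, char *out, size_t cap)
{
    size_t n = 0;
    assert(in != NULL && out != NULL);
    while (*in != 0) {
        uint32_t cp = *in++;
        if (cp >= 0xD800 && cp <= 0xDBFF) {           /* high surrogate */
            uint16_t lo = *in;
            if (lo < 0xDC00 || lo > 0xDFFF)
                return (size_t)-1;                    /* missing low half */
            in++;
            cp = 0x10000 + ((cp - 0xD800) << 10) + (uint32_t)(lo - 0xDC00);
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            return (size_t)-1;                        /* stray low surrogate */
        }
        /* Every code point maps to exactly one UTF-8 byte sequence. */
        size_t need = cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
        if (n + need + 1 > cap)
            return (size_t)-1;                        /* keep room for '\0' */
        if (need == 1) {
            out[n++] = (char)cp;
        } else if (need == 2) {
            out[n++] = (char)(0xC0 | (cp >> 6));
            out[n++] = (char)(0x80 | (cp & 0x3F));
        } else if (need == 3) {
            out[n++] = (char)(0xE0 | (cp >> 12));
            out[n++] = (char)(0x80 | ((cp >> 6) & 0x3F));
            out[n++] = (char)(0x80 | (cp & 0x3F));
        } else {
            out[n++] = (char)(0xF0 | (cp >> 18));
            out[n++] = (char)(0x80 | ((cp >> 12) & 0x3F));
            out[n++] = (char)(0x80 | ((cp >> 6) & 0x3F));
            out[n++] = (char)(0x80 | (cp & 0x3F));
        }
    }
    if (n + 1 > cap)
        return (size_t)-1;
    out[n] = '\0';
    return n;
}
```

On Windows the same conversion is available through `WideCharToMultiByte` with `CP_UTF8`; the hand-written version is shown only so that the mapping is visible. Whether the ANSI driver is obliged to perform this conversion is the point under discussion in this thread, and the sketch does not settle it.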

 

This works as expected with other ODBC drivers like MySQL and MariaDB.

 

Farid

 


 

From: Jon Raiford
Sent: Monday, November 22, 2021 9:01 AM
To: Farid z
Subject: Re: Bug report: odbc driver does not convert wide chars to chars

 

It looks like you are using the ANSI driver (PSQLODBC30A) rather than the Unicode driver (PSQLODBC35W).  The ANSI driver likely sees the first null and assumes the end of the string has been reached.  I would suggest trying again with the Unicode driver.
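
The truncation described above can be reproduced without any database. The following is a small sketch, assuming the wide buffer holds UTF-16 code units in little-endian byte order (the layout on Windows x64); the helper names `utf16le_serialize` and `narrow_visible_length` are hypothetical and are not taken from the driver source. For ASCII letters the high byte of each code unit is zero, so any code that reads the buffer as a narrow null-terminated string stops after the first character.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Write UTF-16 code units (up to and including the terminating 0) into
   `out` as little-endian bytes, the in-memory layout on Windows x64.
   Returns the number of bytes written, or 0 if `cap` is too small. */
static size_t utf16le_serialize(const uint16_t *units,
                                unsigned char *out, size_t cap)
{
    size_t n = 0;
    assert(units != NULL && out != NULL);
    for (;;) {
        uint16_t u = *units++;
        if (n + 2 > cap)
            return 0;
        out[n++] = (unsigned char)(u & 0xFF);   /* low byte first */
        out[n++] = (unsigned char)(u >> 8);     /* high byte second */
        if (u == 0)
            return n;
    }
}

/* Length that a reader treating the same bytes as a narrow,
   null-terminated string would report. */
static size_t narrow_visible_length(const unsigned char *bytes)
{
    return strlen((const char *)bytes);
}
```

For the wide string "bb" the serialized buffer is the six bytes 62 00 62 00 00 00, and `narrow_visible_length` reports 1, which matches the single letter that ended up in the PostgreSQL table. This only demonstrates the mechanism; it does not show where in the stack (application, driver manager, or driver) the narrow interpretation is applied.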

 

Jon

 

From: Farid z <farid@zidsoft.com>
Date: Sunday, November 21, 2021 at 12:46 PM
To: "pgsql-odbc@postgresql.org" <pgsql-odbc@postgresql.org>
Subject: Bug report: odbc driver does not convert wide chars to chars

 

PostgreSQL 14.0.0, PSQLODBC30A.DLL 13.02.0000

Windows x64 (Windows 10 or Windows 11).

 

Migrating data from SQL Server (or any other dbms that supports the SQL_WCHAR/SQL_WVARCHAR data types) to PostgreSQL.

 

Steps to reproduce:

 

1 create the SQL Server source data table:

 

create table test_char(
    col1 nchar(8),
    col2 nvarchar(30));

 

2 create the PostgreSQL target table:

 

create table test_char(
    col1 char(8),
    col2 varchar(30));

 

3 add test data in MS SQL Server table:

 

insert into test_char values (N'a', N'a');

insert into test_char values (N'bb', N'bb');

insert into test_char values (N'ccc', N'ccc');

insert into test_char values (N'ffffffff', N'ffffffff');

 

4 application binds the two columns as SQL_C_CHAR and SQL_C_WCHAR (excerpt from attached log).

 

cmpdata         6be8-29f4 EXIT  SQLBindParameter  with return code 0 (SQL_SUCCESS)
               HSTMT               0x000000000591E5D0
               UWORD                        1
               SWORD                        1 <SQL_PARAM_INPUT>
               SWORD                       -8 <SQL_C_WCHAR>
               SWORD                        1 <SQL_CHAR>
               SQLULEN                    8
               SWORD                        0
               PTR                0x0000000000411178
               SQLLEN                    18
               SQLLEN *            0x0000000000411170 (-1)

 

5 the driver successfully executes the insert statements into PostgreSQL but does not convert from SQL_WCHAR and SQL_WVARCHAR; it looks like the driver just grabs the first bytes of each inserted value.

 

6 Actual inserted data is just the first letter of each bound value.

 

‘b’ instead of ‘bb’, ‘c’ instead of ‘ccc’, ‘f’ instead of ‘ffffffff’.
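
As a side note on the SQLBindParameter log in step 4: the logged column size of 8 and buffer length of 18 are consistent with a buffer sized for eight 2-byte characters plus a 2-byte terminator. That reading is an inference from the numbers, not something the log states. A minimal sketch of the arithmetic, using hypothetical helper names that do not come from the application or the driver:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The arithmetic below assumes 2-byte UTF-16 code units, as on Windows. */
static_assert(sizeof(uint16_t) == 2, "UTF-16 code units are two bytes");

/* Bytes needed to hold `column_chars` UTF-16 code units plus a terminator. */
static size_t wchar_buffer_bytes(size_t column_chars)
{
    return (column_chars + 1) * sizeof(uint16_t);
}

/* Bytes needed for the same column bound as single-byte characters. */
static size_t char_buffer_bytes(size_t column_chars)
{
    return column_chars + 1;
}
```

For the char(8) column, `wchar_buffer_bytes(8)` gives 18, matching the logged buffer length, whereas a narrow binding of the same column would need only 9 bytes for single-byte data.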

 

 

Please see attached log with insert statements and screen shots.

 

Farid

 

 


 

 

 
