Thread: ANSI and Unicode driver
So really, what is the difference between the ANSI and the Unicode driver? The Unicode driver sets the client encoding to UTF-8, but does that mean that the client application has to use UTF-8 or does the driver manager convert that? What do you use if you have, say, a Chinese application. Or a Latin 1 application but a UTF-8 database? How does this work? I'm confused.

-- Peter Eisentraut http://developer.postgresql.org/~petere/
> -----Original Message-----
> From: pgsql-odbc-owner@postgresql.org [mailto:pgsql-odbc-owner@postgresql.org] On Behalf Of Peter Eisentraut
> Sent: 20 February 2006 16:05
> To: pgsql-odbc@postgresql.org
> Subject: [ODBC] ANSI and Unicode driver
>
> So really, what is the difference between the ANSI and the
> Unicode driver?
> The Unicode driver sets the client encoding to UTF-8, but
> does that mean that
> the client application has to use UTF-8 or does the driver
> manager convert
> that? What do you use if you have, say, a Chinese
> application. Or a Latin 1
> application but a UTF-8 database? How does this work? I'm confused.

The Unicode driver adds a bunch of unicode-specific APIs. Our ANSI driver can handle Unicode data as multibyte strings as well, but without the Unicode APIs that many non-multibyte aware versions of Windows require (if that makes sense!).

Regards, Dave.
Peter Eisentraut wrote:

> So really, what is the difference between the ANSI and the Unicode driver?

There are two kinds of applications: Unicode applications and ANSI applications. Unicode applications use UCS-2(4) encoding and call the Unicode ODBC APIs.

> The Unicode driver sets the client encoding to UTF-8, but does that mean that
> the client application has to use UTF-8

Though Unicode applications are preferable for Unicode drivers,

> or does the driver manager convert
> that?

wise driver managers may invoke ANSI <-> UCS-2(4) conversions when ANSI applications call ANSI ODBC APIs for the Unicode driver.

regards, Hiroshi Inoue
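To make the ANSI <-> UCS-2 conversion described above concrete, here is a minimal Python sketch of what a converting driver manager would have to do on each call. The function names `ansi_to_wide` and `wide_to_ansi` are hypothetical and do not correspond to any real driver manager API; the sketch also assumes the application's code page is known, which a real DM would have to take from the system locale.

```python
# Hypothetical driver-manager shim. Names and structure are illustrative
# only and are not taken from any real ODBC driver manager.

def ansi_to_wide(data: bytes, codepage: str) -> bytes:
    """Convert an 8-bit ("ANSI") string to UTF-16LE for a Unicode driver."""
    return data.decode(codepage).encode("utf-16-le")


def wide_to_ansi(data: bytes, codepage: str) -> bytes:
    """Convert UTF-16LE from a Unicode driver back to the 8-bit code page."""
    return data.decode("utf-16-le").encode(codepage)


# A Latin 1 application passes "Müller" as one byte per character.
ansi_in = "Müller".encode("latin-1")
wide = ansi_to_wide(ansi_in, "latin-1")
ansi_out = wide_to_ansi(wide, "latin-1")

# The wide form is twice as long in bytes; the round trip is lossless here.
print(len(ansi_in), len(wide), ansi_out == ansi_in)
```

The round trip is only lossless when every character returned by the driver exists in the application's code page. A Latin 1 application reading Chinese text from a UTF-8 database would fail at the `wide_to_ansi` step.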
Dave Page wrote:

> The Unicode driver adds a bunch of unicode-specific APIs. Our ANSI
> driver can handle Unicode data as multibyte strings as well, but
> without the Unicode APIs that many non-multibyte aware versions of
> Windows require (if that makes sense!).

Well, the information available to me seems to indicate that Unicode drivers will handle ANSI applications just fine, and of course our Unicode driver will also handle any server encoding, so the question is why the ANSI version needs to exist.

-- Peter Eisentraut http://developer.postgresql.org/~petere/
On 20/2/06 18:18, "Peter Eisentraut" <peter_e@gmx.net> wrote:

> Dave Page wrote:
>> The Unicode driver adds a bunch of unicode-specific APIs. Our ANSI
>> driver can handle Unicode data as multibyte strings as well, but
>> without the Unicode APIs that many non-multibyte aware versions of
>> Windows require (if that makes sense!).
>
> Well, the information available to me seems to indicate that Unicode
> drivers will handle ANSI applications just fine, and of course our
> Unicode driver will also handle any server encoding,

Yes, that's quite correct.

> so the question is
> why the ANSI version needs to exist.

Well, for 8.0 we did release only one driver for that reason, but we kept getting odd reports that it didn't handle non-ASCII characters (with umlauts, accents, etc.) properly in some situations that others, including myself, could never reproduce.

After lots of on-list discussion in the latter part of last year, during which I made it clear on a number of occasions that we needed help from someone who understood encodings better than I, we eventually concluded that the best solution was to reinstate the old ANSI driver. It appeared to my limited understanding that whilst basic ASCII characters mapped directly into the lower bytes of the Unicode function parameters, we needed some conversion code to cope with other characters (which the SQL Server driver appears to have as a configurable option, for example).

Reinstating the ANSI-only driver fixed things instantly for all those that were complaining BTW, though I did note an email from Hiroshi Inoue earlier today implying that it's actually the 'wise' DM that handles the conversion (although the complainants were all on Windows iirc). FWIW, the 07_03_ENHANCED branch does only build the Unicode driver, though I'm not yet sure if it will suffer the same problems that 08.00 did.

Feel free to explain what exactly is wrong if you know :-)

Regards, Dave.
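The point about ASCII mapping directly into the lower bytes while other characters need conversion code can be sketched in Python. This is only an illustration of that general failure mode, assuming a Windows-1252 application; it is not a diagnosis of what the 08.00 driver actually did. The helpers `naive_widen` and `proper_widen` are hypothetical names.

```python
def naive_widen(data: bytes) -> str:
    """Zero-extend each byte to one code unit, ignoring the code page."""
    return "".join(chr(b) for b in data)


def proper_widen(data: bytes, codepage: str) -> str:
    """Decode the bytes using the code page the application really uses."""
    return data.decode(codepage)


ascii_bytes = "abc".encode("cp1252")
euro_bytes = "€".encode("cp1252")   # the single byte 0x80 in Windows-1252
utf8_bytes = "ü".encode("utf-8")    # the two bytes 0xC3 0xBC

# ASCII survives zero-extension unchanged.
print(naive_widen(ascii_bytes) == proper_widen(ascii_bytes, "cp1252"))

# 0x80 becomes the control character U+0080 instead of U+20AC.
print(naive_widen(euro_bytes) == proper_widen(euro_bytes, "cp1252"))

# UTF-8 bytes widened one by one turn into two unrelated characters.
print(repr(naive_widen(utf8_bytes)))
```

Plain Latin 1 text happens to survive zero-extension as well, because the first 256 Unicode code points match Latin 1. That may be one reason such problems are hard to reproduce on some setups.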
Dave Page <dpage@vale-housing.co.uk> writes:

> Reinstating the ANSI only driver fixed things instantly for all those that
> were complaining BTW, though I did note an email from Hiroshi Inoue earlier
> today implying that it's actually the 'wise' DM that handles the conversion
> (although the complainants were all on Windows iirc). FWIW, the
> 07_03_ENHANCED branch does only build the Unicode driver, though I'm not yet
> sure if it will suffer the same problems that 08.00 did.

By the way, being a "wise" DM that handles conversions is tricky. Think for instance about ODBC entry points that may return many different data types, including strings. See for instance this bug in unixODBC: <http://comments.gmane.org/gmane.comp.db.unixodbc.devel/1760>

Related: "DOC: Explanation of Length Arguments for Unicode ODBC Functions" <http://support.microsoft.com/default.aspx?scid=kb;EN-US;q294169>
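One reason the conversion is tricky is that a converting DM has to translate length arguments as well as the string data, and the same string has a different size in each representation. A small Python sketch of the arithmetic follows; the sample string is arbitrary, and the sketch does not model which ODBC functions count bytes and which count characters.

```python
text = "Grüße"

# Size of the same five-character string in each representation.
sizes = {
    "characters": len(text),
    "latin-1 bytes": len(text.encode("latin-1")),
    "utf-8 bytes": len(text.encode("utf-8")),
    "utf-16-le bytes": len(text.encode("utf-16-le")),
    "utf-16 code units": len(text.encode("utf-16-le")) // 2,
}

for name, value in sizes.items():
    print(f"{name}: {value}")
```

A buffer length that is passed through unchanged while the data is converted will therefore be wrong for any non-ASCII content, or for any content at all when the width of the code unit changes.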
Hiroshi Inoue <inoue@tpf.co.jp> writes:

> Peter Eisentraut wrote:
>
>>So really, what is the difference between the ANSI and the Unicode driver?

"ANSI" actually just means "8 bits" in MS-speak. <http://blogs.msdn.com/oldnewthing/archive/2004/05/31/144893.aspx>

And "Unicode" actually means UCS-2/UTF-16.

Things started to become clearer for me once I found the translation...

> There are 2 kind of applications, Unicode applications and ANSI
> applications.
> Unicode applications uses UCS-2(4) encoding and call Unicode ODBC APIs.

Has anyone already seen some real 4-byte/UCS-4 ODBC applications running out there, or only 2-byte/UCS-2/UTF-16 applications, as Microsoft implies throughout its documentation?
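For readers unsure how the 2-byte and 4-byte forms differ, a short Python sketch counts code units for a character inside and one outside the Basic Multilingual Plane. The helper names are hypothetical. The distinction only shows up for characters above U+FFFF, which UCS-2 cannot represent and UTF-16 encodes as a surrogate pair.

```python
def utf16_units(s: str) -> int:
    """Number of 16-bit code units needed to encode s as UTF-16."""
    return len(s.encode("utf-16-le")) // 2


def utf32_units(s: str) -> int:
    """Number of 32-bit code units needed to encode s as UCS-4/UTF-32."""
    return len(s.encode("utf-32-le")) // 4


def fits_ucs2(s: str) -> bool:
    """True if every character fits in a single 16-bit unit (plain UCS-2)."""
    return all(ord(ch) <= 0xFFFF for ch in s)


bmp = "ü"               # U+00FC, inside the Basic Multilingual Plane
astral = "\U0001D11E"   # MUSICAL SYMBOL G CLEF, outside the BMP

for label, ch in (("U+00FC", bmp), ("U+1D11E", astral)):
    print(label, utf16_units(ch), utf32_units(ch), fits_ucs2(ch))
```

For text that stays within the BMP, a UCS-2 application and a UTF-16 application produce identical bytes, so the difference is invisible for most European and much East Asian data.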