Re: Bug or not about ASCII and Multi-Byte character set - Mailing list pgsql-odbc

From Andreas Pflug
Subject Re: Bug or not about ASCII and Multi-Byte character set
Date
Msg-id 4305E8A4.6000306@pse-consulting.de
In response to Re: Bug or not about ASCII and Multi-Byte character set  (Marc Herbert <Marc.Herbert@emicnetworks.com>)
Responses Re: Bug or not about ASCII and Multi-Byte character set  (Marc Herbert <Marc.Herbert@emicnetworks.com>)
List pgsql-odbc
Marc Herbert wrote:

>If SQL_ASCII is/was equivalent to "ignoring encoding", then it
>looks/looked pretty misnamed!
>
It's not. It should be used for ASCII only, but the database system will
not barf if you offer it a byte with the upper bit set. You're simply on
your own.

>Encoding ignorance should rather be called SQL_BINARY. A BINARY setting
>for strings makes sense, just like when transferring text files using
>FTP: you just don't trust FTP for encodings and use it like a
>filesystem. BINARY just means that: "don't mess with encodings and
>let something else deal with the issue".
>
>
No, binary would include 0x00, which is definitely *not* a character but
the string terminator. If SQL_ASCII were implemented today, it would
probably check that the upper bit is clear and reject the byte
otherwise. But since this part is really, really old, it can't be
changed without breaking zillions of old apps that have always ignored
proper storage encoding.
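
To make the hypothetical check concrete, here is a minimal sketch of what "upper bit cleared, and no 0x00" would mean for a byte string. The function name `is_strict_ascii` is made up for illustration; nothing like it exists in the backend, which accepts such bytes under SQL_ASCII.

```python
def is_strict_ascii(data: bytes) -> bool:
    """Return True only if every byte is 7-bit ASCII and not the NUL terminator.

    Hypothetical helper: it models the check that SQL_ASCII does NOT perform.
    """
    # 0x00 is the string terminator; 0x80 and above have the upper bit set.
    return all(0x00 < b < 0x80 for b in data)


print(is_strict_ascii(b"plain text"))              # pure 7-bit ASCII
print(is_strict_ascii("é".encode("latin-1")))      # byte 0xE9, upper bit set
print(is_strict_ascii(b"abc\x00def"))              # embedded terminator
```

A real server-side check would of course have to live in the input path of the backend; the sketch only shows the rule itself.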

>I guess some people knew what they did and simply did not mix
>driver/apps, or did so in a way they mastered and that worked.
>
>
The latter, with the obvious risk of breaking as soon as another app
accesses the data. This is certainly not the design goal of an RDBMS.
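
A small sketch of how this breaks in practice, assuming one app writes Latin-1 bytes and another writes UTF-8 bytes into the same SQL_ASCII column, where the server passes the bytes through unchanged. The helper `read_as` and the `<undecodable>` marker are invented for the example.

```python
def read_as(raw: bytes, assumed_encoding: str) -> str:
    """Decode stored bytes the way a client with a fixed assumption would."""
    try:
        return raw.decode(assumed_encoding)
    except UnicodeDecodeError:
        # The bytes are not valid in the encoding this client assumes.
        return "<undecodable>"


latin1_bytes = "é".encode("latin-1")   # what a Latin-1 app stores: b'\xe9'
utf8_bytes = "é".encode("utf-8")       # what a UTF-8 app stores: b'\xc3\xa9'

# Each app reads its own data back correctly ...
print(read_as(latin1_bytes, "latin-1"))
print(read_as(utf8_bytes, "utf-8"))

# ... but reading the other app's data either fails or yields garbage.
print(read_as(latin1_bytes, "utf-8"))
print(read_as(utf8_bytes, "latin-1"))
```

As long as every client shares the same unstated assumption, nothing goes wrong, which is why the problem stays hidden until a second app shows up.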

>Well, reading the complaints, it seems this BINARY mode was
>there before (by "accident"?),
>
No.

>Looks like people fixed issues by themselves before,
>
They didn't fix anything; they worked around the wrongly chosen server
encoding. I perfectly understand this, because initially I made the same
mistake.

> and Postgres
>recent fixing does not interact nicely with theirs?
>
>
Automatically choosing the right client encoding and properly converting
in the driver had (and maybe still has) bugs, but fixing these will
certainly support the rules that proper design requires, not
ill-designed apps.

>PS: BTW "unicode" is not one encoding but many different ones.
>
>
Doesn't matter. It always means the current Unicode encoding for the
system: UTF-8 in the backend, UTF-16 on Win32, and UTF-32 or UTF-8 on
Linux, depending on the interface definition. The *driver* has to take
care of the proper conversion, *if* it is instructed correctly (i.e.
given the correct server encoding).
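
To illustrate why the conversion step matters, the sketch below encodes one string in the three Unicode encodings a driver might have to bridge. The sample text and the labels in `ENCODINGS` are chosen for the example, and the little-endian codecs are an assumption about the platform.

```python
# Hypothetical labels mapped to Python codec names.
ENCODINGS = {
    "backend (UTF-8)": "utf-8",
    "Win32 wide chars (UTF-16)": "utf-16-le",
    "Linux wide chars (UTF-32)": "utf-32-le",
}


def encode_all(text: str) -> dict[str, bytes]:
    """Encode the same text once per encoding listed in ENCODINGS."""
    return {label: text.encode(codec) for label, codec in ENCODINGS.items()}


sample = "Grüße"
for label, raw in encode_all(sample).items():
    # Same characters, different byte sequences and lengths.
    print(f"{label}: {len(raw)} bytes, {raw.hex(' ')}")
```

The byte sequences differ, yet each decodes back to the same text when the matching codec is used, which is exactly the job the driver has to do between the client interface and the server encoding.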

Regards,
Andreas

