Re: [HACKERS] MULTIBYTE and SQL_ASCII (was Re: Re: A bug with pgsql 7.1/jdbc and non-ascii (8-bit) chars?) - Mailing list pgsql-jdbc

From Tom Lane
Subject Re: [HACKERS] MULTIBYTE and SQL_ASCII (was Re: Re: A bug with pgsql 7.1/jdbc and non-ascii (8-bit) chars?)
Date
Msg-id 18497.989376022@sss.pgh.pa.us
In response to Re: [HACKERS] MULTIBYTE and SQL_ASCII (was Re: Re: A bug with pgsql 7.1/jdbc and non-ascii (8-bit) chars?)  (Barry Lind <barry@xythos.com>)
List pgsql-jdbc
Tatsuo Ishii <t-ishii@sra.co.jp> writes:
>> Tom also mentioned that it might be possible for the server to support
>> setting the character set for a database even when multibyte wasn't
>> enabled.  That would then allow clients like jdbc to get a value from
>> non-multibyte enabled servers that would be more meaningful than the
>> current SQL_ASCII.  If this were done, then the 'UNKNOWN' hack would
>> not be necessary.

> Tom's suggestion does not sound reasonable to me. If PostgreSQL is not
> built with MULTIBYTE, then it means there would be no such idea as
> "encoding" in PostgreSQL because there is no program to handle
> encodings. Thus it would be meaningless to assign an "encoding" to a
> database if MULTIBYTE is not enabled.

Why?  Without the MULTIBYTE code, the backend cannot perform character
set translations --- but it's perfectly possible that someone might not
need translations.  A lot of European sites are probably very happy
as long as the server gives them back the same 8-bit characters they
stored.  But what they would like, if they have to deal with tools like
JDBC, is to *identify* what character set they are storing data in, so
that their data will be correctly translated to Unicode or whatever.
The obvious way to do that is to allow them to set the value that
getdatabaseencoding() will return.
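
To make the client-side step concrete, here is a rough sketch of what a
client could do once the server reports a meaningful encoding name. It is
not the JDBC driver's actual code: the class name, the method names, and
the small name mapping are hypothetical, and the mapping assumes the
server reports names such as LATIN1 and UNICODE.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, for illustration only.
class EncodingSketch {
    // Assumed mapping from server-reported encoding names to Java charsets.
    private static final Map<String, Charset> SERVER_TO_JAVA = new HashMap<>();
    static {
        SERVER_TO_JAVA.put("LATIN1", StandardCharsets.ISO_8859_1);
        SERVER_TO_JAVA.put("UNICODE", StandardCharsets.UTF_8);
    }

    // Pick a charset for the reported name; anything unrecognized
    // (including SQL_ASCII) falls back to 7-bit ASCII.
    static Charset charsetFor(String serverEncoding) {
        Charset cs = SERVER_TO_JAVA.get(serverEncoding);
        return cs != null ? cs : StandardCharsets.US_ASCII;
    }

    // Translate the bytes the server handed back into a Java (Unicode) string.
    static String decode(byte[] raw, String serverEncoding) {
        return new String(raw, charsetFor(serverEncoding));
    }

    public static void main(String[] args) {
        byte[] stored = { (byte) 0xE9 }; // e-acute as stored by a Latin-1 site
        // With the real encoding reported, the character survives.
        System.out.println(decode(stored, "LATIN1").equals("\u00e9"));
        // With only SQL_ASCII reported, the 8-bit byte cannot be decoded faithfully.
        System.out.println(decode(stored, "SQL_ASCII").equals("\u00e9"));
    }
}
```

The backend does no translation at all here; it only has to report which
character set the stored bytes are in.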

Essentially, my point is that identifying the character set is useful
to support outside-the-database character set conversions, whether or
not we have compiled the code for inside-the-database conversions.
Moreover, the stored data certainly has some encoding, whether or not
the database contains code that knows enough to do anything useful about
the encoding.  So it's not "meaningless" to be able to store and report
an encoding value.

I am not sure how much of the MULTIBYTE code would have to be activated
to allow this, but surely it's only a small fraction of the complete
feature.

            regards, tom lane
