Re: [Fwd: Patch for MULTIBYTE and SQL_ASCII (was Re: [JDBC] Re: A bug with pgsql 7.1/jdbc and non-ascii (8-bit) chars?)]] - Mailing list pgsql-patches

From Bruce Momjian
Subject Re: [Fwd: Patch for MULTIBYTE and SQL_ASCII (was Re: [JDBC] Re: A bug with pgsql 7.1/jdbc and non-ascii (8-bit) chars?)]]
Date
Msg-id 200106012057.f51Kvjv01558@candle.pha.pa.us
In response to [Fwd: Patch for MULTIBYTE and SQL_ASCII (was Re: [JDBC] Re: A bug with pgsql 7.1/jdbc and non-ascii (8-bit) chars?)]]  (Barry Lind <barry@xythos.com>)
List pgsql-patches
Patch applied.  Thanks.

> The following patch for JDBC fixes an issue with jdbc running on a
> non-multibyte database losing 8-bit characters.  This patch will cause
> the jdbc driver to ignore the encoding reported by the database when
> multibyte isn't enabled and use the JVM default in that case.
>
> thanks,
> --Barry
>
>
> -------- Original Message --------
> Subject: Re: [HACKERS] MULTIBYTE and SQL_ASCII (was Re: [JDBC] Re: A bug
> with pgsql 7.1/jdbc and non-ascii (8-bit) chars?)
> Date: Fri, 25 May 2001 17:12:09 -0700
> From: Barry Lind
> To: Tatsuo Ishii , tgl@sss.pgh.pa.us
> References: <3AF74768.8060807@xythos.com>
> <20010508110249R.t-ishii@sra.co.jp> <3AF78113.6080907@xythos.com>
> <20010509102305C.t-ishii@sra.co.jp>
>
>
>
> Tatsuo, Tom,
>
> Since the two of you were the only two that seemed to care about this
> thread, I am addressing you directly.  I want to come to some sort of
> resolution.  Since it doesn't appear that anything is going to be
> changed in the backend code in 7.2 to address the issue here, I will
> submit the attached patch to the jdbc code.
>
> This patch uses the function pg_encoding_to_char(1) to determine that
> multibyte is not enabled on the server (as suggested by Tatsuo), and in
> that case will use the default JVM character set to convert data from
> the backend. This replaces the current behaviour, which forces
> all data to 7-bit ASCII in the non-multibyte case because
> getdatabaseencoding() always returns SQL_ASCII for non-multibyte databases.
>
> If I don't hear anything, I will go ahead and submit this patch.
>
> thanks for your help on this issue.
>
> --Barry
>
>
> Tatsuo Ishii wrote:
>
> >>> Still, I don't see what you want in the JDBC driver if
> >>> PostgreSQL returned "UNKNOWN" to indicate that the backend is not
> >>> compiled with MULTIBYTE. Do you want exactly the same behavior as the
> >>> pre-7.1 driver? That is, read data from the PostgreSQL backend, assume
> >>> its encoding is the Java client's default (set by the locale or
> >>> something else), and convert it to UTF-8? If so, that would make sense
> >>> to me...
> >>
> >> My suggestion would be that if the jdbc client were able to determine that
> >> the server character set was UNKNOWN (i.e. no multibyte), it would
> >> then use some appropriate default character set to perform conversions
> >> to UCS2 (LATIN1 would probably make the most sense as a default).  The
> >> jdbc driver would keep its existing behavior if the character set was
> >> SQL_ASCII and multibyte was enabled (i.e. only support 7-bit characters
> >> just like the backend does).
> >>
> >> Note that the user is always able to override the character set used for
> >> conversion by setting the charSet property.
> >
> >
> > I see.  However, I would say we cannot change the current behavior
> > of the backend until 7.2 is out. It is our policy that we do not
> > add or change existing functionality while we are in the minor release
> > cycle.
> >
> > What about doing it like this:
> >
> > 1. call pg_encoding_to_char(1)    (actually any number except 0 is ok)
> >
> > 2. if it returns "SQL_ASCII", then you can assume that MULTIBYTE is
> > not enabled.
> >
> > This is pretty ugly, but should work.
> >
> >> Tom also mentioned that it might be possible for the server to support
> >> setting the character set for a database even when multibyte wasn't
> >> enabled.  That would then allow clients like jdbc to get a value from
> >> non-multibyte enabled servers that would be more meaningful than the
> >> current SQL_ASCII.  If this were done, then the 'UNKNOWN' hack would
> >> not be necessary.
> >
> >
> > Tom's suggestion does not sound reasonable to me. If PostgreSQL is not
> > built with MULTIBYTE, then there is no such notion as an
> > "encoding" in PostgreSQL, because there is no code to handle
> > encodings. Thus it would be meaningless to assign an "encoding" to a
> > database if MULTIBYTE is not enabled.
> > --
> > Tatsuo Ishii
> >
> > ---------------------------(end of broadcast)---------------------------
> > TIP 2: you can get off all lists at once with the unregister command
> >     (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
> >
> >
>
>
>

> *** ./org/postgresql/Connection.java.orig    Fri May 25 16:23:02 2001
> --- ./org/postgresql/Connection.java    Fri May 25 16:26:55 2001
> ***************
> *** 267,273 ****
>         //
>         firstWarning = null;
>
> !       java.sql.ResultSet initrset = ExecSQL("set datestyle to 'ISO'; select getdatabaseencoding()");
>
>         String dbEncoding = null;
>         //retrieve DB properties
> --- 267,274 ----
>         //
>         firstWarning = null;
>
> !       java.sql.ResultSet initrset = ExecSQL("set datestyle to 'ISO'; " +
> !         "select case when pg_encoding_to_char(1) = 'SQL_ASCII' then 'UNKNOWN' else getdatabaseencoding() end");
>
>         String dbEncoding = null;
>         //retrieve DB properties
> ***************
> *** 319,324 ****
> --- 320,330 ----
>
>           } else if (dbEncoding.equals("WIN")) {
>             dbEncoding = "Cp1252";
> +         } else if (dbEncoding.equals("UNKNOWN")) {
> +           //This isn't a multibyte database so we don't have an encoding to use
> +           //We leave dbEncoding null which will cause the default encoding for the
> +           //JVM to be used
> +           dbEncoding = null;
>           } else {
>             dbEncoding = null;
>           }
>
>
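
The fallback that the diff above adds can be sketched in isolation. This is only a rough illustration under stated assumptions: the class EncodingFallback and the methods toJavaEncoding and decode are hypothetical names and are not part of org.postgresql.Connection. Only the two mappings visible in the diff ("WIN" and "UNKNOWN") are reproduced; the driver's other branches are omitted rather than guessed.

```java
import java.nio.charset.Charset;

// Illustrative sketch only: EncodingFallback and its methods are hypothetical
// names, not code from the JDBC driver.
class EncodingFallback {

    // Probe query taken from the patch. A backend built without MULTIBYTE
    // answers 'SQL_ASCII' for pg_encoding_to_char(1), and the CASE turns
    // that into the marker value 'UNKNOWN'.
    static final String PROBE_SQL =
        "select case when pg_encoding_to_char(1) = 'SQL_ASCII' "
        + "then 'UNKNOWN' else getdatabaseencoding() end";

    // Maps the value reported by the probe to a Java encoding name.
    // A null result means "use the JVM default encoding".
    static String toJavaEncoding(String reported) {
        if ("WIN".equals(reported)) {
            return "Cp1252";
        }
        // "UNKNOWN" (no MULTIBYTE) and any unrecognized value fall through.
        return null;
    }

    // Decodes bytes from the backend with the chosen encoding, or with the
    // JVM default when no encoding was chosen.
    static String decode(byte[] raw, String javaEncoding) {
        return javaEncoding == null
            ? new String(raw)
            : new String(raw, Charset.forName(javaEncoding));
    }

    public static void main(String[] args) {
        System.out.println(PROBE_SQL);
        System.out.println(toJavaEncoding("UNKNOWN"));
        System.out.println(toJavaEncoding("WIN"));
        // A single 8-bit byte decoded with an explicit single-byte charset.
        byte[] eAcute = { (byte) 0xE9 };
        System.out.println(decode(eAcute, "ISO-8859-1"));
    }
}
```

With a null encoding the result depends on the JVM's default character set, which is the behaviour the patch relies on for non-multibyte servers. As noted earlier in the thread, the charSet property remains available to override that default.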

>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: Have you checked our extensive FAQ?
>
> http://www.postgresql.org/users-lounge/docs/faq.html

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
