Re: SET client_encoding = 'UTF8' - Mailing list pgsql-jdbc

From Oliver Jowett
Subject Re: SET client_encoding = 'UTF8'
Msg-id 48319F43.5070008@opencloud.com
In response to Re: SET client_encoding = 'UTF8'  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: SET client_encoding = 'UTF8'  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-jdbc
Tom Lane wrote:
> Daniel Migowski <dmigowski@ikoffice.de> writes:
>> Kris Jurka schrieb:
>>> On Sun, 18 May 2008, Daniel Migowski wrote:
>>>> The command SET client_encoding = 'UTF8'
>>>> throws an exception in the driver, because the driver expects UNICODE.
>>> This has been discussed before and the problem is that there are too
>>> many ways to say UTF8 [1].  You can say UTF8, UTF-8, UTF -- 8, and so
>>> on. Perhaps we should strip all spaces and dashes prior to comparison?
>
> Perhaps we should make the backend return the values of client_encoding
> and server_encoding in canonical form (ie, "UTF8") regardless of the
> spelling variant the user used.  I'm not thrilled with having JDBC
> thinking it knows the conversion algorithm the backend uses.
>
> Of course, such a change would break code relying on the older behavior
> :-(

Not sure if this is a big enough issue to warrant a server change. It
only happens when a JDBC client issues a manual SET client_encoding to
an encoding that's UTF8 but isn't spelled "UNICODE". That's going to be
a no-op anyway, so I'm not entirely clear why the client needs to be
sending it in the first place.

It sounds like the root cause might be something like "let's feed
pg_dump output to JDBC". So we could add a special case in the driver to
allow exactly "UTF8" as well as "UNICODE", if that's the canonical way
the server spells it these days.
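
A rough sketch of what that special case could look like, for illustration only. The class and method names are hypothetical and this is not the driver's actual code. It assumes the driver sees the client_encoding value as reported by the server, and it assumes case should be ignored. It deliberately strips no spaces or dashes, so the driver does not have to guess at the backend's conversion rules:

```java
// Hypothetical helper, not taken from the driver source.
final class ClientEncodingCheck {
    private ClientEncodingCheck() {
    }

    /**
     * Returns true if the reported client_encoding value is one of the two
     * accepted spellings. Ignoring case is an assumption of this sketch.
     * Variants such as "UTF-8" are rejected because nothing is stripped.
     */
    static boolean isAccepted(String reportedValue) {
        if (reportedValue == null) {
            return false;
        }
        return reportedValue.equalsIgnoreCase("UNICODE")
            || reportedValue.equalsIgnoreCase("UTF8");
    }
}
```

With this, isAccepted("UTF8") and isAccepted("UNICODE") pass, while "UTF-8" or "LATIN1" would still be rejected as before.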

-O

