Kris Jurka wrote:
> On Sun, 18 May 2008, Daniel Migowski wrote:
>> The command SET client_encoding = 'UTF8'
>>
>> throws an exception in the driver, because the driver expects UNICODE.
> This has been discussed before and the problem is that there are too
> many ways to say UTF8 [1]. You can say UTF8, UTF-8, UTF -- 8, and so
> on. Perhaps we should strip all spaces and dashes prior to comparison?
That would be correct in my opinion. I doubt anyone would dare to
declare a charset name that relies on characters other than 0-9 and a-z
to be identifiable. IMHO we should simply accept whatever postgres
itself accepts (we could dig into the postgres parsing code). I tried
it at the command line and got the following:
set client_encoding='foobar';
ERROR: invalid value for parameter "client_encoding": "foobar"
set client_encoding='utf8';
OK
set client_encoding='utf-8';
OK
set client_encoding='utf -- 8';
OK
set client_encoding='Utf -- 8';
OK
set client_encoding='Utf -- 98';
ERROR: invalid value for parameter "client_encoding": "Utf -- 98"
set client_encoding='Utf_8';
OK
But I think we should be fine with
userencoding.toLowerCase().replaceAll("[^0-9a-z]", "").equals("utf8"); // untested prototype code
or something like this.
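
To make the idea concrete, here is a slightly fuller sketch of the same normalization. The class and method names are made up for illustration and are not part of the driver. Accepting "unicode" alongside "utf8" is my assumption, based only on the driver currently expecting UNICODE. Locale.ROOT is used so that lowercasing does not depend on the JVM's default locale (for example the Turkish dotless i).

```java
import java.util.Locale;

// Hypothetical helper for illustration; not part of the PostgreSQL JDBC driver.
public class EncodingNameCheck {

    // Lowercase the name and drop everything except 0-9 and a-z,
    // so "UTF-8", "Utf -- 8" and "Utf_8" all collapse to "utf8".
    static String normalize(String name) {
        return name.toLowerCase(Locale.ROOT).replaceAll("[^0-9a-z]", "");
    }

    // Assumption: both spellings should be treated as the same encoding.
    static boolean isUtf8(String userEncoding) {
        String n = normalize(userEncoding);
        return n.equals("utf8") || n.equals("unicode");
    }

    public static void main(String[] args) {
        String[] samples = {
            "UTF8", "utf-8", "utf -- 8", "Utf -- 8", "Utf_8",
            "Utf -- 98", "foobar", "UNICODE"
        };
        for (String s : samples) {
            System.out.println("'" + s + "' -> " + isUtf8(s));
        }
    }
}
```

On the samples from my psql session this accepts exactly the spellings postgres accepted and rejects 'foobar' and 'Utf -- 98'. Note that stripping only spaces and dashes would miss 'Utf_8', which postgres also takes. Whether this matches the server's parser for every other input is something I have not checked.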
>
> [1] http://archives.postgresql.org/pgsql-jdbc/2008-02/threads.php#00174
Thanks for the link.
With best regards,
Daniel Migowski