Re: pgadmin3 clientencoding - Mailing list pgadmin-hackers

From Andreas Pflug
Subject Re: pgadmin3 clientencoding
Date
Msg-id 3EE5C867.2040701@web.de
In response to Re: pgadmin3 clientencoding  (Jean-Michel POURE <jm.poure@freesurf.fr>)
Responses Re: pgadmin3 clientencoding
List pgadmin-hackers
Jean-Michel POURE wrote:

>On Tuesday 10 June 2003 11:39, Andreas Pflug wrote:
>
>
>>OK, this means a client encoding per database is needed, right?
>>Additional property for database?
>>
>>
>
>Yes. Whenever possible, the database, client and wxWindows encodings should
>be the same. For example, the best solution is a full Unicode chain:
>
>- UTF-8 database
>- UTF-8 data stream (SET client_encoding = 'UNICODE')
>- UTF-8 display libraries (wxGTK with --enable-unicode).
>
>When server encoding differs, it can cause problems. Example:
>- Latin1 database
>- Unicode stream (SET client_encoding = 'UNICODE')
>- UTF-8 display (wxGTK with --enable-unicode)
>
>The grid will display information fine, but whenever the user enters a
>character that Latin1 cannot represent (for example a euro sign, which exists
>in Unicode but not in Latin1), it will be dropped by the parser.
>
>This kind of problem is less frequent with Asian encodings:
>- SJIS database
>- Unicode stream (SET client_encoding = 'UNICODE')
>- UTF-8 display (wxGTK with --enable-unicode)
>
>The only solution I see would be to use the iconv libraries (or recode
>libraries) to check whether text entered by a user can be converted into the
>server encoding safely or not.
>
>A "safety" conversion test can be done in two steps:
>1) convert the text entered by the user from UTF-8 into the database encoding,
>2) convert the resulting text back from database encoding into UTF-8.
>
>If the text is the same, the conversion is "safe". Example:
>- Latin1 database
>- Unicode stream (SET client_encoding = 'UNICODE')
>- UTF-8 display (wxGTK with --enable-unicode)
>
>1) convert the text entered by the user from UTF-8 into Latin1,
>2) convert the resulting text back from Latin1 into UTF-8.
>
>In this example, if a user enters a euro sign (€), it will be dropped and
>hence the "safety" test will fail.
>
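The two-step "safety" test quoted above can be sketched roughly as follows. This is only an illustration in Python using its built-in codecs, not the iconv or recode libraries mentioned; the function name round_trip_is_safe is hypothetical, and the input is a Python Unicode string rather than raw UTF-8 bytes.

```python
def round_trip_is_safe(text, server_encoding):
    """Return True if text survives a trip through server_encoding unchanged."""
    # Step 1: convert the user's text into the database encoding.
    # errors="ignore" drops unrepresentable characters, mimicking the
    # "dropped by the parser" behaviour described in the quoted text.
    encoded = text.encode(server_encoding, errors="ignore")
    # Step 2: convert the result back and compare it with the original.
    restored = encoded.decode(server_encoding)
    return restored == text


print(round_trip_is_safe("café", "latin-1"))  # True: every character exists in Latin1
print(round_trip_is_safe("5 €", "latin-1"))   # False: the euro sign is dropped in step 1
```
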
The longer I think about this, the more the current implementation
appears wrong to me. The decisive factor is not the user's wish but our
charset conversion capability, and that is pretty clear: wxString can
convert between Unicode and ASCII, nothing else. Since Unicode will be
the recommended setup for non-ASCII databases, the client encoding
should be Unicode for all connections. This should enable correct schema
and property display. Allowing the connection to use anything else would
mean wxString needs to know how to convert from xxx to Unicode, i.e.
implementing a client-side conversion, which doesn't make sense. This
means: client encoding = SQL_ASCII for a non-Unicode build of pgAdmin3,
and UNICODE for a Unicode build.
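
As a rough sketch of that rule (in Python for brevity, not pgAdmin3 code): the statement is PostgreSQL's SET client_encoding, while the function name and the unicode_build flag are hypothetical stand-ins for however the build type is actually detected.

```python
def client_encoding_statement(unicode_build):
    """Build the SET statement to issue right after connecting."""
    # A Unicode build asks the server for Unicode; any other build
    # falls back to SQL_ASCII, as proposed above.
    encoding = "UNICODE" if unicode_build else "SQL_ASCII"
    return "SET client_encoding TO '{}';".format(encoding)


print(client_encoding_statement(True))   # SET client_encoding TO 'UNICODE';
print(client_encoding_statement(False))  # SET client_encoding TO 'SQL_ASCII';
```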

The remaining problem is text entered by the user. This falls into two
categories:
1) free-text entry in frmQuery: the user is responsible for using correct
settings and input representation;
2) guided entry: here we hopefully know what may be entered, and can check
for legal characters ourselves.
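
For the guided-entry case, such a check might look something like this. Again a Python sketch under assumptions: illegal_characters is a hypothetical helper, and it relies on Python's codec for the server encoding rather than on anything in wxString.

```python
def illegal_characters(text, server_encoding):
    """List the distinct characters in text that server_encoding cannot hold."""
    bad = []
    for ch in text:
        try:
            # Encoding one character at a time pinpoints the offenders.
            ch.encode(server_encoding)
        except UnicodeEncodeError:
            if ch not in bad:
                bad.append(ch)
    return bad


print(illegal_characters("10 € für Müller", "latin-1"))  # ['€']
```

An empty list means the entry can be sent to the server as it is; otherwise the dialog could show the offending characters to the user instead of letting them be dropped silently.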

Regards,
Andreas

