Re: Patch for pl/tcl Tcl_ExternalToUtf and Tcl_UtfToExternal - Mailing list pgsql-patches

From Reinhard Max
Subject Re: Patch for pl/tcl Tcl_ExternalToUtf and Tcl_UtfToExternal
Date
Msg-id Pine.LNX.4.33.0109041317540.8768-100000@wotan.suse.de
In response to Re: Patch for pl/tcl Tcl_ExternalToUtf and Tcl_UtfToExternal support  (Vsevolod Lobko <seva@sevasoft.kiev.ua>)
Responses Re: Patch for pl/tcl Tcl_ExternalToUtf and Tcl_UtfToExternal  (Peter Eisentraut <peter_e@gmx.net>)
List pgsql-patches
Hi,

Sorry for joining this discussion late;
I've been on vacation for two weeks.

On Thu, 23 Aug 2001, Vsevolod Lobko wrote:

> > > Patch assumes that database encoding and system encoding of Tcl is
> > > equal.
> >
> > Hmm, is that a tenable assumption?  I don't know, I'm just asking.
>
> Yes, because it does 8-bit to unicode conversion and must to know
> codepage for 8-bit characters. Unfortunately charset names for tcl
> and postgres does not match, so this demands additional field in
> charset tables or additional table :((

I don't think you can assume that a database always has the same
encoding as Tcl's system encoding. For pl/tcl you could set the system
encoding to the database's encoding, but then you'd need that
additional name conversion table anyway, be it a database table or
hardcoded. For PgTcl it is definitely up to the user which system
encoding the interpreter has.
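
To illustrate what such a name conversion table might look like, here
is a minimal sketch in Python. The entries are an illustrative subset
that I am assuming to be correct, not a complete or authoritative
list, and tcl_encoding_name is a hypothetical helper, not part of
pl/tcl or PgTcl.

```python
# Illustrative subset only: PostgreSQL encoding name -> Tcl encoding name.
# The pairs are assumptions for the sake of the example.
PG_TO_TCL_ENCODING = {
    "UNICODE": "utf-8",
    "LATIN1": "iso8859-1",
    "LATIN2": "iso8859-2",
    "KOI8": "koi8-r",
    "EUC_JP": "euc-jp",
}


def tcl_encoding_name(pg_name):
    """Return the Tcl encoding name for a PostgreSQL encoding name.

    Hypothetical helper: raises ValueError when the table has no entry,
    so an unknown encoding is never silently treated as something else.
    """
    try:
        return PG_TO_TCL_ENCODING[pg_name.upper()]
    except KeyError:
        raise ValueError(
            "no known Tcl encoding for PostgreSQL encoding %r" % pg_name
        )
```

Failing loudly on an unknown name seems safer than falling back to the
system encoding, because a wrong guess would corrupt data silently.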

I, for example, create my databases in UNICODE (to get PostgreSQL
working with Tcl 8.3 without patching pl/tcl or PgTcl), but my
Tcl interpreter's system encoding is iso-8859-1.

So basically there are two possibilities:

a) Patch pl/tcl and PgTcl to do the code conversion, but do it right
   by using the database's encoding instead of Tcl's system encoding.

b) Require databases to be in UNICODE if they are to be accessed
   from Tcl >= 8.1 so that the strings that come out of the database
   are already UTF-8.
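
The difference between the two can be sketched in Python. This is only
an illustration of the principle, not the pl/tcl code: to_utf8 is a
hypothetical helper, and the KOI8-R database with a Latin-1 system
encoding is an assumed scenario.

```python
def to_utf8(raw, source_encoding):
    """Convert bytes in source_encoding to UTF-8 bytes (hypothetical helper)."""
    return raw.decode(source_encoding).encode("utf-8")


# Bytes as they would come out of a database stored in KOI8-R (assumed).
raw = "Привет".encode("koi8-r")

# a) Convert using the database's encoding: the text survives.
right = to_utf8(raw, "koi8-r")

# Convert using the interpreter's system encoding instead (here Latin-1):
# no error is raised, but the result is different, garbled text.
wrong = to_utf8(raw, "latin-1")

# b) The database is already UNICODE: the bytes are UTF-8 and need no
# conversion at all.
already_utf8 = "Привет".encode("utf-8")
```

Note that the wrong conversion does not fail; it quietly yields
different characters, which is why the source encoding has to be
chosen deliberately.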

For b) it would be nice to have a per-database attribute that
specifies the default client encoding that is used for clients that
don't explicitly set an encoding. I am thinking of something like:

$ createdb --encoding UNICODE --default-client-encoding LATIN1 foo

This database could be used from Tcl without any code conversion, but
would look like it was in LATIN1 to other clients (e.g. psql) if they
don't explicitly set an encoding.
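
The lookup order I have in mind could be sketched like this in Python.
It is purely hypothetical: the per-database default does not exist,
and effective_client_encoding and its parameters are names made up for
this example.

```python
def effective_client_encoding(db_encoding,
                              db_default_client_encoding=None,
                              client_requested=None):
    """Pick the encoding a client sees (sketch of the proposed behaviour).

    Order of precedence, as an assumption of how the proposal would work:
    1. an encoding the client set explicitly,
    2. the database's default client encoding, if one was given,
    3. the database's own encoding.
    """
    if client_requested is not None:
        return client_requested
    if db_default_client_encoding is not None:
        return db_default_client_encoding
    return db_encoding
```

With the database from the example, a client that sets nothing would
get LATIN1, while one that explicitly asks for UNICODE would get the
stored UTF-8 strings unchanged.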


I'd vote for b), because I think there is a general movement towards
Unicode anyway.

cu
    Reinhard

