Re: libpgtcl doesn't use UTF encoding of TCL - Mailing list pgsql-bugs

From Reinhard Max
Subject Re: libpgtcl doesn't use UTF encoding of TCL
Msg-id Pine.LNX.4.33.0109061021170.23831-100000@wotan.suse.de
In response to Re: libpgtcl doesn't use UTF encoding of TCL  (Bruce Momjian <pgman@candle.pha.pa.us>)
List pgsql-bugs
Hi Bruce,

On Wed, 5 Sep 2001, Bruce Momjian wrote:

> I have a patch here that handles all the TCL/UTF issues.
> Would you let me know if it is OK?

I think it isn't really a clean fix. It only works if your
database's encoding and Tcl's system encoding are identical. If the
database uses a different encoding than Tcl, you still end up with
wrong characters. Also, the configure switch (if needed at all)
should IMHO be a disable switch, because the conversion is mandatory
for Tcl >= 8.1 unless someone really knows that he won't have any
8-bit characters in his database. So fewer people would get bitten if
UTF conversion were enabled by default for Tcl >= 8.1.
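
To illustrate why 8-bit characters need the conversion, here is a minimal
sketch, assuming purely for illustration that the database encoding is
ISO 8859-1. The helper name latin1_to_utf8 is hypothetical and not part of
the patch or of Tcl. A byte such as 0xE4 is not a valid UTF-8 sequence on
its own, so Tcl >= 8.1 would misread it unless it is first expanded to the
two bytes 0xC3 0xA4:

```c
#include <stddef.h>

/*
 * Hypothetical helper: convert a NUL-terminated ISO 8859-1 string to UTF-8.
 * dst must have room for 2 * strlen(src) + 1 bytes.
 * Returns the number of bytes written, not counting the terminating NUL.
 */
size_t latin1_to_utf8(const char *src, char *dst)
{
    const unsigned char *s = (const unsigned char *) src;
    unsigned char *d = (unsigned char *) dst;

    while (*s != '\0') {
        if (*s < 0x80) {
            /* 7-bit ASCII is identical in both encodings. */
            *d++ = *s;
        } else {
            /* 0x80..0xFF becomes a two-byte UTF-8 sequence. */
            *d++ = (unsigned char) (0xC0 | (*s >> 6));
            *d++ = (unsigned char) (0x80 | (*s & 0x3F));
        }
        s++;
    }
    *d = '\0';
    return (size_t) (d - (unsigned char *) dst);
}
```

For any other database encoding the mapping differs, which is why a
conversion tied only to Tcl's system encoding is not enough.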

Besides these flaws, I think the patch could be simpler and avoid the
UTF_BEGIN and UTF_END macros if UTF_U2E and UTF_E2U were (maybe
inlined) functions and defined like this (untested):

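#include <stdlib.h>
#include <tcl.h>

/* Convert a UTF-8 string to Tcl's system encoding. The result lives in a
 * static buffer that is overwritten by the next call (not reentrant). */
char* UTF_U2E(CONST char * source)
{
    static Tcl_DString *destPtr = NULL;

    if (destPtr == NULL) {
        /* First call: Tcl_UtfToExternalDString initializes the DString. */
        destPtr = (Tcl_DString *) malloc(sizeof(Tcl_DString));
        if (destPtr == NULL)
            return NULL;
    } else {
        /* Later calls: release the previous result before reusing it. */
        Tcl_DStringFree(destPtr);
    }
    return Tcl_UtfToExternalDString(NULL, source, -1, destPtr);
}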


See also the mail I sent to pgsql-patches last Tuesday on the same
topic.

In addition to my suggestion there to require the database to be
UNICODE for Tcl >= 8.1, I just had another idea for how it could be
solved cleanly:

What about making --enable-unicode-conversion mandatory when
PostgreSQL gets compiled with Tcl support and changing PgTcl and
PL/Tcl to set their client encoding to UNICODE at _runtime_ when they
find themselves running with a Tcl interpreter that needs UTF-8 (i.e.
Tcl >= 8.1)?

Going this way, we could even retain binary compatibility for PgTcl
and PL/Tcl with Tcl versions both before and after Tcl's move to UTF-8.
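
The runtime decision could look roughly like the following sketch. Both
helper names are hypothetical, and the version numbers are passed in as
plain integers so the logic stands on its own. In PgTcl or PL/Tcl they
would presumably come from Tcl itself (for example via Tcl_GetVersion),
and the resulting name would be passed on to something like libpq's
PQsetClientEncoding; that wiring is an assumption and is not shown here.

```c
#include <stddef.h>

/*
 * Hypothetical helper: does this Tcl version use UTF-8 internally?
 * Tcl moved to UTF-8 strings with version 8.1.
 */
int tcl_needs_utf8(int major, int minor)
{
    return major > 8 || (major == 8 && minor >= 1);
}

/*
 * Hypothetical helper: the client encoding to request at runtime,
 * or NULL to leave the connection's encoding untouched.
 */
const char *client_encoding_for_tcl(int major, int minor)
{
    return tcl_needs_utf8(major, minor) ? "UNICODE" : NULL;
}
```

Because the check happens at runtime rather than at configure time, the
same binary could be loaded into either kind of interpreter.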

One question remains here: Do --enable-multibyte and
--enable-unicode-conversion have any downsides (besides a larger
executable) if they are compiled in but not used?

cu
    Reinhard
