Re: Unicode problems on IRC - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Unicode problems on IRC
Date
Msg-id 25403.1113103724@sss.pgh.pa.us
In response to Re: Unicode problems on IRC  ("John Hansen" <john@geeknet.com.au>)
Responses Re: Unicode problems on IRC  (Bruce Momjian <pgman@candle.pha.pa.us>)
List pgsql-hackers
"John Hansen" <john@geeknet.com.au> writes:
>> That is backpatched to 8.0.X.  Does that not fix the problem reported?

> No, as andrew said, what this patch does, is allow values > 0xffff and
> at the same time validates the input to make sure it's valid utf8.

The impression I get is that most of the 'Unicode characters above
0x10000' reports we've seen did not come from people who actually needed
more-than-16-bit Unicode codepoints, but from people who had screwed up
their encoding settings and were trying to tell the backend that Latin1
was Unicode or some such.  So I'm a bit worried that extending the
backend support to full 32-bit Unicode will do more to mask encoding
mistakes than it will do to create needed functionality.
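
[Editorial illustration, not part of the original message.] A minimal sketch of how mislabeled Latin1 input can look like a request for codepoints above 0xFFFF. It rests on an assumption this thread does not state: that a checker judges a character's length from its UTF-8 lead byte alone. The helper names `lead_byte_length` and `is_valid_utf8` are hypothetical and are not PostgreSQL functions.

```python
# Hypothetical illustration: Latin1 bytes misread as UTF-8.
# None of these names come from the PostgreSQL source.

def lead_byte_length(b: int) -> int:
    """Sequence length implied by a UTF-8 lead byte (0 if b cannot start one)."""
    if b < 0x80:
        return 1          # ASCII
    if 0xC2 <= b <= 0xDF:
        return 2          # codepoints U+0080..U+07FF
    if 0xE0 <= b <= 0xEF:
        return 3          # codepoints U+0800..U+FFFF
    if 0xF0 <= b <= 0xF4:
        return 4          # codepoints U+10000 and above
    return 0              # continuation byte, or a byte never used as a lead


def is_valid_utf8(data: bytes) -> bool:
    """Strict validation using Python's own UTF-8 decoder."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


text = "se\u00f1or"                     # "senor" with n-tilde, U+00F1
latin1_bytes = text.encode("latin-1")   # b'se\xf1or'
utf8_bytes = text.encode("utf-8")       # b'se\xc3\xb1or'

# In Latin1 the n-tilde is the single byte 0xF1. Read as UTF-8, 0xF1 is the
# lead byte of a 4-byte sequence, i.e. a codepoint of 0x10000 or above.
print(hex(latin1_bytes[2]), lead_byte_length(latin1_bytes[2]))

# Strict validation rejects the Latin1 bytes, because the byte after 0xF1
# is not a continuation byte; the correctly encoded string passes.
print(is_valid_utf8(latin1_bytes), is_valid_utf8(utf8_bytes))
```

Under that assumption, a client that sends Latin1 while declaring Unicode gets a complaint about large codepoints even though every character involved is below 0x100. Full validation of the continuation bytes reports such input as invalid UTF-8 instead.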

Not that I'm against adding the functionality.  I'm just doubtful that
the reports we've seen really indicate that we need it, or that adding
it will cut down on the incidence of complaints :-(
        regards, tom lane

