Re: Unicode problems on IRC

From: Tom Lane
Subject: Re: Unicode problems on IRC
Msg-id: 28749.1113151193@sss.pgh.pa.us
In response to: Re: Unicode problems on IRC  (Andrew - Supernews)
Responses: Re: Unicode problems on IRC  (Oliver Jowett)
List: pgsql-hackers

Andrew - Supernews writes:
> On 2005-04-10, Tom Lane wrote:
>> The impression I get is that most of the 'Unicode characters above
>> 0x10000' reports we've seen did not come from people who actually needed
>> more-than-16-bit Unicode codepoints, but from people who had screwed up
>> their encoding settings and were trying to tell the backend that Latin1
>> was Unicode or some such.

> I think you will find that this impression is actually false. Or that at
> the very least, _correct_ verification of UTF-8 sequences will still
> catch essentially all cases of non-utf-8 input mislabelled as utf-8
> while allowing the full range of Unicode codepoints.

Yeah?  Cool.  Does John's proposed patch do it "correctly"?

http://candle.pha.pa.us/mhonarc/patches2/msg00076.html
        regards, tom lane
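
For readers following the thread, here is a rough sketch of what "correct" verification could look like. It is not John's patch and not the backend's code; it just follows the well-formed byte ranges from the Unicode standard (Table 3-7), and the function name is_valid_utf8 is made up for illustration. It accepts the full range up to U+10FFFF, including 4-byte sequences, while rejecting overlong forms, surrogates, and stray or truncated bytes, which is what tends to catch Latin1 data mislabelled as UTF-8.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if data is well-formed UTF-8 (illustrative sketch only)."""
    i, n = 0, len(data)
    while i < n:
        b0 = data[i]
        if b0 < 0x80:                      # 1 byte: U+0000..U+007F
            i += 1
            continue
        if 0xC2 <= b0 <= 0xDF:             # 2 bytes: U+0080..U+07FF
            need, lo, hi = 1, 0x80, 0xBF
        elif 0xE0 <= b0 <= 0xEF:           # 3 bytes: U+0800..U+FFFF
            need = 2
            lo = 0xA0 if b0 == 0xE0 else 0x80   # E0 80..9F would be overlong
            hi = 0x9F if b0 == 0xED else 0xBF   # ED A0..BF would be surrogates
        elif 0xF0 <= b0 <= 0xF4:           # 4 bytes: U+10000..U+10FFFF
            need = 3
            lo = 0x90 if b0 == 0xF0 else 0x80   # F0 80..8F would be overlong
            hi = 0x8F if b0 == 0xF4 else 0xBF   # F4 90..BF exceeds U+10FFFF
        else:
            # Stray continuation byte, overlong lead C0/C1, or F5..FF.
            return False
        if i + need >= n:                  # sequence truncated at end of input
            return False
        if not lo <= data[i + 1] <= hi:    # first continuation has a tighter range
            return False
        for k in range(2, need + 1):       # remaining continuations: 80..BF
            if not 0x80 <= data[i + k] <= 0xBF:
                return False
        i += need + 1
    return True
```

With this, is_valid_utf8("caf\u00e9".encode("latin-1")) is False, because the Latin1 byte 0xE9 looks like a 3-byte lead with no continuation bytes after it, while a character above U+FFFF such as "\U0001D11E".encode("utf-8") passes.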


