Re: Unicode problems on IRC

From: Tom Lane
Subject: Re: Unicode problems on IRC
Date: ,
Msg-id: 25403.1113103724@sss.pgh.pa.us
In response to: Re: Unicode problems on IRC  ("John Hansen")
Responses: Re: Unicode problems on IRC  (Bruce Momjian)
List: pgsql-hackers

"John Hansen" writes:
>> That is backpatched to 8.0.X.  Does that not fix the problem reported?

> No, as Andrew said, what this patch does is allow values > 0xffff while
> also validating the input to make sure it's valid UTF-8.

The impression I get is that most of the 'Unicode characters above
0x10000' reports we've seen did not come from people who actually needed
more-than-16-bit Unicode codepoints, but from people who had screwed up
their encoding settings and were trying to tell the backend that Latin1
was Unicode or some such.  So I'm a bit worried that extending the
backend support to full 32-bit Unicode will do more to mask encoding
mistakes than it will do to create needed functionality.
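
To make the failure mode concrete, here is a minimal Python sketch (not
backend code; the helper names utf8_sequence_length and is_valid_utf8 are
made up for illustration). It assumes a client that sends the Latin1 bytes
for "año" while claiming they are UTF-8. The Latin1 byte for "ñ" is 0xF1,
which a UTF-8 decoder reads as the lead byte of a 4-byte sequence, that is,
a codepoint at or above 0x10000. Strict validation rejects those bytes,
while a genuine supplementary character such as U+1D11E passes.

```python
def utf8_sequence_length(lead: int) -> int:
    """Sequence length a UTF-8 decoder infers from a lead byte (0 if not a valid lead)."""
    if lead < 0x80:
        return 1          # plain ASCII
    if 0xC2 <= lead <= 0xDF:
        return 2          # U+0080 .. U+07FF
    if 0xE0 <= lead <= 0xEF:
        return 3          # U+0800 .. U+FFFF
    if 0xF0 <= lead <= 0xF4:
        return 4          # U+10000 .. U+10FFFF
    return 0              # continuation byte or never-valid byte


def is_valid_utf8(data: bytes) -> bool:
    """Strict check using Python's own UTF-8 decoder."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# Latin1 text mislabelled as UTF-8: 'a', 0xF1 ('ñ' in Latin1), 'o'.
latin1_bytes = "a\u00f1o".encode("latin-1")

# A real codepoint above 0xFFFF (MUSICAL SYMBOL G CLEF), properly encoded.
clef_bytes = "\U0001D11E".encode("utf-8")

for label, raw in (("Latin1 sent as UTF-8", latin1_bytes),
                   ("genuine U+1D11E", clef_bytes)):
    print(label, raw.hex(), "valid" if is_valid_utf8(raw) else "invalid")
```

In the first case the 0xF1 byte only looks like the start of a
supplementary character; the byte after it is not a continuation byte, so
validation catches the encoding mistake instead of masking it.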

Not that I'm against adding the functionality.  I'm just doubtful that
the reports we've seen really indicate that we need it, or that adding
it will cut down on the incidence of complaints :-(
        regards, tom lane


