Re: Unicode problems on IRC

From: Bruce Momjian
Subject: Re: Unicode problems on IRC
Date: ,
Msg-id: 200504100332.j3A3WdR20840@candle.pha.pa.us
In response to: Re: Unicode problems on IRC  (Tom Lane)
List: pgsql-hackers

Tom Lane wrote:
> "John Hansen" writes:
> >> That is backpatched to 8.0.X.  Does that not fix the problem reported?
> 
> > No, as andrew said, what this patch does, is allow values > 0xffff and
> > at the same time validates the input to make sure it's valid utf8.
> 
> The impression I get is that most of the 'Unicode characters above
> 0x10000' reports we've seen did not come from people who actually needed
> more-than-16-bit Unicode codepoints, but from people who had screwed up
> their encoding settings and were trying to tell the backend that Latin1
> was Unicode or some such.  So I'm a bit worried that extending the
> backend support to full 32-bit Unicode will do more to mask encoding
> mistakes than it will do to create needed functionality.

Yes, that was my impression too.

The upper/lower/initcap issue was that some operating systems were
testing Unicode values even if the locale was set to C.  That is fixed
in 8.0.2, but I now see this is a different problem.

> Not that I'm against adding the functionality.  I'm just doubtful that
> the reports we've seen really indicate that we need it, or that adding
> it will cut down on the incidence of complaints :-(

Yeah, that was my question too.  I figured Japanese or Chinese users
would be using these longer values, and if they are fine, why are others
having problems?  It would be great to find a test case that actually
shows the failure.
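
As a starting point for such a test case, here is a minimal sketch in
Python that separates the two situations discussed above: Latin1 bytes
mislabeled as Unicode, and a genuine code point above 0xffff.  It is
not backend code and does not reproduce the backend's own checks; the
function name classify and the sample strings are made up for
illustration, and it relies only on Python's strict UTF-8 decoder.

```python
def classify(raw: bytes) -> str:
    """Report how a strict UTF-8 validator would see a byte string."""
    try:
        # Strict decoding raises on any malformed UTF-8 sequence.
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        return "invalid UTF-8"
    # Code points above 0xFFFF need four bytes in UTF-8.
    if any(ord(ch) > 0xFFFF for ch in text):
        return "valid UTF-8, beyond 16 bits"
    return "valid UTF-8, within 16 bits"


# Hypothetical sample inputs, chosen only for illustration.
latin1_bytes = "café".encode("latin-1")      # Latin1 data sent as if it were UTF-8
bmp_bytes = "café".encode("utf-8")           # the same text, correctly encoded
astral_bytes = "\U00020000".encode("utf-8")  # a CJK Extension B character

for label, raw in [("latin1", latin1_bytes),
                   ("bmp", bmp_bytes),
                   ("astral", astral_bytes)]:
    print(label, raw, classify(raw))
```

If the failing input falls in the first category, the encoding settings
are the likely culprit; only the last category actually needs
more-than-16-bit support.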

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
                                       |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
 


