Re: Unicode problems on IRC - Mailing list pgsql-hackers

From John Hansen
Subject Re: Unicode problems on IRC
Date
Msg-id 5066E5A966339E42AA04BA10BA706AE5628D@rodrick.geeknet.com.au
In response to Unicode problems on IRC  (Christopher Kings-Lynne <chriskl@familyhealth.com.au>)
Responses Re: Unicode problems on IRC  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers

> -----Original Message-----
> From: pgsql-hackers-owner@postgresql.org
> [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Bruce Momjian
> Sent: Sunday, April 10, 2005 8:18 AM
> To: Christopher Kings-Lynne
> Cc: pgsql-hackers@postgresql.org
> Subject: Re: [HACKERS] Unicode problems on IRC
>
> Christopher Kings-Lynne wrote:
> > Hey guys,
> >
> > The 'Unicode characters above 0x10000' issue keeps rearing its ugly
> > head in the IRC channel.  I propose that it be fixed, even
> > backported...
> >
> > This is John Hansen's most recent patch to fix it:
> >
> > http://archives.postgresql.org/pgsql-patches/2004-11/msg00259.php
> >
> > And from what I can tell it was committed, then reverted because it
> > wasn't a "bug".  It was going to go in for 8.1.
> >
> > We on the channel are starting to think that it is in fact a bug.
> > There are people with legitimately utf-8 encoded XML documents
> > that they cannot store in PostgreSQL.  Apparently in the distant
> > past, Unicode was limited to 0x10000, but then was extended.
> >
> > Perhaps we can reopen this case...
>
> Uh, I thought we fixed this another way, by not using
> Unicode-aware functions for upper/lower/initcap when the
> locale is "C" or "POSIX".
> That is backpatched to 8.0.X.  Does that not fix the problem reported?

No. As Andrew said, this patch allows values > 0xffff and at the
same time validates the input to make sure it is valid UTF-8.
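
To make the two points concrete, here is a rough sketch of that kind of
check. It is not the code from the patch (which is C inside the backend);
the function names are made up for illustration, and the rules follow the
usual UTF-8 definition: sequences of up to 4 bytes, no overlong forms, no
surrogates, nothing above 0x10FFFF.

```python
# Illustrative sketch only; names are hypothetical, not PostgreSQL functions.

def utf8_sequence_length(lead: int) -> int:
    """Return the expected sequence length for a lead byte, or 0 if invalid."""
    if lead < 0x80:
        return 1
    if 0xC2 <= lead <= 0xDF:
        return 2
    if 0xE0 <= lead <= 0xEF:
        return 3
    if 0xF0 <= lead <= 0xF4:
        return 4          # 4-byte forms carry the code points above 0xFFFF
    return 0              # continuation byte, overlong lead C0/C1, or F5..FF


def is_valid_utf8(data: bytes) -> bool:
    """Check that data is well-formed UTF-8, including 4-byte sequences."""
    i, n = 0, len(data)
    while i < n:
        lead = data[i]
        length = utf8_sequence_length(lead)
        if length == 0 or i + length > n:
            return False  # bad lead byte or truncated sequence
        if length == 1:
            i += 1
            continue
        # Keep only the payload bits of the lead byte.
        cp = lead & (0xFF >> (length + 1))
        for b in data[i + 1:i + length]:
            if b & 0xC0 != 0x80:
                return False  # not a continuation byte
            cp = (cp << 6) | (b & 0x3F)
        # Smallest code point that legitimately needs this many bytes.
        min_cp = (0, 0, 0x80, 0x800, 0x10000)[length]
        if cp < min_cp or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
            return False  # overlong, out of range, or a surrogate
        i += length
    return True
```

With a check like this, U+1D11E (bytes F0 9D 84 9E) is accepted, while
truncated, overlong, or surrogate sequences are rejected.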


... John

