Re: UTF8 conversion differences from v8.1.3 to v8.1.4 - Mailing list pgsql-general

From Martijn van Oosterhout
Subject Re: UTF8 conversion differences from v8.1.3 to v8.1.4
Msg-id 20060719103556.GC31786@svana.org
In response to Re: UTF8 conversion differences from v8.1.3 to v8.1.4  (Eric Faulhaber <ecf@goldencode.com>)
Responses Re: UTF8 conversion differences from v8.1.3 to v8.1.4  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-general
On Tue, Jul 18, 2006 at 08:03:51PM -0400, Eric Faulhaber wrote:
> > It's not a defect ... or at least, it doesn't make sense to change it
> > unless you are willing to go through the entire system to make it able
> > to store null bytes in text.  We've looked at that in the past and
> > always concluded that it was completely impractical :-(
>
> :-( indeed, though I appreciate the dialog, Tom.  Sadly, this would not
> be the first completely impractical task on my todo list ;-)

It's a pity PostgreSQL doesn't handle null bytes in strings. Perl, for
example, handles them just fine, but I imagine they've reimplemented
many of the string functions themselves anyway.

Looking at the code, it doesn't appear that there are too many
problematic places. The real killers, though, are regex matching and
sorting, which both expect null-terminated strings. Sorting could be
dealt with using ICU, which doesn't treat the zero code point specially.
But after that, there are probably others too. I suppose a concerted
effort would be needed to make it work properly.
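
To illustrate why null-terminated interfaces are the sticking point,
here is a minimal Python sketch. It is not PostgreSQL code, and the
helper name c_view is made up for illustration. It uses ctypes to show
what a C-style string API would see of a value with an embedded zero
byte, compared with a length-aware comparison:

```python
import ctypes

def c_view(data: bytes) -> bytes:
    """Return the part of `data` a NUL-terminated C API would see.

    Hypothetical helper for illustration: the ctypes buffer holds all
    the bytes, but `.value` stops at the first zero byte.
    """
    return ctypes.create_string_buffer(data).value

a = b"abc\x00xyz"
b = b"abc\x00def"

# Length-aware comparison sees two different 7-byte strings.
length_aware_equal = (a == b)

# A NUL-terminated view stops at the first zero byte, so both values
# look like b"abc" and compare as equal.
c_style_equal = (c_view(a) == c_view(b))

# Length-aware sorting can still order them by the bytes after the NUL.
ordered = sorted([a, b])

print(len(a), len(c_view(a)))
print(length_aware_equal, c_style_equal)
```

Any comparison or matching routine that takes only a pointer, with no
explicit length, would have the same blind spot, which is presumably why
each such call site would need to be found and changed.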

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.
