Re: improve Chinese locale performance - Mailing list pgsql-hackers

From	Martijn van Oosterhout
Subject	Re: improve Chinese locale performance
Date	July 28, 2013 09:40:15
Msg-id	20130728093940.GA5652@svana.org
In response to	Re: improve Chinese locale performance (Robert Haas <robertmhaas@gmail.com>)
Responses	Re: improve Chinese locale performance
List	pgsql-hackers

On Tue, Jul 23, 2013 at 10:34:21AM -0400, Robert Haas wrote:
> I pretty much lost interest in ICU upon reading that they use UTF-16
> as their internal format.
>
> http://userguide.icu-project.org/strings#TOC-Strings-in-ICU

The UTF-8 support has been steadily improving:
  For example, icu::Collator::compareUTF8() compares two UTF-8 strings
  incrementally, without converting all of the two strings to UTF-16 if
  there is an early base letter difference.

http://userguide.icu-project.org/strings/utf-8

For all other encodings you should be able to use an iterator. As to
performance I have no idea.

The main issue with strxfrm() is its lame API. If it supported
returning prefixes you'd be set, but as it is you need >10MB of memory
just to transform a 10MB string, even if only the first few characters
would be enough to sort...
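
To make the strxfrm() complaint concrete, here is a minimal sketch in plain
standard C. The helper names make_sort_key() and compare_via_keys() are made
up for illustration and are not taken from PostgreSQL or any library. The
point is that the only sizing information strxfrm() gives you is the length
of the full key, so the buffer has to cover the whole input:

```c
#include <assert.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: build a malloc'd sort key for s under the current
 * LC_COLLATE locale. Returns NULL if allocation fails. strxfrm() cannot be
 * asked for only a prefix of the key, so we must size the buffer for the
 * transformation of the entire string. */
char *make_sort_key(const char *s, size_t *key_len)
{
    assert(s != NULL);

    /* With a size of 0, strxfrm() writes nothing and only reports the key
     * length required (not counting the terminating NUL). */
    size_t needed = strxfrm(NULL, s, 0);

    char *key = malloc(needed + 1);
    if (key == NULL)
        return NULL;

    /* Second pass: actually transform the whole string. */
    strxfrm(key, s, needed + 1);
    if (key_len != NULL)
        *key_len = needed;
    return key;
}

/* Hypothetical helper: compare two strings through their sort keys. The
 * sign of the result matches strcoll(a, b). Both keys are built in full
 * even when the first byte would already decide the order. For brevity
 * this sketch returns 0 if an allocation fails. */
int compare_via_keys(const char *a, const char *b)
{
    char *ka = make_sort_key(a, NULL);
    char *kb = make_sort_key(b, NULL);
    int result = 0;

    if (ka != NULL && kb != NULL)
        result = strcmp(ka, kb);

    free(ka);
    free(kb);
    return result;
}
```

A caller would normally do setlocale(LC_COLLATE, "") first; without it the
program stays in the "C" locale, where the key is just a copy of the input.
Either way the key is never shorter than what strxfrm() reports for the
whole string, which is where the memory cost for a 10MB input comes from.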

Mvg,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> He who writes carelessly confesses thereby at the very outset that he does
> not attach much importance to his own thoughts.  -- Arthur Schopenhauer
