Re: improve Chinese locale performance - Mailing list pgsql-hackers

From	Robert Haas
Subject	Re: improve Chinese locale performance
Date	July 23, 2013 14:34:27
Msg-id	CA+Tgmob8UxfNDc1gyX=7tPLtcaDcYzHLhSrDAkGkNq8-0YaJfA@mail.gmail.com
In response to	Re: improve Chinese locale performance (Craig Ringer <craig@2ndquadrant.com>)
Responses	Re: improve Chinese locale performance
List	pgsql-hackers

On Tue, Jul 23, 2013 at 9:42 AM, Craig Ringer <craig@2ndquadrant.com> wrote:
> (Replying on phone, please forgive bad quoting)
>
> Isn't this pretty much what adopting ICU is supposed to give us? OS-independent collations?

Yes.

> I'd be interested in seeing the test data for this performance report, partly as I'd like to see how ICU collations
> would compare when ICU is crudely hacked into place for testing.

I pretty much lost interest in ICU upon reading that they use UTF-16
as their internal format.

http://userguide.icu-project.org/strings#TOC-Strings-in-ICU

What that would mean for us is that instead of copying the input
strings into a temporary buffer and passing the buffer to strcoll(),
we'd need to convert them to ICU's representation (which means writing
twice as many bytes as the length of the input string in cases where
the input string is mostly single-byte characters) and then call ICU's
strcoll() equivalent.  I agree that it might be worth testing, but I
can't work up much optimism.  It seems to me that something that
operates directly on the server encoding could run a whole lot faster.
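
To make the byte-count point concrete, here is a minimal sketch that
compares the size of the same string in UTF-8 and in UTF-16.  It
assumes the server encoding is UTF-8, uses Python's standard codecs
rather than ICU itself, and the helper name encoded_sizes is
hypothetical; it only illustrates the size of the converted buffer,
not the cost of the comparison.

```python
from typing import Tuple


def encoded_sizes(text: str) -> Tuple[int, int]:
    """Return (UTF-8 byte count, UTF-16 byte count) for the same text.

    Assumption: UTF-8 stands in for the server encoding, and
    UTF-16 (little-endian, no BOM) stands in for ICU's internal format.
    """
    utf8_len = len(text.encode("utf-8"))
    utf16_len = len(text.encode("utf-16-le"))
    return utf8_len, utf16_len


if __name__ == "__main__":
    for sample in ("postgresql", "数据库"):
        utf8_len, utf16_len = encoded_sizes(sample)
        print(f"{sample!r}: UTF-8 {utf8_len} bytes, UTF-16 {utf16_len} bytes")
```

For mostly single-byte input the UTF-16 buffer is twice the size of
the original.  Text made up of three-byte UTF-8 characters comes out
smaller in UTF-16, but the extra conversion pass over both strings is
still needed before each comparison.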

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
