Re: improve Chinese locale performance - Mailing list pgsql-hackers

From Andrew Dunstan
Subject Re: improve Chinese locale performance
Msg-id 51ED6E66.4040803@dunslane.net
In response to Re: improve Chinese locale performance  (Greg Stark <stark@mit.edu>)
List pgsql-hackers
On 07/22/2013 12:49 PM, Greg Stark wrote:
> On Mon, Jul 22, 2013 at 12:50 PM, Peter Eisentraut <peter_e@gmx.net> wrote:
>> I think part of the problem is that we call strcoll for each comparison,
>> instead of doing strxfrm once for each datum and then just strcmp for
>> each comparison.  That is effectively equivalent to what the proposal
>> implements.
> Fwiw I used to be a big proponent of using strxfrm. But upon further
> analysis I realized it was a real difficult tradeoff. strxfrm saves
> potentially a lot of cpu cost but at the expense of expanding the size
> of the sort key. If the sort spills to disk or even if it's just
> memory bandwidth limited it might actually be slower than doing the
> additional cpu work of calling strcoll.
>
> It's hard to see how to decide in advance which way will be faster. I
> suspect strxfrm is still the better bet, especially for complex large
> character set based locales like Chinese. strcoll might still win by a
> large margin on simple mostly-ascii character sets.
>
>


Perhaps we need a bit of performance testing to prove the point.

Maybe the behaviour should be locale-dependent.
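
As a starting point for such a test, here is a rough standalone sketch of the two strategies being compared. It is not PostgreSQL code: the names (keyed_str, sort_with_strcoll, sort_with_strxfrm, xfrm_key_bytes) are made up for illustration, and it only assumes the standard C strcoll/strxfrm/qsort interfaces. xfrm_key_bytes reports how much storage a transformed key needs, which is the size expansion Greg is worried about.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper type: a datum plus its precomputed sort key. */
typedef struct {
    const char *orig;  /* borrowed pointer to the caller's string */
    char *key;         /* malloc'd strxfrm() output, owned by this entry */
} keyed_str;

/* strcoll approach: locale-aware work is repeated on every comparison. */
static int cmp_strcoll(const void *a, const void *b)
{
    const char *const *sa = a;
    const char *const *sb = b;
    return strcoll(*sa, *sb);
}

/* strxfrm approach: comparisons are plain byte compares of the keys. */
static int cmp_key(const void *a, const void *b)
{
    const keyed_str *ka = a;
    const keyed_str *kb = b;
    return strcmp(ka->key, kb->key);
}

/* Bytes needed to store the transformed key for s, including the NUL. */
static size_t xfrm_key_bytes(const char *s)
{
    return strxfrm(NULL, s, 0) + 1;
}

/* Sort the pointer array in place, calling strcoll per comparison. */
static void sort_with_strcoll(const char **strs, size_t n)
{
    qsort(strs, n, sizeof *strs, cmp_strcoll);
}

/*
 * Sort the pointer array in place, calling strxfrm once per datum.
 * Returns 0 on success, -1 if memory allocation fails.
 */
static int sort_with_strxfrm(const char **strs, size_t n)
{
    keyed_str *items;
    size_t i;
    int rc = -1;

    assert(strs != NULL);
    items = calloc(n ? n : 1, sizeof *items);  /* zeroed, so keys start NULL */
    if (items == NULL)
        return -1;

    for (i = 0; i < n; i++) {
        size_t need = xfrm_key_bytes(strs[i]);

        items[i].orig = strs[i];
        items[i].key = malloc(need);
        if (items[i].key == NULL)
            goto done;
        strxfrm(items[i].key, strs[i], need);
    }

    qsort(items, n, sizeof *items, cmp_key);
    for (i = 0; i < n; i++)
        strs[i] = items[i].orig;
    rc = 0;

done:
    for (i = 0; i < n; i++)
        free(items[i].key);  /* free(NULL) is a no-op for unfilled slots */
    free(items);
    return rc;
}
```

A real test would call setlocale(LC_COLLATE, "") (or name a specific locale such as a Chinese one, if installed) before sorting, time both functions over the same input, and compare the sum of xfrm_key_bytes() against the sum of strlen() + 1 to see how much the keys grow in that locale.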

cheers

andrew



