Thread: performance impact of non-C locale

performance impact of non-C locale

From
Axel Rau
Date:
Hi everyone,

Some ERP software requires a change of my pgsql cluster from
    locale C        encoding UTF-8
to
    locale de_DE.UTF-8    encoding UTF-8

Most of my databases have only ASCII text data (single-byte UTF-8 code range)
in the text columns.
Does the above change influence index performance on such columns?

Does postmaster keep track of any multibyte characters being inserted
in such columns, so that the planner can adapt?

What other performance impacts can be expected?

Axel
---


Re: performance impact of non-C locale

From
Peter Eisentraut
Date:
Axel Rau wrote:
> Some ERP software requires a change of my pgsql cluster from
>     locale C        encoding UTF-8
> to
>     locale de_DE.UTF-8    encoding UTF-8
>
> Most of my databases have only ASCII text data (single-byte UTF-8 code range)
> in the text columns.
> Does the above change influence index performance on such columns?

Yes.

> Does postmaster keep track of any multibyte characters being inserted in
> such columns, so that the planner can adapt?

No.

> What other performance impacts can be expected?

The performance impact is mainly with string comparisons and sorts.  I
suggest you run your own tests to find out what is acceptable in your
scenario.
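
A minimal sketch of such a test, to be run once in a cluster initialized
with locale C and once in one initialized with de_DE.UTF-8. The table
name, index names, row count, and search patterns are hypothetical and
only serve as an illustration; real data and real queries will give more
meaningful numbers.

```sql
-- Hypothetical test table; name and row count are assumptions.
CREATE TABLE locale_test AS
    SELECT md5(i::text) AS s
    FROM generate_series(1, 1000000) AS i;

ANALYZE locale_test;

-- Sort cost, which is dominated by string comparisons.
EXPLAIN ANALYZE SELECT s FROM locale_test ORDER BY s;

-- Building a B-tree index also sorts the column.
CREATE INDEX locale_test_s_idx ON locale_test (s);

-- Range lookup through the index.
EXPLAIN ANALYZE
    SELECT count(*) FROM locale_test WHERE s >= 'a' AND s < 'b';

-- In a non-C locale a plain B-tree index is not used for LIKE prefix
-- matches; the text_pattern_ops operator class exists for that case.
CREATE INDEX locale_test_s_pattern_idx
    ON locale_test (s text_pattern_ops);

EXPLAIN ANALYZE SELECT count(*) FROM locale_test WHERE s LIKE 'ab%';
```

Comparing the reported run times and plans between the two clusters
should show whether the difference matters for a given workload.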

Re: performance impact of non-C locale

From
Axel Rau
Date:
On 11.09.2008 at 11:29, Peter Eisentraut wrote:

>>
>> What other performance impacts can be expected?
>
> The performance impact is mainly with string comparisons and sorts.
> I suggest you run your own tests to find out what is acceptable in
> your scenario.
I'm not yet convinced that I should switch to a non-C locale. Is the
following the intended behavior?
With lc_ctype  C:           select lower('ÄÖÜ'); => ÄÖÜ
With lc_ctype  en_US.utf8:  select lower('ÆÅË'); => æåë
(Both have server encoding UTF8.)

Axel
---


Re: performance impact of non-C locale

From
Peter Eisentraut
Date:
Axel Rau wrote:
> I'm not yet convinced that I should switch to a non-C locale. Is the
> following the intended behavior?
> With lc_ctype  C:           select lower('ÄÖÜ'); => ÄÖÜ
> With lc_ctype  en_US.utf8:  select lower('ÆÅË'); => æåë
> (Both have server encoding UTF8.)

I would expect exactly that.
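
To see which settings are in effect when reproducing this, something
like the following can be run in each cluster. This is only a sketch;
the result of lower() depends on the lc_ctype setting, and with
lc_ctype C only the ASCII letters are case-folded.

```sql
-- Show the settings that govern encoding, case mapping, and sort order.
SHOW server_encoding;
SHOW lc_ctype;
SHOW lc_collate;

-- The test from the message above; the result depends on lc_ctype.
SELECT lower('ÄÖÜ');
```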