Re: Speed up ICU case conversion by using ucasemap_utf8To*() - Mailing list pgsql-hackers

From Jeff Davis
Subject Re: Speed up ICU case conversion by using ucasemap_utf8To*()
Date
Msg-id 72c7c2b5848da44caddfe0f20f6c7ebc7c0c6e60.camel@j-davis.com
List pgsql-hackers
On Fri, 2024-12-20 at 06:20 +0100, Andreas Karlsson wrote:
> SELECT count(upper) FROM (SELECT upper(('Kålhuvud ' || i) COLLATE
> "sv-SE-x-icu") FROM generate_series(1, 1000000) i);
>
> master:  ~540 ms
> Patched: ~460 ms
> glibc:   ~410 ms

It looks like you are opening and closing the UCaseMap object each
time. Why not save it in pg_locale_t? That should speed it up even more
and hopefully beat libc.
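
A minimal sketch of the caching idea, not the actual patch. To keep it compilable without ICU headers, DemoCaseMap and demo_casemap_open() are stand-ins for UCaseMap and ucasemap_open(), and demo_locale is a hypothetical, much-simplified substitute for pg_locale_t. The point is only that the handle is opened on first use and then reused for every later call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for ICU's UCaseMap (assumption: the real type is opaque). */
typedef struct DemoCaseMap
{
    const char *locale;
} DemoCaseMap;

/* Counts how often the "expensive" open actually runs. */
static int demo_open_calls = 0;

/* Stand-in for ucasemap_open(); a real version would call into ICU here. */
static DemoCaseMap *
demo_casemap_open(const char *locale)
{
    DemoCaseMap *csm = malloc(sizeof(DemoCaseMap));

    assert(csm != NULL);
    csm->locale = locale;
    demo_open_calls++;
    return csm;
}

/* Hypothetical simplified locale struct; NOT the real pg_locale_t layout. */
typedef struct demo_locale
{
    const char  *icu_locale;
    DemoCaseMap *casemap;       /* NULL until first use, then reused */
} demo_locale;

/* Return the cached case map, opening it only on the first call. */
static DemoCaseMap *
demo_get_casemap(demo_locale *loc)
{
    assert(loc != NULL);
    if (loc->casemap == NULL)
        loc->casemap = demo_casemap_open(loc->icu_locale);
    return loc->casemap;
}
```

With this shape, a million upper() calls against the same locale would pay for one open instead of a million open/close pairs. Where the handle gets closed (if ever) depends on the lifetime of the locale object and is left out here.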


Also, to support older ICU versions consistently, we need to fix up the
locale name to support "und"; cf. pg_ucol_open(). Perhaps factor out
that logic?
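
A rough sketch of what such a factored-out helper might look like. As far as I recall, pg_ucol_open() works around older ICU releases that do not accept "und" as a spelling of the root locale by rewriting a leading "und" to "root"; that recollection is an assumption here, and the real code relies on ICU's own locale parsing and a version check that are not reproduced. demo_fix_locale_name() is a hypothetical name, and it uses malloc() where backend code would use palloc().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: return a malloc'd copy of loc_str in which a leading
 * "und" language subtag is spelled "root" instead. The caller frees the
 * result. Any other locale name is copied unchanged.
 */
static char *
demo_fix_locale_name(const char *loc_str)
{
    const size_t und_len = strlen("und");
    char        *fixed;

    assert(loc_str != NULL);

    /* Match "und" only as a whole subtag, not as a prefix of e.g. "undo". */
    if (strncmp(loc_str, "und", und_len) == 0 &&
        (loc_str[und_len] == '\0' || loc_str[und_len] == '-' ||
         loc_str[und_len] == '_' || loc_str[und_len] == '@'))
    {
        const char *rest = loc_str + und_len;

        fixed = malloc(strlen("root") + strlen(rest) + 1);
        assert(fixed != NULL);
        strcpy(fixed, "root");
        strcat(fixed, rest);
        return fixed;
    }

    fixed = malloc(strlen(loc_str) + 1);
    assert(fixed != NULL);
    strcpy(fixed, loc_str);
    return fixed;
}
```

Both the collator-opening path and the case-map-opening path could then pass the locale name through the same helper before handing it to ICU.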

Regards,
    Jeff Davis



