Performance problems with Thai language - Mailing list pgsql-performance

From Andrey Zhidenkov
Subject Performance problems with Thai language
Date
Msg-id CAJw4d1Wf1hP4sHKwWnxauEiwi=WL6_YLw58GZ1Be=gLOPMiAOw@mail.gmail.com
List pgsql-performance
We have run into an issue in our PostgreSQL 9.5.13 cluster: inserts into
a btree index are too slow when the strings contain Thai characters. A
test script and its results are in the attachment. The test shows that
inserting Thai strings into the index is more than 60 times slower than
inserting Chinese or Russian strings, for example. Tracing with perf
showed that the problem is in the strcoll_l() libc function (see
thai-slow.svg). This function is used when the locale is different from
'C'. For the 'C' locale only a simple comparison is used (see
thai-fast.graph) and performance is OK.
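
The difference between the two comparison paths can be approximated
outside PostgreSQL. The following is a minimal Python sketch, not the
attached test script: it sorts generated strings once with
locale.strcoll() (which goes through libc collation) and once with plain
code point comparison (similar in spirit to the 'C' locale). The helper
names, the alphabets, and the locale name "en_US.UTF-8" are assumptions
made for illustration; timings will depend on the libc version and on
which locales are installed.

```python
import functools
import locale
import random
import timeit

# Illustrative alphabets (assumption): single code point letters only.
THAI = "กขคงจฉชซญดตถทนบปผพฟมยรลวสหอ"
RUSSIAN = "абвгдежзиклмнопрстуфхцчшщыэюя"


def make_strings(alphabet, count=1000, length=20, seed=0):
    """Build a deterministic list of pseudo-random strings from `alphabet`."""
    rng = random.Random(seed)
    return ["".join(rng.choice(alphabet) for _ in range(length))
            for _ in range(count)]


def try_setlocale(name):
    """Switch LC_COLLATE to `name`; return False if it is not available."""
    try:
        locale.setlocale(locale.LC_COLLATE, name)
        return True
    except locale.Error:
        return False


def time_sort(strings, use_strcoll, repeat=3):
    """Return the best time in seconds to sort `strings` once."""
    # With use_strcoll every comparison calls libc collation;
    # otherwise Python compares code points directly.
    key = functools.cmp_to_key(locale.strcoll) if use_strcoll else None
    timer = timeit.Timer(lambda: sorted(strings, key=key))
    return min(timer.repeat(repeat=repeat, number=1))


if __name__ == "__main__":
    # "en_US.UTF-8" is an assumed locale name; use the cluster's own locale.
    have_locale = try_setlocale("en_US.UTF-8")
    for label, alphabet in (("thai", THAI), ("russian", RUSSIAN)):
        data = make_strings(alphabet)
        print(label, "code point:", time_sort(data, use_strcoll=False))
        if have_locale:
            print(label, "strcoll:   ", time_sort(data, use_strcoll=True))
```

If the slowdown is in libc collation, the "strcoll" figure for the Thai
strings should stand out from the other three, but this is only a rough
indicator and not a substitute for profiling the server itself.
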
Of course, I googled and thought that it was a bug in glibc
(https://sourceware.org/bugzilla/show_bug.cgi?id=18441), but when I
tried previous versions of glibc (2.19 and 2.13) I found that the
problem still reproduced. I know that I can upgrade to PostgreSQL 10 and
use libicu for string comparison, but is there any way to fix this in
PostgreSQL 9.5.13?

P.S. I can specify COLLATE "C" for this column when creating it, but
that looks a little bit tricky.
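
For context, a column declared with COLLATE "C" is compared byte by
byte, so its sort order differs from a linguistic collation. The
following minimal Python sketch (not PostgreSQL code) illustrates that
ordering, assuming UTF-8 encoded text; the helper name and the sample
words are hypothetical.

```python
def c_collation_key(text):
    """Sort key that mimics byte-wise comparison of UTF-8 text."""
    return text.encode("utf-8")


words = ["banana", "Apple", "apple", "Banana"]

# Byte-wise order puts all uppercase ASCII letters before lowercase ones.
byte_order = sorted(words, key=c_collation_key)
print(byte_order)  # ['Apple', 'Banana', 'apple', 'banana']
```

A linguistic collation would typically keep the two spellings of each
word next to each other, so queries that rely on ORDER BY or range
comparisons on such a column may return rows in a different order.
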

-- 
With best regards, Andrey Zhidenkov

Attachment
