Thread: Slow performance of collate "en_US.utf8"

Slow performance of collate "en_US.utf8"

From
Alexey Borschev
Date:

Hi everyone!

I see poor performance of text sorting of collate "en_US.utf8" in PG 17.4.

Test query:

explain (analyze, costs, buffers, verbose)

SELECT

('БвЁжЫйяЙSёYz&$' || gen.id) collate "en_US.utf8"

FROM generate_series(1, 10000) AS gen(id)

order by 1 desc;

I've got execution time like:

Execution Time: 73.068 ms

Same poor result with ru_RU.UTF8.

With other collations time is much better:

explain (analyze, costs, buffers, verbose)

select

('БвЁжЫйяЙSёYz&$' || gen.id) collate "C"

from generate_series(1, 10000) AS gen(id)

order by 1 desc;

Execution Time: 4.792 ms

explain (analyze, costs, buffers, verbose)

SELECT

('БвЁжЫйяЙSёYz&$' || gen.id) collate "C.utf8"

FROM generate_series(1, 10000) AS gen(id)

order by 1 desc;

Execution Time: 7.473 ms

explain (analyze, costs, buffers, verbose)

select

('БвЁжЫйяЙSёYz&$' || gen.id) collate "und-x-icu"

from generate_series(1, 10000) AS gen(id)

order by 1 desc;

Execution Time: 13.282 ms

Yes, collate C is fastest, ICU collations is ~ 2 slower then C, but "en_US.utf8" is ~ 10x slower!

I suspect it is some performance issue over there.

Can someone of PG hackers reproduce this please?

1) This PG17 instance was installed with default options, and initdb got en_US.utf8 as system default collation and created PG cluster with it.

It seems like most PG databases are created this way with en_US.utf8 by default.

2) Typical text\varchar columns are created with DB default en_US.utf8 and performs poor.

explain (analyze, costs, buffers, verbose)

select

('БвЁжЫйяЙSёYz&$' || gen.id)

from generate_series(1, 10000) AS gen(id)

order by 1 desc;

Execution Time: 73.600 ms

3) The index search operations are also slower with en_US.utf8, but the difference is not as high.

Please see attached file with test table and indexes, with tests on index performance.

System Details:

postgres@borschev-pg-copydb1:~$ uname -a Linux borschev-pg-copydb1 6.1.0-7-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.20-1 (2023-03-19) x86_64 GNU/Linux

postgres@borschev-pg-copydb1:~$ cat /etc/issue Debian GNU/Linux trixie/sid \n \l

select version();

PostgreSQL 17.4 (Debian 17.4-1.pgdg110+2) on x86_64-pc-linux-gnu, compiled by gcc (Debian 10.2.1-6) 10.2.1 20210110, 64-bit

SELECT * FROM pg_config();

|name |setting |

|-----------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|

|BINDIR |/usr/lib/postgresql/17/bin |

|DOCDIR |/usr/share/doc/postgresql-doc-17 |

|HTMLDIR |/usr/share/doc/postgresql-doc-17 |

|INCLUDEDIR |/usr/include/postgresql |

|PKGINCLUDEDIR |/usr/include/postgresql |

|INCLUDEDIR-SERVER|/usr/include/postgresql/17/server |

|LIBDIR |/usr/lib/x86_64-linux-gnu |

|PKGLIBDIR |/usr/lib/postgresql/17/lib |

|LOCALEDIR |/usr/share/locale |

|MANDIR |/usr/share/postgresql/17/man |

|SHAREDIR |/usr/share/postgresql/17 |

|SYSCONFDIR |/etc/postgresql-common |

|PGXS |/usr/lib/postgresql/17/lib/pgxs/src/makefiles/pgxs.mk |

|CONFIGURE | '--build=x86_64-linux-gnu' '--prefix=/usr' '--includedir=${prefix}/include' '--mandir=${prefix}/share/man' '--infodir=${prefix}/share/info' '--sysconfdir=/etc' '--localstatedir=/var' '--disable-option-checking' '--disable-silent-rules' '--libdir=${prefix}/lib/x86_64-linux-gnu' '--runstatedir=/run' '--disable-maintainer-mode' '--disable-dependency-tracking' '--with-tcl' '--with-perl' '--with-python' '--with-pam' '--with-openssl' '--with-libxml' '--with-libxslt' '--mandir=/usr/share/postgresql/17/man' '--docdir=/usr/share/doc/postgresql-doc-17' '--sysconfdir=/etc/postgresql-common' '--datarootdir=/usr/share/' '--datadir=/usr/share/postgresql/17' '--bindir=/usr/lib/postgresql/17/bin' '--libdir=/usr/lib/x86_64-linux-gnu/' '--libexecdir=/usr/lib/postgresql/' '--includedir=/usr/include/postgresql/' '--with-extra-version= (Debian 17.4-1.pgdg110+2)' '--enable-nls' '--enable-thread-safety' '--enable-debug' '--disable-rpath' '--with-uuid=e2fs' '--with-gnu-ld' '--with-gssapi' '--with-ldap' '--with-pgport=5432' '--with-system-tzdata=/usr/share/zoneinfo' 'AWK=mawk' 'MKDIR_P=/bin/mkdir -p' 'PROVE=/usr/bin/prove' 'PYTHON=/usr/bin/python3' 'TAR=/bin/tar' 'XSLTPROC=xsltproc --nonet' 'CFLAGS=-g -O2 -fstack-protector-strong -Wformat -Werror=format-security -fno-omit-frame-pointer' 'LDFLAGS=-Wl,-z,relro -Wl,-z,now' '--enable-tap-tests' '--with-icu' '--with-llvm' 'LLVM_CONFIG=/usr/bin/llvm-config-16' 'CLANG=/usr/bin/clang-16' '--with-lz4' '--with-zstd' '--with-systemd' '--with-selinux' '--enable-dtrace' 'build_alias=x86_64-linux-gnu' 'CPPFLAGS=-Wdate-time -D_FORTIFY_SOURCE=2' 'CXXFLAGS=-g -O2 -fstack-protector-strong -Wformat -Werror=format-security'|

|CC |gcc |

|CPPFLAGS |-Wdate-time -D_FORTIFY_SOURCE=2 -D_GNU_SOURCE -I/usr/include/libxml2 |

|CFLAGS |-Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Wendif-labels -Wmissing-format-attribute -Wimplicit-fallthrough=3 -Wcast-function-type -Wshadow=compatible-local -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-format-truncation -Wno-stringop-truncation -g -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -fno-omit-frame-pointer |

|CFLAGS_SL |-fPIC |

|LDFLAGS |-Wl,-z,relro -Wl,-z,now -L/usr/lib/llvm-16/lib -Wl,--as-needed |

|LDFLAGS_EX | |

|LDFLAGS_SL | |

|LIBS |-lpgcommon -lpgport -lselinux -lzstd -llz4 -lxslt -lxml2 -lpam -lssl -lcrypto -lgssapi_krb5 -lz -lreadline -lpthread -lrt -ldl -lm |

|VERSION |PostgreSQL 17.4 (Debian 17.4-1.pgdg110+2)

select * from pg_database;

|oid |datname |datdba|encoding|datlocprovider|datistemplate|datallowconn|dathasloginevt|datconnlimit|datfrozenxid|datminmxid|dattablespace|datcollate |datctype |datlocale|daticurules|datcollversion|datacl |

|------|---------|------|--------|--------------|-------------|------------|--------------|------------|------------|----------|-------------|-----------|-----------|---------|-----------|--------------|-----------------------------------|

|5 |postgres |10 |6 |c |false |true |false |-1 |730 |1 |1,663 |en_US.UTF-8|en_US.UTF-8| | |2.40 | |

|16,388|demo |10 |6 |c |false |true |false |-1 |730 |1 |1,663 |en_US.UTF-8|en_US.UTF-8| | |2.40 | |

|1 |template1|10 |6 |c |true |true |false |-1 |730 |1 |1,663 |en_US.UTF-8|en_US.UTF-8| | |2.40 |{=c/postgres,postgres=CTc/postgres}|

|4 |template0|10 |6 |c |true |false |false |-1 |730 |1 |1,663 |en_US.UTF-8|en_US.UTF-8| | | |{=c/postgres,postgres=CTc/postgres}|

Collations are standard, out-of-the-box:

select collname, collprovider, collencoding, collcollate, collctype, colllocale,

collversion, collisdeterministic

FROM pg_collation cll

--where cll.collname in ('C.utf8', 'en_US.utf8', 'ru-RU-x-icu')

|collname |collprovider|collencoding|collcollate|collctype |colllocale|collversion|collisdeterministic|

|-----------|------------|------------|-----------|----------|----------|-----------|-------------------|

|default |d |-1 | | | | |true |

|C |c |-1 |C |C | | |true |

|POSIX |c |-1 |POSIX |POSIX | | |true |

|ucs_basic |b |6 | | |C |1 |true |

|pg_c_utf8 |b |6 | | |C.UTF-8 |1 |true |

|unicode |i |-1 | | |und |153.14 |true |

|C.utf8 |c |6 |C.utf8 |C.utf8 | | |true |

|en_US.utf8 |c |6 |en_US.utf8 |en_US.utf8| |2.40 |true |

|ru_RU.utf8 |c |6 |ru_RU.utf8 |ru_RU.utf8| |2.40 |true |

|en_US |c |6 |en_US.utf8 |en_US.utf8| |2.40 |true |

|ru_RU |c |6 |ru_RU.utf8 |ru_RU.utf8| |2.40 |true |

Attachment

Re: Slow performance of collate "en_US.utf8"

From
Achilleas Mantzios - cloud
Date:

Hi Alexey

On 2/27/25 15:54, Alexey Borschev wrote:

Hi everyone!

I see poor performance of text sorting of collate "en_US.utf8" in PG 17.4.

Test query:

explain (analyze, costs, buffers, verbose)

SELECT

('БвЁжЫйяЙSёYz&$' || gen.id) collate "en_US.utf8"

FROM generate_series(1, 10000) AS gen(id)

order by 1 desc;

I've got execution time like:

Execution Time: 73.068 ms

Same poor result with ru_RU.UTF8.

With other collations time is much better:

explain (analyze, costs, buffers, verbose)

select

('БвЁжЫйяЙSёYz&$' || gen.id) collate "C"

from generate_series(1, 10000) AS gen(id)

order by 1 desc;

Execution Time: 4.792 ms

explain (analyze, costs, buffers, verbose)

SELECT

('БвЁжЫйяЙSёYz&$' || gen.id) collate "C.utf8"

FROM generate_series(1, 10000) AS gen(id)

order by 1 desc;

Execution Time: 7.473 ms

explain (analyze, costs, buffers, verbose)

select

('БвЁжЫйяЙSёYz&$' || gen.id) collate "und-x-icu"

from generate_series(1, 10000) AS gen(id)

order by 1 desc;

Execution Time: 13.282 ms

Yes, collate C is fastest, ICU collations is ~ 2 slower then C, but "en_US.utf8" is ~ 10x slower!

I suspect it is some performance issue over there.

Can someone of PG hackers reproduce this please?

1) This PG17 instance was installed with default options, and initdb got en_US.utf8 as system default collation and created PG cluster with it.

It seems like most PG databases are created this way with en_US.utf8 by default.

It seems you initialized the cluster with libc as the locale provider. Have you tried with icu ?

2) Typical text\varchar columns are created with DB default en_US.utf8 and performs poor.

explain (analyze, costs, buffers, verbose)

select

('БвЁжЫйяЙSёYz&$' || gen.id)

from generate_series(1, 10000) AS gen(id)

order by 1 desc;

Execution Time: 73.600 ms

3) The index search operations are also slower with en_US.utf8, but the difference is not as high.

Please see attached file with test table and indexes, with tests on index performance.

System Details:

postgres@borschev-pg-copydb1:~$ uname -a Linux borschev-pg-copydb1 6.1.0-7-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.20-1 (2023-03-19) x86_64 GNU/Linux

postgres@borschev-pg-copydb1:~$ cat /etc/issue Debian GNU/Linux trixie/sid \n \l

select version();

PostgreSQL 17.4 (Debian 17.4-1.pgdg110+2) on x86_64-pc-linux-gnu, compiled by gcc (Debian 10.2.1-6) 10.2.1 20210110, 64-bit

SELECT * FROM pg_config();

|name |setting |

|-----------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|

|BINDIR |/usr/lib/postgresql/17/bin |

|DOCDIR |/usr/share/doc/postgresql-doc-17 |

|HTMLDIR |/usr/share/doc/postgresql-doc-17 |

|INCLUDEDIR |/usr/include/postgresql |

|PKGINCLUDEDIR |/usr/include/postgresql |

|INCLUDEDIR-SERVER|/usr/include/postgresql/17/server |

|LIBDIR |/usr/lib/x86_64-linux-gnu |

|PKGLIBDIR |/usr/lib/postgresql/17/lib |

|LOCALEDIR |/usr/share/locale |

|MANDIR |/usr/share/postgresql/17/man |

|SHAREDIR |/usr/share/postgresql/17 |

|SYSCONFDIR |/etc/postgresql-common |

|PGXS |/usr/lib/postgresql/17/lib/pgxs/src/makefiles/pgxs.mk |

|CONFIGURE | '--build=x86_64-linux-gnu' '--prefix=/usr' '--includedir=${prefix}/include' '--mandir=${prefix}/share/man' '--infodir=${prefix}/share/info' '--sysconfdir=/etc' '--localstatedir=/var' '--disable-option-checking' '--disable-silent-rules' '--libdir=${prefix}/lib/x86_64-linux-gnu' '--runstatedir=/run' '--disable-maintainer-mode' '--disable-dependency-tracking' '--with-tcl' '--with-perl' '--with-python' '--with-pam' '--with-openssl' '--with-libxml' '--with-libxslt' '--mandir=/usr/share/postgresql/17/man' '--docdir=/usr/share/doc/postgresql-doc-17' '--sysconfdir=/etc/postgresql-common' '--datarootdir=/usr/share/' '--datadir=/usr/share/postgresql/17' '--bindir=/usr/lib/postgresql/17/bin' '--libdir=/usr/lib/x86_64-linux-gnu/' '--libexecdir=/usr/lib/postgresql/' '--includedir=/usr/include/postgresql/' '--with-extra-version= (Debian 17.4-1.pgdg110+2)' '--enable-nls' '--enable-thread-safety' '--enable-debug' '--disable-rpath' '--with-uuid=e2fs' '--with-gnu-ld' '--with-gssapi' '--with-ldap' '--with-pgport=5432' '--with-system-tzdata=/usr/share/zoneinfo' 'AWK=mawk' 'MKDIR_P=/bin/mkdir -p' 'PROVE=/usr/bin/prove' 'PYTHON=/usr/bin/python3' 'TAR=/bin/tar' 'XSLTPROC=xsltproc --nonet' 'CFLAGS=-g -O2 -fstack-protector-strong -Wformat -Werror=format-security -fno-omit-frame-pointer' 'LDFLAGS=-Wl,-z,relro -Wl,-z,now' '--enable-tap-tests' '--with-icu' '--with-llvm' 'LLVM_CONFIG=/usr/bin/llvm-config-16' 'CLANG=/usr/bin/clang-16' '--with-lz4' '--with-zstd' '--with-systemd' '--with-selinux' '--enable-dtrace' 'build_alias=x86_64-linux-gnu' 'CPPFLAGS=-Wdate-time -D_FORTIFY_SOURCE=2' 'CXXFLAGS=-g -O2 -fstack-protector-strong -Wformat -Werror=format-security'|

|CC |gcc |

|CPPFLAGS |-Wdate-time -D_FORTIFY_SOURCE=2 -D_GNU_SOURCE -I/usr/include/libxml2 |

|CFLAGS |-Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Wendif-labels -Wmissing-format-attribute -Wimplicit-fallthrough=3 -Wcast-function-type -Wshadow=compatible-local -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-format-truncation -Wno-stringop-truncation -g -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -fno-omit-frame-pointer |

|CFLAGS_SL |-fPIC |

|LDFLAGS |-Wl,-z,relro -Wl,-z,now -L/usr/lib/llvm-16/lib -Wl,--as-needed |

|LDFLAGS_EX | |

|LDFLAGS_SL | |

|LIBS |-lpgcommon -lpgport -lselinux -lzstd -llz4 -lxslt -lxml2 -lpam -lssl -lcrypto -lgssapi_krb5 -lz -lreadline -lpthread -lrt -ldl -lm |

|VERSION |PostgreSQL 17.4 (Debian 17.4-1.pgdg110+2)

select * from pg_database;

|oid |datname |datdba|encoding|datlocprovider|datistemplate|datallowconn|dathasloginevt|datconnlimit|datfrozenxid|datminmxid|dattablespace|datcollate |datctype |datlocale|daticurules|datcollversion|datacl |

|------|---------|------|--------|--------------|-------------|------------|--------------|------------|------------|----------|-------------|-----------|-----------|---------|-----------|--------------|-----------------------------------|

|5 |postgres |10 |6 |c |false |true |false |-1 |730 |1 |1,663 |en_US.UTF-8|en_US.UTF-8| | |2.40 | |

|16,388|demo |10 |6 |c |false |true |false |-1 |730 |1 |1,663 |en_US.UTF-8|en_US.UTF-8| | |2.40 | |

|1 |template1|10 |6 |c |true |true |false |-1 |730 |1 |1,663 |en_US.UTF-8|en_US.UTF-8| | |2.40 |{=c/postgres,postgres=CTc/postgres}|

|4 |template0|10 |6 |c |true |false |false |-1 |730 |1 |1,663 |en_US.UTF-8|en_US.UTF-8| | | |{=c/postgres,postgres=CTc/postgres}|

Collations are standard, out-of-the-box:

select collname, collprovider, collencoding, collcollate, collctype, colllocale,

collversion, collisdeterministic

FROM pg_collation cll

--where cll.collname in ('C.utf8', 'en_US.utf8', 'ru-RU-x-icu')

|collname |collprovider|collencoding|collcollate|collctype |colllocale|collversion|collisdeterministic|

|-----------|------------|------------|-----------|----------|----------|-----------|-------------------|

|default |d |-1 | | | | |true |

|C |c |-1 |C |C | | |true |

|POSIX |c |-1 |POSIX |POSIX | | |true |

|ucs_basic |b |6 | | |C |1 |true |

|pg_c_utf8 |b |6 | | |C.UTF-8 |1 |true |

|unicode |i |-1 | | |und |153.14 |true |

|C.utf8 |c |6 |C.utf8 |C.utf8 | | |true |

|en_US.utf8 |c |6 |en_US.utf8 |en_US.utf8| |2.40 |true |

|ru_RU.utf8 |c |6 |ru_RU.utf8 |ru_RU.utf8| |2.40 |true |

|en_US |c |6 |en_US.utf8 |en_US.utf8| |2.40 |true |

|ru_RU |c |6 |ru_RU.utf8 |ru_RU.utf8| |2.40 |true |

Re: Slow performance of collate "en_US.utf8"

From
Laurenz Albe
Date:
On Thu, 2025-02-27 at 16:54 +0300, Alexey Borschev wrote:
> I see poor performance of text sorting of collate "en_US.utf8" in PG 17.4.

I'd say that you would have to complain to the authors of the
GNU C library, which provides this collation.

Yours,
Laurenz Albe



Re: Slow performance of collate "en_US.utf8"

From
Joe Conway
Date:
On 2/28/25 09:16, Laurenz Albe wrote:
> On Thu, 2025-02-27 at 16:54 +0300, Alexey Borschev wrote:
>> I see poor performance of text sorting of collate "en_US.utf8" in PG 17.4.
> 
> I'd say that you would have to complain to the authors of the
> GNU C library, which provides this collation.

Yep -- glibc starting with version 2.21 has a massive performance 
regression for certain cases and the glibc folks have basically said 
they will not fix it. If you try the same thing on RHEL 7.x with glibc 
2.17 it will perform about the same as ICU.

If you are using pg17 you should consider using the new builtin 
collation provider -- it will perform almost as well as the 'C' locale. 
Something like:
--------
CREATE DATABASE builtincoll LOCALE_PROVIDER builtin
BUILTIN_LOCALE 'C.UTF-8' TEMPLATE template0;
--------

-- 
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: Slow performance of collate "en_US.utf8"

From
Thomas Munro
Date:
On Sat, Mar 1, 2025 at 9:03 AM Joe Conway <mail@joeconway.com> wrote:
> On 2/28/25 09:16, Laurenz Albe wrote:
> > On Thu, 2025-02-27 at 16:54 +0300, Alexey Borschev wrote:
> >> I see poor performance of text sorting of collate "en_US.utf8" in PG 17.4.
> >
> > I'd say that you would have to complain to the authors of the
> > GNU C library, which provides this collation.
>
> Yep -- glibc starting with version 2.21 has a massive performance
> regression for certain cases and the glibc folks have basically said
> they will not fix it. If you try the same thing on RHEL 7.x with glibc
> 2.17 it will perform about the same as ICU.

I've idly wondered if this is the culprit, do you know?

https://github.com/bminor/glibc/commit/0742aef6e52a935f9ccd69594831b56d807feef3

It seems to have bet that strings either differ in primary weight as
early as they do in synthetic natural language tests, or not at all
because they are equal and that is detected with a fast-path binary
comparison.  The average first different character word-to-word in my
/usr/shared/dict/words is at position ~5.6 (some kind of worst case as
it is already sorted), cf 14+ multibyte sequences in OP's example,
which must be well outside their test parameters I would guess.  I
didn't read the code but the description has a miasma of
quadratic-catching-on-fire about it: it's now rescanning the secondary
weights with repeated traversals, because the cache they ripped out
wasn't pulling its own weight at small common prefix sizes, or
something like that?  I wonder if 2.21 also got faster for PostgreSQL
sorting /usr/share/dict/words as you might expect from that
description.  Database keys with long common prefixes *probably*
shouldn't be using natural language sorting anyway, so "don't do
that", but knowledge of collations is not well distributed... on the
other hand I suspect you can dream up some real natural language
examples that lose the bet too: sort "lastname, firstname" across a
country of 300 million, maybe?



Re: Slow performance of collate "en_US.utf8"

From
Joe Conway
Date:
On 2/28/25 17:49, Thomas Munro wrote:
> On Sat, Mar 1, 2025 at 9:03 AM Joe Conway <mail@joeconway.com> wrote:
>> On 2/28/25 09:16, Laurenz Albe wrote:
>> > On Thu, 2025-02-27 at 16:54 +0300, Alexey Borschev wrote:
>> >> I see poor performance of text sorting of collate "en_US.utf8" in PG 17.4.
>> >
>> > I'd say that you would have to complain to the authors of the
>> > GNU C library, which provides this collation.
>>
>> Yep -- glibc starting with version 2.21 has a massive performance
>> regression for certain cases and the glibc folks have basically said
>> they will not fix it. If you try the same thing on RHEL 7.x with glibc
>> 2.17 it will perform about the same as ICU.
> 
> I've idly wondered if this is the culprit, do you know?
> 
> https://github.com/bminor/glibc/commit/0742aef6e52a935f9ccd69594831b56d807feef3

Yes, that was definitely the one that caused the regression. Note that 
if you look closely you will find there is a revert of that patch on 
glibc on certain distros. But not on RHEL and RHEL-alike.

Someone else pointed out this thread to me:
https://sourceware.org/bugzilla/show_bug.cgi?id=18441

Note the last message on that thread:
8<--------------
  Carlos O'Donell 2019-05-09 20:44:56 UTC

(In reply to vectoroc from comment #13)
 > Hello. Is there any chance that the issues will be fixed? Unfortunately
 > PostgreSQL Is unable to use ICU some base features (e.g in analyze
 > operation).

We haven't had anyone working on strcoll_l performance improvements. So 
it's unlikely that this will get merged or reviewed any time soon.
8<--------------


-- 
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com