check_strxfrm_bug() - Mailing list pgsql-hackers

From Thomas Munro
Subject check_strxfrm_bug()
Msg-id CA+hUKGJ-ZPJwKHVLbqye92-ZXeLoCHu5wJL6L6HhNP7FkJ=meA@mail.gmail.com
List pgsql-hackers
Hi,

While studying Jeff's new crop of collation patches I noticed in
passing that check_strxfrm_bug() must surely by now be unnecessary.
The buffer overrun bugs were fixed a decade ago, and the relevant
systems are way out of support.  If you're worried that the bugs might
come back, then the test is insufficient: modern versions of both OSes
have strxfrm_l(), which we aren't checking.  In any case, we also
completely disable this stuff because of bugs and quality problems in
every other known implementation, via TRUST_STRXFRM (or rather the
lack of it).  So I think it's time to remove that function; please see
attached.
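
For anyone who hasn't looked at this kind of test, the idea can be
sketched roughly as below.  This is only an illustration under my own
assumptions, not the code in our tree: the name
strxfrm_respects_limit(), the buffer size and the canary value are all
made up for the example.  It asks strxfrm() to use at most 7 bytes of
a larger buffer and then checks that nothing beyond that limit was
touched.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper (illustration only): returns true if strxfrm()
 * stays within the buffer size it was told about.
 */
static bool
strxfrm_respects_limit(void)
{
	char		buf[32];
	const char	canary = 0x7F;

	/* Fill the whole buffer with a known byte value. */
	memset(buf, canary, sizeof(buf));

	/* Claim that only the first 7 bytes are available. */
	(void) strxfrm(buf, "ab", 7);

	/* Any change at offset 7 or later means an overrun happened. */
	for (size_t i = 7; i < sizeof(buf); i++)
	{
		if (buf[i] != canary)
			return false;
	}
	return true;
}
```

Note that a check of this shape only exercises strxfrm() in whatever
locale is current; it says nothing about strxfrm_l().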

Just by the way, if you like slow motion domino runs, check this out:

* Original pgsql-bugs investigation into strxfrm() inconsistencies
  https://www.postgresql.org/message-id/flat/111D0E27-A8F3-4A84-A4E0-B0FB703863DF@s24.com

* I happened to test that on bleeding-edge FreeBSD 11 (not yet released
at the time), because FreeBSD was then in the process of adopting
illumos's new collation code, and I reported teething problems:
  https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=208266

* FreeBSD, DragonFly and illumos's trees were then partially fixed by
the authors, but our strcolltest.c still showed some remaining
problems in some locales (and it still does on my local FreeBSD
battlestation):
  https://github.com/freebsd/freebsd-src/commit/c48dc2a193b9befceda8dfc6f894d73251cc00a4
  https://www.illumos.org/rb/r/402/

* The authors traced the remaining problem to flaws in the Unicode
project's CLDR/POSIX data, and the report was accepted:
  https://www.illumos.org/issues/7962
  https://unicode-org.atlassian.net/browse/CLDR-10394

Eventually that'll be fixed, and (I guess) trigger at least a CLDR
minor version bump affecting all downstream consumers (ICU, ...).
Then... maybe... at least FreeBSD will finally pass that test.  I do
wonder whether other consumer libraries are also confused by that
problematic source data, and if not, why not; are glibc's problems
related, or just random code or data quality problems in different
areas?  (I
also don't know why a problem in that data should affect strxfrm() and
strcoll() differently, but I don't plan to find out owing to an acute
shortage of round tuits).

But in the meantime, I still can't recommend turning on TRUST_STRXFRM
on any OS that I know of!  The strcolltest.c program certainly still
finds fault with glibc 2.36, despite the last update on that Red Hat
Bugzilla ticket suggesting that the big resync back in 2.28 was going
to fix it.
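
To make "finds fault" concrete: the property at stake is that
strcoll() on two strings and strcmp() on their strxfrm() output must
agree on the ordering.  Here is a minimal sketch of that check.  It is
not strcolltest.c itself, and the helper names sign() and
strxfrm_agrees_with_strcoll() are invented for this example.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Reduce a comparison result to -1, 0 or +1. */
static int
sign(int x)
{
	return (x > 0) - (x < 0);
}

/*
 * Hypothetical helper (illustration only): returns true if strcoll()
 * and strxfrm() + strcmp() order a and b the same way in the current
 * locale.  Returns false on disagreement or if allocation fails.
 */
static bool
strxfrm_agrees_with_strcoll(const char *a, const char *b)
{
	/* With a zero size, strxfrm() only reports the length needed. */
	size_t		alen = strxfrm(NULL, a, 0);
	size_t		blen = strxfrm(NULL, b, 0);
	char	   *xa = malloc(alen + 1);
	char	   *xb = malloc(blen + 1);
	bool		ok = false;

	if (xa != NULL && xb != NULL)
	{
		strxfrm(xa, a, alen + 1);
		strxfrm(xb, b, blen + 1);
		ok = (sign(strcoll(a, b)) == sign(strcmp(xa, xb)));
	}
	free(xa);
	free(xb);
	return ok;
}
```

To try it against a particular locale, call setlocale(LC_COLLATE, ...)
first and then feed it pairs of strings; any pair for which it returns
false is the kind of inconsistency under discussion.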

To be fair, macOS does actually pass that test for all locales, but
the strxfrm() result is too narrow to be useful, according to comments
in our tree.  I would guess that a couple of other OSes with the old
Berkeley locale code are similar.

Attachment
