Re: `pg_trgm` not recognizing Chinese characters in macOS - Mailing list pgsql-bugs

From Tom Lane
Subject Re: `pg_trgm` not recognizing Chinese characters in macOS
Date
Msg-id 18165.1536672013@sss.pgh.pa.us
In response to `pg_trgm` not recognizing Chinese characters in macOS  (Haotian Yang <yangnw@live.com>)
Responses Re: `pg_trgm` not recognizing Chinese characters in macOS
List pgsql-bugs
Haotian Yang <yangnw@live.com> writes:
> Versions: macOS 10.13.6, PostgreSQL 10.5, pg_trgm 1.3.
> LC_ALL=en_US.UTF-8

pg_trgm relies on libc's functions (specifically, iswalpha()) to determine
whether a character is a word character.  Unfortunately, the UTF-8 locale
support in macOS is pretty incomplete, and I don't find it too surprising
that it's not recognizing Chinese characters as alphabetic.  Now, you could
make a good argument that they *shouldn't* be considered alphabetic in
an en_US locale; but I'm unsure whether switching to a more appropriate
locale will help.
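
To see what libc itself reports, independent of PostgreSQL, something
like the following could be used.  This is only a rough sketch, not
pg_trgm's actual code: the helper name is_alpha_in_locale() is made up
for illustration, and it assumes that wchar_t values are Unicode code
points in the locale being tested (so that 0x4E2D, the character "中",
can be passed directly).

```c
#include <locale.h>
#include <stddef.h>
#include <wchar.h>
#include <wctype.h>

/*
 * Hypothetical helper (not part of pg_trgm): report how libc classifies
 * the wide character "ch" under the named locale.
 *
 * Returns 1 if iswalpha() considers it alphabetic, 0 if it does not,
 * and -1 if the locale could not be selected on this system.
 *
 * Assumption: wchar_t holds Unicode code points in this locale.
 */
static int
is_alpha_in_locale(const char *locale_name, wint_t ch)
{
    /* Only character classification matters here, so set LC_CTYPE alone. */
    if (setlocale(LC_CTYPE, locale_name) == NULL)
        return -1;

    return iswalpha(ch) ? 1 : 0;
}
```

Calling is_alpha_in_locale("en_US.UTF-8", 0x4E2D) and then
is_alpha_in_locale("zh_CN.UTF-8", 0x4E2D) on the affected machine would
show whether the locale choice changes libc's answer.  The result is
platform-dependent, so none is quoted here.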

Anyway, I'd first try zh_CN.UTF-8, and if that doesn't fix it, the place
to complain is https://bugreport.apple.com/ ... I'm sure they know about
it already, but the number of reports has an impact on how fast they
fix things.

            regards, tom lane

