On Tue, Mar 24, 2026 at 4:07 PM Jeff Davis <pgsql@j-davis.com> wrote:
On Sat, 2026-03-21 at 20:14 -0700, Mark Dilger wrote:
> After v2-0001, ILIKE uses str_casefold() for matching, but pg_trgm still
> uses str_tolower() for trigram extraction (trgm_op.c:352 and :948).
> With builtin collations, these produce different results.
Interesting, thank you. As stated in the original message, I was unsure about changing pg_trgm without adjusting the regex logic. Also, do you have a suggestion about an easy way to do that, or should we revisit in the next cycle?
pg_trgm appears to be lossy, with recheck logic. I would think you just need to make the index return a superset of what a regex would match, and then let recheck prune that down. My concern is pg_trgm returning fewer than all the matches, so that after recheck you get fewer results than a seqscan would have returned. Would switching to casefold be strictly broader than regex? If so, you would just need to convert pg_trgm to use casefold and then rely on the recheck machinery.
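
To make the concern concrete, here is a rough Python sketch of the failure mode. It is only an analogy: Python's str.lower() and str.casefold() stand in for str_tolower() and str_casefold(), and I have not checked that the builtin collations fold these particular characters the same way. The trigram extraction is a simplified, hypothetical stand-in for what pg_trgm does (no word padding, plain substring search instead of LIKE or regex handling).

```python
def trigrams(text, normalize):
    """Simplified trigram set of the normalized text (no padding)."""
    s = normalize(text)
    return {s[i:i + 3] for i in range(len(s) - 2)}


def index_candidates(rows, needle, normalize):
    """Lossy index lookup: keep rows containing every trigram of the needle."""
    wanted = trigrams(needle, normalize)
    return [r for r in rows if wanted <= trigrams(r, normalize)]


def recheck(rows, needle):
    """Exact match, assumed here to use casefold as ILIKE would after v2-0001."""
    folded = needle.casefold()
    return [r for r in rows if folded in r.casefold()]


rows = ["Straße", "STRASSE", "strasse", "street"]
needle = "strasse"

# What a sequential scan would return: the exact matcher applied to every row.
seqscan = recheck(rows, needle)

# Index built with lower(): "Straße" never becomes a candidate, so recheck
# cannot bring it back.
via_lower = recheck(index_candidates(rows, needle, str.lower), needle)

# Index built with casefold(): the candidates cover everything the matcher accepts.
via_casefold = recheck(index_candidates(rows, needle, str.casefold), needle)

print(seqscan)
print(via_lower)
print(via_casefold)
```

In this toy case the lower()-based index silently drops a row that the seqscan returns, while the casefold()-based index agrees with the seqscan. Whether the same holds for the regex path is exactly the open question above.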
Sorry if this misses something discussed upthread. I'm clearly assuming here that you don't mind that such a change necessitates a REINDEX.