Thread: Case Conversion Functions
Hi,

There are lots of places in the code that use either pg_tolower() or
plain tolower() without being aware of MB characters, or that carry
their own implementation of pg_tolower(). (Actually, AFAIK, the whole
MB case conversion is broken in -rHEAD.)

For instance, consider the backend/utils/adt/{like.c, like_match.c}
files. Some lines of iwchareq() duplicate pg_tolower().

Another example, backend/parser/scansup.c (lines 152-153):

    else if (ch >= 0x80 && isupper(ch))
        ch = tolower(ch);

Is this intended behaviour, or are these places waiting for somebody to
clean them up?

Regards.
Volkan YAZICI <yazicivo@ttnet.net.tr> writes:
> There're lots of places in the code which uses either pg_tolower()
> or just tolower() - without aware of MB characters; or some
> on-their-own implementations of pg_tolower(). (Actually, AFAIK,
> whole MB case conversion is broken in -rHEAD.)

The upper/lower functions themselves work AFAIK, but I agree that stuff
like ILIKE probably is broken for MB encodings. Regex character classes
need help too.

> Another example:
> backend/parser/scansup.c (lines 152-153)
>     else if (ch >= 0x80 && isupper(ch))
>         ch = tolower(ch);

Fooling with that is fairly risky --- we've been burnt before by
locale-dependent case folding of SQL identifiers. In particular it'd be
really bad if the folding could change on-the-fly at runtime.

regards, tom lane
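
To illustrate the concern about locale-dependent folding, here is a minimal sketch of identifier folding that cannot change with the locale. It is not the scansup.c code and not a proposed patch: ascii_fold_byte() and ascii_fold_ident() are hypothetical names, and the policy of leaving every byte >= 0x80 untouched is an assumption that differs from the quoted snippet, which does call tolower() on high-bit bytes.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not a PostgreSQL function): fold only ASCII
 * 'A'..'Z' to lower case.  Every other byte, including bytes >= 0x80
 * that may belong to a multibyte sequence, is returned unchanged.
 * No <ctype.h> function is called, so the result does not depend on
 * the current locale.
 */
unsigned char
ascii_fold_byte(unsigned char ch)
{
    if (ch >= 'A' && ch <= 'Z')
        ch += 'a' - 'A';
    return ch;
}

/* Fold a NUL-terminated identifier in place and return it. */
char *
ascii_fold_ident(char *ident)
{
    size_t      i;

    for (i = 0; ident[i] != '\0'; i++)
        ident[i] = (char) ascii_fold_byte((unsigned char) ident[i]);
    return ident;
}
```

With this policy the same identifier always folds to the same bytes, whatever setlocale() has been set to, at the cost of never lower-casing non-ASCII letters. Whether that trade-off is acceptable is exactly the open question in this thread.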