On Tue, Jan 16, 2007 at 06:16:22AM +0000, James Russell wrote:
> Description: Private Use Unicode character crashes server when using ILIKE

The archives show that ILIKE is known to be broken with multibyte
characters in 8.1 and earlier, although I don't recall seeing reports
of a crash resulting. I got a crash in 8.1.6 built from the latest
source in CVS; here's a partial stack trace:

(gdb) bt
#0 MBMatchTextIC (t=0x2a98613d1c "�\200\202\206", tlen=4, p=0x0, plen=4) at like_match.c:195
#1 0x00000000005ae558 in texticlike (fcinfo=Variable "fcinfo" is not available.
) at like.c:355

I wonder if this is a problem only with code points outside of Plane 0,
viz., those with UTF-8 sequences longer than three bytes. I don't get
a crash with U+FFFD (E'\357\277\275') but I do with U+10000
(E'\360\220\200\200') and other four-byte sequences.
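For anyone who wants to check the byte counts, here is a minimal Python
sketch that decodes the two octal-escaped sequences above and reports the
code point, the UTF-8 length, and whether the code point lies outside
Plane 0. The helper name describe() is made up for illustration; nothing
here touches PostgreSQL itself.

```python
# The two sequences from the message, written as Python byte strings
# using the same octal escapes.
samples = {
    "U+FFFD": b"\357\277\275",
    "U+10000": b"\360\220\200\200",
}


def describe(raw):
    """Return (code point, UTF-8 length in bytes, outside Plane 0?).

    Hypothetical helper for illustration only; assumes `raw` holds
    exactly one valid UTF-8 encoded character.
    """
    ch = raw.decode("utf-8")
    cp = ord(ch)
    return cp, len(raw), cp > 0xFFFF


for name, raw in samples.items():
    cp, nbytes, supplementary = describe(raw)
    print(f"{name}: U+{cp:04X}, {nbytes} bytes, outside Plane 0: {supplementary}")
```

The first sequence decodes to a three-byte Plane 0 character and the second
to a four-byte character outside Plane 0, which matches the split between
the non-crashing and crashing cases.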
> - I have not yet tried to reproduce the bug on the latest Postgres 8.2.x

It appears to work in 8.2.1; at least it didn't crash. The 8.2
Release Notes contain the following item:

* Allow ILIKE to work for multi-byte encodings (Tom)
Internally, ILIKE now calls lower() and then uses LIKE. Locale-specific
regular expression patterns still do not work in these encodings.
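To illustrate the idea in that release note (fold case first, then do an
ordinary LIKE match), here is a rough Python sketch. It is not PostgreSQL's
code: like_match() and ilike_match() are hypothetical names, the LIKE
matcher is built on Python's re module, backslash escapes in the pattern
are ignored, and Python's str.lower() stands in for the server's lower().

```python
import re


def like_match(text, pattern):
    """Case-sensitive LIKE: '%' matches any run of characters, '_' exactly one.

    Simplified sketch: backslash escapes in the pattern are not handled.
    """
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")
        elif ch == "_":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    # DOTALL so that '%' and '_' also match newlines.
    return re.fullmatch("".join(parts), text, flags=re.DOTALL) is not None


def ilike_match(text, pattern):
    """ILIKE expressed as: lowercase both operands, then use LIKE."""
    return like_match(text.lower(), pattern.lower())


# A four-byte UTF-8 character (U+10000) counts as one character here,
# because Python strings are sequences of code points, not bytes.
print(ilike_match("ABC\U00010000", "abc_"))
print(like_match("ABC", "abc"))
```

Because the matching is done on whole characters after case folding, the
matcher never has to step through a multibyte sequence byte by byte.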
--
Michael Fuhr