>>> Tom Lane said:
> Daniel Kalchev <daniel@digsys.bg> writes:
> > To summarize the problem. If key contains (equivalent cyrillic
> > letters) 'ABC', 'ABCD', 'DAB' and 'ABX' and the query is:
> >
> > SELECT key FROM t WHERE key ~* '^AB';
> >
> > index scan will be used and the correct tuples ('ABC', 'ABCD' and
> > 'ABX') will be returned. If the query is
> >
> > SELECT key FROM t WHERE key ~* '^ab';
> >
> > index scan will be used and no tuples will be returned.
>
> Hm. Is it possible that isalpha() is doing the wrong thing on your
> machine? makeIndexable() currently assumes that isalpha() returns true
> for any character that is subject to case conversion, but I wonder
> whether that's a good enough test.

In fact, after giving it some thought... the expression in gram.y

    (strcmp(opname,"~*") == 0 && isalpha(n->val.val.str[pos])))

is wrong. The statement, in my view, decides that a regular expression is not
indexable if it contains special characters or non-alpha characters.
Therefore, the statement should be written as:

    (strcmp(opname,"~*") == 0 && !isalpha((unsigned char)n->val.val.str[pos])))

(two fixes: the negation and the cast to unsigned char :) This makes the index
get used for '^abc' (lowercase ASCII), but the query still does not find
anything, which means the regex itself does not work. It fails for both ASCII
and non-ASCII text/patterns. :-(
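
To illustrate only the cast part of the change, here is a minimal sketch. It is not the gram.y code: byte_is_alpha() and count_alpha_bytes() are made-up helper names, and the sketch says nothing about whether the test should be negated. It only shows the usual way to hand a possibly 8-bit byte to isalpha() without passing it a negative value.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper (not from gram.y): returns 1 if the byte at
 * pattern[pos] is alphabetic in the current locale, 0 otherwise. */
static int byte_is_alpha(const char *pattern, size_t pos)
{
    assert(pattern != NULL);
    /* Cast before the call: where plain char is signed, bytes >= 0x80
     * are negative, and passing a negative value other than EOF to
     * isalpha() is undefined behavior. */
    return isalpha((unsigned char) pattern[pos]) != 0;
}

/* Hypothetical helper: counts the alphabetic bytes in a
 * NUL-terminated pattern string. */
static size_t count_alpha_bytes(const char *pattern)
{
    size_t n = 0;

    assert(pattern != NULL);
    for (size_t i = 0; pattern[i] != '\0'; i++)
        n += (size_t) byte_is_alpha(pattern, i);
    return n;
}
```

What isalpha() answers for bytes above 0x7F still depends on the locale in effect (a program starts in the "C" locale unless it calls setlocale()), so the cast removes the undefined behavior but does not by itself make Cyrillic letters count as alphabetic.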

> The other possibility is that regexp's internal handling of
> case-insensitive matching is not right.

I believe it to be terribly wrong, and some releases ago it worked with 8-bit
characters by just compiling it with -funsigned-char. Now this breaks things...
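
As a rough sketch of why -funsigned-char can change behavior for 8-bit text: the same byte widens to a different int depending on whether plain char is signed. The function names below are made up for illustration, and nothing here is taken from the regex code itself.

```c
#include <assert.h>

/* Hypothetical helper: widens a byte the way code that uses plain
 * char does. Where char is signed, bytes >= 0x80 come out negative;
 * with -funsigned-char they come out in the range 128..255. */
static int widen_plain(char c)
{
    return (int) c;
}

/* Hypothetical helper: widens a byte through unsigned char, which
 * gives 0..255 regardless of how the compiler treats plain char. */
static int widen_unsigned(char c)
{
    return (int) (unsigned char) c;
}
```

Code that uses the widened value as a table index or compares it against constants above 127 behaves differently in the two cases, which is one way a build without -funsigned-char could misbehave on 8-bit characters.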
Daniel