Andrew Dunstan wrote:
>
>
> Tom Lane wrote:
>> Andrew Dunstan <andrew@dunslane.net> writes:
>>
>>> ... It turns out (according to the analysis) that the only time we
>>> actually need to use NextChar is when we are matching an "_" in a
>>> like/ilike pattern.
>>>
>>
>> I thought we'd determined that advancing bytewise for "%" was also
>> risky, in two cases:
>>
>> 1. Multibyte character set that is not UTF8 (more specifically, does not
>> have a guarantee that first bytes and not-first bytes are distinct)

I thought we had disposed of the idea that there was a problem with charsets
that don't treat the first byte specially.

And Dennis said:

> Tom Lane skrev:
>> You could imagine trying to do
>> % a byte at a time (and indeed that's what I'd been thinking it did)
>> but that gets you out of sync which breaks the _ case.
>
> It is only when you have a pattern like '%_' when this is a problem
> and we could detect this and do byte by byte when it's not. Now we
> check (*p == '\\') || (*p == '_') in each iteration when we scan over
> characters for '%', and we could do it once and have different loops
> for the two cases.

That's pretty much what the patch does now - it never tries to match a
single byte when it sees "_", whether or not preceded by "%".
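
To illustrate the idea (a minimal sketch only, not the code from the patch): the matcher below assumes valid UTF-8 input, ignores backslash escapes and case folding, and uses hypothetical names (utf8_len, like_match, like). It advances bytewise when scanning for the literal that follows "%", but always consumes a whole character for "_", including any "_" that directly follows "%".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Byte length of the UTF-8 character starting at s (assumes valid UTF-8). */
static size_t utf8_len(const unsigned char *s)
{
    if (*s < 0x80) return 1;
    if ((*s & 0xE0) == 0xC0) return 2;
    if ((*s & 0xF0) == 0xE0) return 3;
    return 4;
}

/* Consume one whole character from the text; false if the text is empty. */
static bool eat_char(const unsigned char **t, size_t *tlen)
{
    size_t n;

    if (*tlen == 0) return false;
    n = utf8_len(*t);
    if (n > *tlen) n = *tlen;       /* guard against a truncated character */
    *t += n;
    *tlen -= n;
    return true;
}

/* Simplified LIKE: '%' = any sequence, '_' = one character. No escapes. */
static bool like_match(const unsigned char *t, size_t tlen,
                       const unsigned char *p, size_t plen)
{
    while (plen > 0) {
        if (*p == '%') {
            /* Absorb the run of '%' and '_' that follows. Each '_' eats a
             * whole character, so the text stays on a character boundary. */
            while (plen > 0 && (*p == '%' || *p == '_')) {
                if (*p == '_' && !eat_char(&t, &tlen)) return false;
                p++; plen--;
            }
            if (plen == 0) return true;     /* trailing '%' matches the rest */

            /* Next pattern byte is a literal: scan the text bytewise. In
             * UTF-8 a first byte can never equal a continuation byte, so a
             * hit is always at a character boundary. */
            while (tlen > 0) {
                if (*t == *p && like_match(t, tlen, p, plen)) return true;
                t++; tlen--;
            }
            return false;
        } else if (*p == '_') {
            if (!eat_char(&t, &tlen)) return false;
            p++; plen--;
        } else {
            if (tlen == 0 || *t != *p) return false;
            t++; tlen--;
            p++; plen--;
        }
    }
    return tlen == 0;
}

/* Convenience wrapper for NUL-terminated strings. */
static bool like(const char *text, const char *pattern)
{
    assert(text != NULL && pattern != NULL);
    return like_match((const unsigned char *) text, strlen(text),
                      (const unsigned char *) pattern, strlen(pattern));
}
```

With this sketch, like("a\xc3\xa9" "b", "a_b") matches because "_" consumes the two-byte character as a unit, while the pattern "a__b" does not match the same text. The bytewise scan after "%" relies on the UTF-8 property that first bytes and non-first bytes are distinct; it would not be safe as written for an encoding without that guarantee.
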
cheers
andrew