Re: like/ilike improvements - Mailing list pgsql-hackers

From Andrew Dunstan
Subject Re: like/ilike improvements
Date
Msg-id 4656C0A4.1040500@dunslane.net
In response to Re: like/ilike improvements  ("Zeugswetter Andreas ADI SD" <ZeugswetterA@spardat.at>)
Responses Re: like/ilike improvements  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers

Zeugswetter Andreas ADI SD wrote:
>> You have to be on a first byte before you can meaningfully
>> apply NextChar, and you have to use NextChar or else you
>> don't count characters correctly (eg "__" must match 2 chars
>> not 2 bytes).
>
> Well, for utf8 NextChar could advance to the next char even if the
> current byte position is in the middle of a multibyte char (skip over
> all 10xxxxxx).

It doesn't matter - we are satisfied that it won't happen. However, this 
might well be a useful optimisation of NextChar() for the UTF8 case as 
something like
 do { (t)++; (tlen)--; } while ((tlen) > 0 && (*(t) & 0xC0) == 0x80)

In fact, I'm wondering if that might make the other UTF8 stuff redundant
- the whole point of what we're doing is to avoid expensive calls to
NextChar.
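
To make the idea concrete, here is a minimal standalone sketch, not the actual macro from the PostgreSQL source. The names utf8_next_char and utf8_char_count are hypothetical, and the code assumes the input is valid UTF-8. It checks the remaining length before dereferencing, so it never reads past the end of the buffer, and it also resynchronises if it starts in the middle of a multibyte character.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for a UTF-8-specific NextChar: step over one byte,
 * then keep stepping while the current byte is a continuation byte
 * (10xxxxxx). Assumes valid UTF-8 and *len > 0 on entry.
 */
static void
utf8_next_char(const unsigned char **t, size_t *len)
{
    assert(*len > 0);
    do
    {
        (*t)++;
        (*len)--;
    } while (*len > 0 && (**t & 0xC0) == 0x80);
}

/*
 * Count characters rather than bytes, which is what a pattern such as
 * "__" needs: two characters, however many bytes each one occupies.
 */
static size_t
utf8_char_count(const char *s, size_t len)
{
    const unsigned char *t = (const unsigned char *) s;
    size_t      n = 0;

    while (len > 0)
    {
        utf8_next_char(&t, &len);
        n++;
    }
    return n;
}
```

Because the loop only looks at the top two bits of each byte, it needs no per-character length lookup, which is where the saving over a generic multibyte NextChar would come from.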

cheers

andrew

