Re: like/ilike improvements - Mailing list pgsql-hackers

From Zeugswetter Andreas ADI SD
Subject Re: like/ilike improvements
Date
Msg-id E1539E0ED7043848906A8FF995BDA579021B259E@m0143.s-mxs.net
In response to Re: like/ilike improvements  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: like/ilike improvements
Re: like/ilike improvements
List pgsql-hackers
> > However, I have just about convinced myself that we don't need
> > IsFirstByte for matching "_" for UTF8, either preceded by "%" or not,
> > as it should always be true. Can anyone come up with a counter example?
>
> You have to be on a first byte before you can meaningfully
> apply NextChar, and you have to use NextChar or else you
> don't count characters correctly (eg "__" must match 2 chars
> not 2 bytes).

Well, for UTF-8, NextChar could advance to the next char even if the
current byte position is in the middle of a multibyte char: step forward
one byte, then skip over all continuation bytes (10xxxxxx).

(Assuming UTF-16 surrogate pairs are not encoded as 2 x 3 bytes, which is
not valid UTF-8 anyway.)
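
A minimal sketch of that idea in C, not the actual PostgreSQL macro: the
names utf8_next_char, utf8_count_chars and is_continuation are hypothetical,
and the input is assumed to be valid UTF-8 bounded by an end pointer.

```c
#include <stddef.h>

/* True if b is a UTF-8 continuation byte (bit pattern 10xxxxxx). */
static int is_continuation(unsigned char b)
{
    return (b & 0xC0) == 0x80;
}

/*
 * Hypothetical stand-in for NextChar: step one byte forward, then skip
 * any continuation bytes. This lands on the start of the next character
 * whether p pointed at a first byte or into the middle of a multibyte
 * character.
 */
static const char *utf8_next_char(const char *p, const char *end)
{
    if (p < end)
        p++;
    while (p < end && is_continuation((unsigned char) *p))
        p++;
    return p;
}

/* Count characters (not bytes) in [p, end) by repeated advancing. */
static size_t utf8_count_chars(const char *p, const char *end)
{
    size_t n = 0;

    while (p < end)
    {
        p = utf8_next_char(p, end);
        n++;
    }
    return n;
}
```

With this shape, "__" still consumes two characters rather than two bytes.
If the start position is mid-character, the tail of that character is
consumed as one step, so whether that counts as a match for "_" is a
separate question from whether the pointer ends up aligned.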

Andreas

