
Re: UTF8MatchText - Mailing list pgsql-patches

From	Andrew Dunstan
Subject	Re: UTF8MatchText
Date	May 17, 2007 15:36:59
Msg-id	464CA0C2.4010700@dunslane.net
In response to	Re: UTF8MatchText (Tom Lane <tgl@sss.pgh.pa.us>)
Responses	Re: UTF8MatchText
List	pgsql-patches



Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>
>> Tom Lane wrote:
>>
>>> Wait a second ... I just thought of a counterexample that destroys the
>>> entire concept.  Consider the pattern 'A__B', which clearly is supposed
>>> to match strings of four *characters*.  With the proposed patch in
>>> place, it would match strings of four *bytes*.  Which is not the correct
>>> behavior.
>>>
>
>
>> From what I can see the code is quite careful about when it calls
>> NextByte vs NextChar, and after _ it calls NextChar.
>>
>
> Except that the entire point of this patch is to dumb down NextChar to
> be the same as NextByte for UTF8 strings.
>
>
>

That's not what I see in (what I think is) the latest submission, which
includes this snippet:

+ /* Set up for utf8 characters */
+ #define CHAREQ(p1, p2)    wchareq(p1, p2)
+ #define NextChar(p, plen) \
+   do { int __l = pg_utf_mblen(p); (p) +=__l; (plen) -=__l; } while (0)
+
+ /*
+  * UTF8MatchText -- specialized version of MBMatchText for UTF8
+  */
+ static int
+ UTF8MatchText(char *t, int tlen, char *p, int plen)

Am I looking at the wrong thing? This is from around April 9th, I think.
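
To make the 'A__B' concern concrete, here is a rough standalone sketch (not taken from the patch) of how a byte-wise step and a character-wise step differ on UTF8 input. The helper sketch_utf8_mblen is a hypothetical stand-in for pg_utf_mblen that looks only at the lead byte, the NextByte definition is assumed rather than copied from the source, and the two count_* functions are invented purely for illustration.

```c
#include <string.h>

/*
 * Hypothetical stand-in for pg_utf_mblen: returns the byte length of the
 * UTF-8 sequence starting at *s, judged from the lead byte only.
 */
static int
sketch_utf8_mblen(const unsigned char *s)
{
    if ((*s & 0x80) == 0x00)
        return 1;               /* 0xxxxxxx: ASCII */
    if ((*s & 0xE0) == 0xC0)
        return 2;               /* 110xxxxx */
    if ((*s & 0xF0) == 0xE0)
        return 3;               /* 1110xxxx */
    if ((*s & 0xF8) == 0xF0)
        return 4;               /* 11110xxx */
    return 1;                   /* invalid lead byte: step over one byte */
}

/* Assumed byte-wise step: always advances exactly one byte. */
#define NextByte(p, plen) ((p)++, (plen)--)

/* Character-wise step, same shape as the macro quoted above. */
#define NextChar(p, plen) \
    do { \
        int l_ = sketch_utf8_mblen((const unsigned char *) (p)); \
        (p) += l_; \
        (plen) -= l_; \
    } while (0)

/* Illustration only: how many NextChar steps consume the whole string. */
static int
count_char_steps(const char *s)
{
    int len = (int) strlen(s);
    int steps = 0;

    while (len > 0)
    {
        NextChar(s, len);
        steps++;
    }
    return steps;
}

/* Illustration only: how many NextByte steps consume the whole string. */
static int
count_byte_steps(const char *s)
{
    int len = (int) strlen(s);
    int steps = 0;

    while (len > 0)
    {
        NextByte(s, len);
        steps++;
    }
    return steps;
}
```

For a string such as "A\xC3\xA9\xC3\xA8" "B" (an A, two two-byte characters, then a B), the character-wise count is 4 while the byte-wise count is 6. So a matcher that steps over each _ with a NextChar of this shape would treat that string as four characters, whereas one that steps with NextByte would not.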


cheers

andrew
