
Re: [HACKERS] like/ilike improvements - Mailing list pgsql-patches

From	Tom Lane
Subject	Re: [HACKERS] like/ilike improvements
Date	June 1, 2007 22:54:13
Msg-id	25250.1180738444@sss.pgh.pa.us
In response to	Re: [HACKERS] like/ilike improvements (Andrew Dunstan <andrew@dunslane.net>)
Responses	Re: [HACKERS] like/ilike improvements
List	pgsql-patches


Andrew Dunstan <andrew@dunslane.net> writes:
> OK, here is a patch that I think incorporates all the ideas discussed
> (including part of Mark Mielke's suggestion about optimising %_). There
> is now no special treatment of UTF8 other than its use of a faster
> NextChar macro.

Looks mostly pretty good.  I would suggest replacing tests "tlen == 0"
and "plen == 0" with "<= 0", just so the code doesn't go completely
insane if presented with invalidly-encoded data that causes it to step
beyond the end of data.  Also, this comment is not really good enough:

> !         /*
> !          * It is safe to use NextByte instead of NextChar here, even for
> !          * multi-byte character sets, because we are not following
> !          * immediately after a wildcard character.
> !          */
> !         NextByte(t, tlen);
> !         NextByte(p, plen);
>       }

I'd suggest adding something like "If we are in the middle of a
multibyte character, we must already have matched at least one byte of
the character from both text and pattern; so we cannot get out-of-sync
on character boundaries.  And we know that no backend-legal encoding
allows ASCII characters such as '%' to appear as non-first bytes of
characters, so we won't mistakenly detect a new wildcard."
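
The argument above can be illustrated with a minimal sketch, assuming UTF-8 text and a much-simplified matcher. This is not the code from the patch: like_match, like, and the macro bodies here are hypothetical stand-ins, and escape handling is omitted. Literal bytes advance with NextByte on both sides, whole-character stepping is used only after a wildcard, and every end-of-data test is written as "> 0" or "<= 0" rather than "== 0" so that a length driven negative by bad input still ends the loop.

```c
#include <string.h>

/* Hypothetical stand-ins for the macros discussed in the thread. */
#define NextByte(p, plen) ((p)++, (plen)--)

/* Step over one UTF-8 character: a lead byte plus any 10xxxxxx bytes. */
#define NextChar(p, plen) \
    do { (p)++; (plen)--; } while ((plen) > 0 && (*(p) & 0xC0) == 0x80)

/* Simplified LIKE: '%' matches any run of characters, '_' matches one. */
static int
like_match(const char *t, int tlen, const char *p, int plen)
{
    while (tlen > 0 && plen > 0)
    {
        if (*p == '%')
        {
            /* Collapse consecutive '%' signs. */
            while (plen > 0 && *p == '%')
                NextByte(p, plen);
            if (plen <= 0)
                return 1;       /* trailing '%' matches the rest */

            /* After a wildcard we must stay on character boundaries. */
            while (tlen > 0)
            {
                if (like_match(t, tlen, p, plen))
                    return 1;
                NextChar(t, tlen);
            }
            return 0;
        }
        else if (*p == '_')
        {
            /* '_' consumes one whole character of text. */
            NextChar(t, tlen);
            NextByte(p, plen);
        }
        else
        {
            if (*p != *t)
                return 0;
            /*
             * Not immediately after a wildcard: text and pattern have
             * matched byte for byte so far, so they stay in sync, and
             * no non-first byte of a UTF-8 character can equal '%' or '_'.
             */
            NextByte(t, tlen);
            NextByte(p, plen);
        }
    }

    if (tlen > 0)
        return 0;               /* pattern ran out before the text */

    /* Text is exhausted: only '%' may remain in the pattern. */
    while (plen > 0 && *p == '%')
        NextByte(p, plen);
    return plen <= 0;
}

/* Convenience wrapper for NUL-terminated strings. */
static int
like(const char *text, const char *pattern)
{
    return like_match(text, (int) strlen(text),
                      pattern, (int) strlen(pattern));
}
```

For example, like("caf\xc3\xa9", "caf_") returns 1 because '_' consumes both bytes of the final character, while like("caf\xc3\xa9", "caf__") returns 0.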

            regards, tom lane
