Re: [SQL] Searching Text Fields - Case Sensitive? - Mailing list pgsql-sql

From Tom Lane
Subject Re: [SQL] Searching Text Fields - Case Sensitive?
Date
Msg-id 4269.934554435@sss.pgh.pa.us
Whole thread Raw
In response to Re: [SQL] Searching Text Fields - Case Sensitive?  ("Moray McConnachie" <moray.mcconnachie@computing-services.oxford.ac.uk>)
List pgsql-sql
"Moray McConnachie" <moray.mcconnachie@computing-services.oxford.ac.uk> writes:
>> SELECT * FROM mydb WHERE lower(description) LIKE "%foobar%";

> Brings up an interesting question. Which is faster, to use the
> lower/upper functions as above (and of course if foobar is a variable,
> it would also have to be made lower case by the program calling
> pgsql), or to use the pattern matching case-insensitive operator?

Offhand I'd guess that the case-insensitive regex would be faster than
the above, because it avoids the overhead of making an extra function
call.  (Functions that return text, or any other variable-length data
type, require a memory allocation step --- for typical string lengths
I'd imagine the malloc costs more than the actual character-slinging...)

*However*, the situation changes considerably if you have an index
on lower(description) and a pattern that allows the index to be used.
The above pattern does not, but any left-anchored pattern, say
'foobar%', would allow index restriction clauses to be generated and
used.  The effective query with a left-anchored pattern is like this:
... WHERE lower(description) LIKE "foobar%" AND      lower(description) >= "foobar" AND      lower(description) <
"foobas";

(actually there are some character-set issues that complicate matters,
but that's the basic idea).  When you have an index on
lower(description), the system can and will use the last two clauses
to restrict an indexscan so that tuples outside the range foobar to
foobas are never even fetched.  Obviously, this can make for a speedup
of orders of magnitude, swamping any nitpicky little questions of how
fast a particular expression can be evaluated.

If you use a case-insensitive regex then this doesn't work, even for a
left-anchored pattern ('^foobar'), because the system knows that 'f' can
match either 'f' or 'F' so it can't generate the index restriction
clause...
        regards, tom lane


pgsql-sql by date:

Previous
From: Patrik Kudo
Date:
Subject: Re: [SQL] Problems with default date and time
Next
From: "Hutton, Rob"
Date:
Subject: RE: [SQL] Problems with default date and time