Re: How to boost performance of queries containing pattern matching characters - Mailing list pgsql-performance

From Shaun Thomas
Subject Re: How to boost performance of queries containing pattern matching characters
Msg-id 4D593458.1020004@peak6.com
In response to How to boost performance of queries containing pattern matching characters  ("Gnanakumar" <gnanam@zoniac.com>)
List pgsql-performance
On 02/14/2011 12:59 AM, Gnanakumar wrote:

> QUERY:  DELETE FROM MYTABLE WHERE EMAIL ILIKE '%domain.com%'
> EMAIL column is VARCHAR(256).

Honestly? You'd be better off normalizing this column, and maybe hiding
that fact behind a view if your app requires email as a single column.
Split it so that user@gmail.com becomes:

email_acct (user)
email_domain (gmail)
email_tld (com)

This would let you drop the first % on your LIKE match, and then
traditional indexes would work just fine. You could also differentiate
between domains with different TLDs without using wildcards, which is
always faster.
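
As a rough sketch of that layout (the table, view, and index names and the
column sizes here are made up for illustration, and it assumes the parts
are stored lowercased, since plain equality is case-sensitive where ILIKE
was not):

```sql
-- Hypothetical names and sizes; adjust to the real schema.
CREATE TABLE mytable_split (
    email_acct   varchar(64)  NOT NULL,  -- 'user'
    email_domain varchar(190) NOT NULL,  -- 'gmail'
    email_tld    varchar(63)  NOT NULL   -- 'com'
);

-- Ordinary btree index; equality lookups on domain + TLD can use it.
CREATE INDEX mytable_split_domain_idx
    ON mytable_split (email_domain, email_tld);

-- View that gives the application back its single email column.
CREATE VIEW mytable_v AS
SELECT email_acct || '@' || email_domain || '.' || email_tld AS email
  FROM mytable_split;

-- The delete no longer needs any wildcard at all.
DELETE FROM mytable_split
 WHERE email_domain = 'domain'
   AND email_tld    = 'com';
```

How you split multi-part suffixes such as .com.au between email_domain and
email_tld is a judgment call this sketch does not settle.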

I might also ask why you need a wildcard after the TLD in the first
place. Is it really so common that you are trying to match .com,
.com.au, .com.bar.baz.edu, or whatever? At the very least, splitting the
account from the domain+tld would be beneficial, as it would remove the
necessity of the first wildcard, which is really what's hurting you.
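
A minimal sketch of that smaller change, assuming each address contains a
single @ and using a made-up column name (email_host) and index name:

```sql
-- Hypothetical column holding everything after the @, lowercased.
ALTER TABLE mytable ADD COLUMN email_host text;

UPDATE mytable
   SET email_host = lower(split_part(email, '@', 2));

-- text_pattern_ops lets a btree index serve left-anchored LIKE
-- patterns even when the database locale is not C.
CREATE INDEX mytable_email_host_idx
    ON mytable (email_host text_pattern_ops);

-- Only a trailing wildcard remains, so the index is usable.
DELETE FROM mytable
 WHERE email_host LIKE 'domain.com%';
```

Note this is not an exact equivalent of the original '%domain.com%': it no
longer matches subdomains such as mail.domain.com, or addresses where the
string only appears in the account part. Whether that matters depends on
what the original query was really meant to catch.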

--
Shaun Thomas
OptionsHouse | 141 W. Jackson Blvd. | Suite 800 | Chicago IL, 60604
312-676-8870
sthomas@peak6.com

