Thread: Search parameter optimization

Search parameter optimization

From
Bruce De Vries
Date:
I am using postgres to search databases with 1.75 million records. It's a 
yellow pages database.  I read through the mailing lists and learned how to 
use the lower function to speed case insensitive searches, but am still 
getting some results that don't make sence.  For example:

select * from ca where lower(company) like 'baskin robbins' and lower(city) 
~* '^anaheim';

executes 10 times faster than

select * from ca where lower(company) like 'baskin robbins' and lower(city) 
like '^anaheim';

even though

select * from ca where lower(city) like 'anaheim';

is faster than

select * from ca where lower(city) ~* '^anaheim';

I have issued a "vacuum analyze ca" command.

How to I get the fastest search results from complex expressions?

Is varchar or text better or is char better even though it takes up more 
space?  When searching a 5 digit zip code is char or int better?






-----
Bruce  De Vries, Proprietor          http://www.bpdconsulting.com
B. P. D. Consulting                  (714) 632-3841 Phone
2034 E. Lincoln Ave. PMB #344        (800) 828-9578 Toll Free
Anaheim, CA   92806                  (603) 452-8504 FAX




Re: [SQL] Search parameter optimization

From
Tom Lane
Date:
Bruce  De Vries <bpd@bpdconsulting.com> writes:
> I am using postgres to search databases with 1.75 million records. It's a 
> yellow pages database.  I read through the mailing lists and learned how to 
> use the lower function to speed case insensitive searches, but am still 
> getting some results that don't make sence.  For example:

> select * from ca where lower(company) like 'baskin robbins' and lower(city) 
> ~* '^anaheim';

> executes 10 times faster than

> select * from ca where lower(company) like 'baskin robbins' and lower(city) 
> like '^anaheim';

(I assume you didn't actually mean LIKE '^anaheim' there?  ^ is a plain
data character to LIKE, not a pattern anchor.)

You certainly couldn't get a 10x speed differential just from the
runtime of the matching operators.  That has to be due to one query
using a much more efficient query plan than the other.  Rather than
speculating about what the system is doing, how about you run an
"explain" on each of these queries and show us the results?

BTW, I trust you have indexes on lower(company) and lower(city)...
        regards, tom lane