Re: LIKE search and performance - Mailing list pgsql-performance
From | PFC |
---|---|
Subject | Re: LIKE search and performance |
Date | |
Msg-id | op.tsv6fevgcigqcu@apollo13 Whole thread Raw |
In response to | Re: LIKE search and performance (Richard Huxton <dev@archonet.com>) |
Responses |
Re: LIKE search and performance
|
List | pgsql-performance |
> OK - any application that allows user-built queries: <choose column: > foo> <choose filter: contains> <choose target: "bar"> > > Want another? Any application that has a "search by name" box - users > can (and do) put one letter in and hit enter. > > Unfortunately you don't always have control over the selectivity of > queries issued. -*- HOW TO MAKE A SEARCH FORM -*- Imagine you have to code the search on IMDB. This is what a smart developer would do First, he uses AJAX autocompletion, so the thing is reactive. Then, he does not bother the user with a many-fields form. Instead of forcing the user to think (users HATE that), he writes smart code. Does Google Maps have separate fields for country, city, street, zipcode ? No. Because Google is about as smart as it gets. So, you parse the user query. If the user types, for instance, less than 3 letters (say, spi), he probably wants stuff that *begins* with those letters. There is no point in searching for the letter "a" in a million movie titles database. So, if the user types "spi", you display "name LIKE spi%", which is indexed, very fast. And since you're smart, you use AJAX. And you display only the most popular results (ie. most clicked on). http://imdb.com/find?s=all&q=spi Since 99% of the time the user wanted "spiderman" or "spielberg", you're done and he's happy. Users like being happy. If the user just types "a", you display the first 10 things that start with "a", this is useless but the user will marvel at your AJAX skillz. Then he will probably type in a few other letters. Then, if the user uses his space bar and types "spi 1980" you'll recognize a year and display spielberg's movies in 1980. Converting your strings to phonetics is also a good idea since about 0.7% of the l33T teenagers can spell stuff especially spiElberg. Only the guy who wants to know who had sex with marilyn monroe on the 17th day of the shooting of Basic Instinct will need to use the Advanced search. If you detect several words, then switch to a prefix-based fulltext search like Xapian which utterly rocks. Example : the user types "savin priv", you search for "savin*" NEAR "priv*" and you display "saving private ryan" before he has even finished typing the second word of his query. Users love that, they feel understood, they will click on your ads and buy your products. In all cases, search results should be limited to less than 100 to be easy on the database. The user doesn't care about a search returning more than 10-20 results, he will just rephrase the query, and the time taken to fetch those thousands of records with name LIKE '%a%' will have been utterly lost. Who goes to page 2 in google results ? BOTTOM LINE : databases don't think, you do.
pgsql-performance by date: