Re: LIKE search and performance - Mailing list pgsql-performance

From PFC
Subject Re: LIKE search and performance
Date
Msg-id op.tsv6fevgcigqcu@apollo13
Whole thread Raw
In response to Re: LIKE search and performance  (Richard Huxton <dev@archonet.com>)
Responses Re: LIKE search and performance
List pgsql-performance

> OK - any application that allows user-built queries: <choose column:
> foo> <choose filter: contains> <choose target: "bar">
>
> Want another? Any application that has a "search by name" box - users
> can (and do) put one letter in and hit enter.
>
> Unfortunately you don't always have control over the selectivity of
> queries issued.

    -*- HOW TO MAKE A SEARCH FORM -*-

    Imagine you have to code the search on IMDB.

    This is what a smart developer would do

    First, he uses AJAX autocompletion, so the thing is reactive.
    Then, he does not bother the user with a many-fields form. Instead of
forcing the user to think (users HATE that), he writes smart code.
    Does Google Maps have separate fields for country, city, street, zipcode
? No. Because Google is about as smart as it gets.

    So, you parse the user query.

    If the user types, for instance, less than 3 letters (say, spi), he
probably wants stuff that *begins* with those letters. There is no point
in searching for the letter "a" in a million movie titles database.
    So, if the user types "spi", you display "name LIKE spi%", which is
indexed, very fast. And since you're smart, you use AJAX. And you display
only the most popular results (ie. most clicked on).

http://imdb.com/find?s=all&q=spi

    Since 99% of the time the user wanted "spiderman" or "spielberg", you're
done and he's happy. Users like being happy.
    If the user just types "a", you display the first 10 things that start
with "a", this is useless but the user will marvel at your AJAX skillz.
Then he will probably type in a few other letters.

    Then, if the user uses his space bar and types "spi 1980" you'll
recognize a year and display spielberg's movies in 1980.
    Converting your strings to phonetics is also a good idea since about 0.7%
of the l33T teenagers can spell stuff especially spiElberg.

    Only the guy who wants to know who had sex with marilyn monroe on the
17th day of the shooting of Basic Instinct will need to use the Advanced
search.

    If you detect several words, then switch to a prefix-based fulltext
search like Xapian which utterly rocks.
    Example : the user types "savin priv", you search for "savin*" NEAR
"priv*" and you display "saving private ryan" before he has even finished
typing the second word of his query. Users love that, they feel
understood, they will click on your ads and buy your products.

    In all cases, search results should be limited to less than 100 to be
easy on the database. The user doesn't care about a search returning more
than 10-20 results, he will just rephrase the query, and the time taken to
fetch those thousands of records with name LIKE '%a%' will have been
utterly lost. Who goes to page 2 in google results ?

    BOTTOM LINE : databases don't think, you do.

pgsql-performance by date:

Previous
From: "Peter T. Breuer"
Date:
Subject: Re: general PG network slowness (possible cure) (repost)
Next
From: mark@mark.mielke.cc
Date:
Subject: Re: LIKE search and performance