Re: optimizing SELECT with LIKE - Mailing list pgsql-general

From Ericson Smith
Subject Re: optimizing SELECT with LIKE
Date
Msg-id 1023902717.2047.16.camel@localhost.localdomain
Whole thread Raw
In response to optimizing SELECT with LIKE  (Jorge Sarmiento <jsarmiento@ccom.org>)
Responses Re: optimizing SELECT with LIKE  (Scott Marlowe <scott.marlowe@ihs.com>)
List pgsql-general
Hmmm....

That's gonna be super slow.

You might want to create 2 additional tables:

keywords
--------
id
keyword

searchmap
---------
keywordid
documentid

Then index your data so that
1. There is a unique entry in keywords for your existing data
2. Your searchmap would have an entry for keywordid, and a documentid

To search for data...
1. Find the ID of the keywords in question eg:
JORGE => 55
SARMIENTO => 89

2. Get data from searchmap like so:
SELECT documentid, count(*) as hits FROM searchmap WHERE keywordid IN
(55,89) GROUP BY documentid ORDER BY hits DESC

3. Get your documentid's from the document id in #2 above.

This will *ligntning* fast.

- Ericson Smith
eric@did-it.com
http://www.did-it.com

On Wed, 2002-06-12 at 11:43, Jorge Sarmiento wrote:
> I have 3000000 rows in a database where I have to make a:
>
> SELECT name FROM table WHERE name LIKE '%firstname%' AND name LIKE
> '%secondname%';
>
> to obtain the data I need.
>
> Due to the data nature, there is no other way to look for the data, it's old
> data that was registered in paper years ago, with no searchable index... just
> names, that sometimes are wrote like: JORGE SARMIENTO, other times like:
> SARMIENTO, JORGE and other times like JORGE LUIS SARMIENTO. and due to legal
> reasons, the data must be entered the exact way it was wroten.
>
> Is there any way to optimize postgres for doing this kind of SELECT?
>
> thanx in advance!
>
> Jorge S.
>
> ---------------------------(end of broadcast)---------------------------
> TIP 6: Have you searched our list archives?
>
> http://archives.postgresql.org



pgsql-general by date:

Previous
From: "Tom Burke"
Date:
Subject: Re: Updates are slow..
Next
From: Bruce Momjian
Date:
Subject: Re: [BUGS] createdb comments