Re: Writting a "search engine" for a pgsql DB - Mailing list pgsql-performance

From Mark Stosberg
Subject Re: Writting a "search engine" for a pgsql DB
Date
Msg-id 45E323AB.1040009@summersault.com
In response to Re: Writting a "search engine" for a pgsql DB  (Madison Kelly <linux@alteeve.com>)
Responses Re: Writting a "search engine" for a pgsql DB  (Madison Kelly <linux@alteeve.com>)
List pgsql-performance
Madison Kelly wrote:
>
>   I think the more direct question I was trying to get at is "How do you
> build a 'relavence' search engine? One where results are returned/sorted
> by relevance of some sort?". At this point, the best I can think of,
> would be to perform multiple queries; first matching the whole search
> term, then the search term starting a row, then ending a row, then
> anywhere in a row and "scoring" the results based on which query they
> came out on. This seems terribly cumbersome (and probably slow, indexes
> be damned) though. I'm hoping there is a better way! :)

Madison,

I think your basic thinking is correct. However, the first "select" can
be done "offline" -- sometime beforehand.

For example, you might create a table called "keywords" that includes
the list of words mined in the other tables, along with references to
where the words are found, and how many times they are mentioned.
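
As a rough sketch of that offline step, the script below rebuilds a
"keywords" table from a source table. The "articles" table, its "id" and
"body" columns, the "hits" column and the build_keywords() helper are all
hypothetical names chosen for illustration, and Python's built-in sqlite3
module stands in for PostgreSQL so the example runs on its own.

```python
import re
import sqlite3
from collections import Counter


def build_keywords(conn):
    """Rebuild the keywords table from the (hypothetical) articles table.

    Meant to run offline, e.g. from a nightly job, not at search time.
    """
    conn.execute("DROP TABLE IF EXISTS keywords")
    conn.execute(
        "CREATE TABLE keywords ("
        " word TEXT NOT NULL,"           # the mined word, lower-cased
        " article_id INTEGER NOT NULL,"  # where the word was found
        " hits INTEGER NOT NULL,"        # how many times it is mentioned there
        " PRIMARY KEY (word, article_id))"
    )
    rows = conn.execute("SELECT id, body FROM articles").fetchall()
    for article_id, body in rows:
        # Very naive tokenizer: runs of ASCII letters and digits only.
        counts = Counter(re.findall(r"[a-z0-9]+", body.lower()))
        conn.executemany(
            "INSERT INTO keywords (word, article_id, hits) VALUES (?, ?, ?)",
            [(word, article_id, n) for word, n in counts.items()],
        )
    conn.commit()


# Tiny in-memory data set, invented for the example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE articles (id INTEGER PRIMARY KEY, body TEXT)")
conn.executemany(
    "INSERT INTO articles (id, body) VALUES (?, ?)",
    [
        (1, "Postgres index tuning and index bloat"),
        (2, "Tuning a search engine"),
    ],
)
build_keywords(conn)
```

A real version would probably also skip very common words and refresh
only the rows that changed, rather than rebuilding the whole table.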

Then, when someone actually searches, the search runs primarily against the
"keywords" table, which now gives you a way to sort by "rank", since the
table records how many times each keyword matches. The final result can be
constructed by using the details in the keywords table to pull up the
actual records needed.
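
The search side could then look something like the sketch below. Again,
the table and column names ("articles", "keywords", "article_id", "hits")
and the search() helper are assumptions made up for the example, "rank" is
taken to mean simply the sum of the hit counts for the matched words, and
sqlite3 stands in for PostgreSQL.

```python
import sqlite3


def search(conn, terms, limit=10):
    """Return (article_id, score, body) rows, best match first.

    score is the total number of times the search words occur in the
    article, read from the precomputed keywords table.
    """
    words = [t.lower() for t in terms]
    if not words:
        return []
    placeholders = ", ".join("?" for _ in words)
    sql = (
        "SELECT a.id, SUM(k.hits) AS score, a.body "
        "FROM keywords k JOIN articles a ON a.id = k.article_id "
        f"WHERE k.word IN ({placeholders}) "
        "GROUP BY a.id, a.body "
        "ORDER BY score DESC, a.id "
        "LIMIT ?"
    )
    return conn.execute(sql, (*words, limit)).fetchall()


# Hand-filled tables, invented for the example.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE articles (id INTEGER PRIMARY KEY, body TEXT);
    CREATE TABLE keywords (word TEXT, article_id INTEGER, hits INTEGER);
    INSERT INTO articles VALUES
      (1, 'Postgres index tuning and index bloat'),
      (2, 'Tuning a search engine');
    INSERT INTO keywords VALUES
      ('index', 1, 2), ('tuning', 1, 1),
      ('tuning', 2, 1), ('search', 2, 1);
    """
)
results = search(conn, ["tuning", "index"])
```

Here article 1 scores 3 (two hits for "index" plus one for "tuning") and
article 2 scores 1, so article 1 comes back first. Only the rows that
survive the LIMIT need their full records pulled from the source table.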

My expectation, however, is that there are enough details to get right in
such a system that I would first try a package like tsearch2 to solve
the problem, before writing another system like this from scratch.

  Mark

