Re: Search engine - Mailing list pgsql-www

From Oleg Bartunov
Subject Re: Search engine
Date
Msg-id Pine.LNX.4.64.0612212355400.16338@sn.sai.msu.ru
In response to Re: Search engine  (Robert Treat <xzilla@users.sourceforge.net>)
List pgsql-www
On Thu, 21 Dec 2006, Robert Treat wrote:

> On Tuesday 19 December 2006 17:13, Magnus Hagander wrote:
>> John Hansen wrote:
>>> Magnus Hagander Wrote:
>>>>> Archives search is slow. (5+ seconds to search all lists)
>>>>
>>>> Depends on what you search for ;-) What did you search for?
>>>
>>> 'create table'
>>
>> Yeah, that one is definitely the sort. Just the search over all lists
>> for create table takes about 170ms and returns about 6000 hits.
>> Calculating rank value and sorting it takes the rest :(
>>
>
> Maybe we could add a "click here to see the explain analyze of your search
> query" button.  :-)

Just to make the problem clear: an ordinary SE doesn't have this problem
because all the information needed for ranking is available from the index
itself.
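
To illustrate the point, here is a rough sketch (a toy, not how any
particular SE is implemented) of a positional inverted index where both
matching and ranking read only the index. The names build_index and
search_ranked and the occurrence-count score are assumptions made up for
this example.

```python
from collections import defaultdict


def build_index(docs):
    """Map each term to {doc_id: [positions]}: a toy positional inverted index."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for pos, term in enumerate(text.lower().split()):
            index[term].setdefault(doc_id, []).append(pos)
    return index


def search_ranked(index, terms):
    """Return ids of documents containing every term, best first.

    Only the index is read; the original documents are never consulted.
    """
    postings = [index.get(t, {}) for t in terms]
    if not postings:
        return []
    hits = set(postings[0]).intersection(*postings[1:])

    def score(doc_id):
        # Hypothetical score: total occurrences of the query terms.
        return sum(len(p[doc_id]) for p in postings)

    return sorted(hits, key=lambda d: (-score(d), d))


docs = {
    1: "create table foo",
    2: "create table bar then create table baz",
    3: "drop table foo",
}
index = build_index(docs)
result = search_ranked(index, ["create", "table"])
```

Because the positions are stored in the postings, the sort needs no
second lookup per hit.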

A database-based SE, like tsearch2, should be able to work even
without an index, so the data needed for ranking is kept in the table
itself. After the search is done, which is very fast, we need to
consult the heap to get positional information, weights, etc.
Also, since the GiST index is lossy, we must check the heap to exclude
false hits.
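
As a rough illustration of why a lossy index forces a heap visit, here is
a toy signature index. It is only a sketch under stated assumptions: the
16-bit signature, the crc32 hashing and the names signature and search
are invented for this example and are not the actual GiST or tsearch2
code.

```python
import zlib

SIG_BITS = 16  # assumed, deliberately tiny so that bit collisions are likely


def signature(words):
    """OR together one bit per word; different words may share a bit (lossy)."""
    sig = 0
    for w in words:
        sig |= 1 << (zlib.crc32(w.encode()) % SIG_BITS)
    return sig


def search(heap, sig_index, terms):
    """Filter by signature, then recheck every candidate against the heap."""
    q = signature(terms)
    # Index step: cheap, but can only say "maybe matches".
    candidates = [d for d, s in sig_index.items() if (s & q) == q]
    # Recheck step: read the stored document to exclude false hits.
    confirmed = [d for d in candidates if set(terms) <= set(heap[d].split())]
    return candidates, confirmed


heap = {
    1: "create table foo",
    2: "create table bar then create table baz",
    3: "drop table foo",
}
sig_index = {d: signature(text.split()) for d, text in heap.items()}
candidates, confirmed = search(heap, sig_index, ["create", "table"])
```

The candidate list always contains every real match and may contain
extra rows; only the recheck against the heap can tell them apart, and
the same heap read is where positions and weights for ranking come from.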

I'm wondering if we could store positional information in the GIN index,
which is not lossy!

     Regards,
         Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83
