Re: Postgresql.org search engine. - Mailing list pgsql-www

From Dave Page
Subject Re: Postgresql.org search engine.
Date
Msg-id 50076.80.177.99.193.1075486051.squirrel@ssl.vale-housing.co.uk
In response to Re: Postgresql.org search engine.  (Oleg Bartunov <oleg@sai.msu.su>)
Responses Re: Postgresql.org search engine.  (Josh Berkus <josh@agliodbs.com>)
Re: Postgresql.org search engine.  (Oleg Bartunov <oleg@sai.msu.su>)
List pgsql-www
It's rumoured that Oleg Bartunov once said:
> On Fri, 30 Jan 2004, Dave Page wrote:
>
>> BTW, searching for 'database' really makes it think! Other queries
>> that generate fewer hits (e.g. MVCC or psqlODBC) seem to be far quicker.
>
> It would think much longer if you search 'pgsql database' :(
> Just tried and got ~100 sec.
>
Meep!

>
> I suggest adding 'postgresql', 'pgsql' and 'postgres' to the stop-word
> list :(  BTW, you could look at the word statistics and use the top N
> words as stop words.

OK, I'll look at that after dinner - thanks.
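
For illustration only, here is a minimal Python sketch of the "top N words as stop words" idea. It is not aspseek's own mechanism or configuration; the function names, the tokenizer and the sample documents are all assumptions made for the example.

```python
import re
from collections import Counter

# Words suggested in the thread as unconditional stop words.
ALWAYS_STOP = ("postgresql", "pgsql", "postgres")


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words (a naive tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def top_n_stop_words(documents, n, always_stop=ALWAYS_STOP):
    """Return the fixed stop words plus the n most frequent words in the corpus."""
    counts = Counter()
    for text in documents:
        counts.update(tokenize(text))
    stop = set(always_stop)
    stop.update(word for word, _ in counts.most_common(n))
    return stop


def strip_stop_words(query, stop):
    """Drop stop words from a query before it is sent to the search engine."""
    return [word for word in tokenize(query) if word not in stop]


# Hypothetical corpus, purely for demonstration.
docs = [
    "PostgreSQL database server",
    "the database is a pgsql database",
    "mvcc in the database",
]
stop_words = top_n_stop_words(docs, n=1)
remaining = strip_stop_words("pgsql database mvcc", stop_words)
```

With this sample corpus the most frequent word is 'database', so a query for 'pgsql database mvcc' is reduced to the single selective term 'mvcc'.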

>> I have also added some weighting to the indexed sites to try to give
>> preference to those that are more 'authoritative' and of global
>> interest than others. Any comments or suggestions for changes welcome
>> as always!
>
> Hmm, I thought aspseek has a sort of page rank, so let it do its work.

It does, but I'm trying to give a little preference to results on sites
with maximum appeal (i.e. those in English), and to the most authoritative
(i.e. those that are published docs rather than list archives or user
docs).

Also, bear in mind that by default results are grouped by site on the main
search page, so generally you will see results from *all* indexed sites on
a single page (sorted with the site weighting factored in). If you then
drill down into a specific site, the results there are unaffected by the
site weighting.

Regards, Dave.
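
As a rough illustration of the behaviour described above, the following Python sketch groups hits by site, orders the groups by a weighted score, and leaves the per-site drill-down unweighted. It is not how aspseek is implemented: the function names, the weight values, and the choice to rank each site group by its best hit are all assumptions made for the example.

```python
from collections import defaultdict


def grouped_results(hits, site_weights):
    """Main search page: one entry per site, ordered by weighted best score.

    hits is a list of (site, url, score) tuples; site_weights maps a site
    to a multiplier (hypothetical values), defaulting to 1.0.
    """
    by_site = defaultdict(list)
    for site, url, score in hits:
        by_site[site].append((url, score))

    groups = []
    for site, pages in by_site.items():
        # Assumption: each site group is represented by its best raw hit.
        best_url, best_score = max(pages, key=lambda page: page[1])
        weighted = best_score * site_weights.get(site, 1.0)
        groups.append((site, best_url, weighted))

    # The site weighting only influences the order of the groups.
    groups.sort(key=lambda group: group[2], reverse=True)
    return groups


def drill_down(hits, site):
    """Per-site view: raw scores only, the site weighting plays no part."""
    pages = [(url, score) for s, url, score in hits if s == site]
    return sorted(pages, key=lambda page: page[1], reverse=True)


# Hypothetical sites, URLs, scores and weights.
hits = [
    ("docs", "d1", 0.5),
    ("lists", "l1", 0.8),
    ("lists", "l2", 0.6),
]
weights = {"docs": 2.0, "lists": 1.0}
main_page = grouped_results(hits, weights)
lists_only = drill_down(hits, "lists")
```

In this sample the 'docs' group is listed first on the main page because of its weight, even though its raw score is lower, while the drill-down into 'lists' is ordered purely by raw score.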


