Thread: Weighted Searching

Weighted Searching

From
"Mitch Vincent"
Date:
I emailed the list a while back about doing some weighted searching, asking
if anyone had implemented any kind of weighted search in PostgreSQL.. I'm
still wondering the same thing and if anyone has, I would greatly appreciate
a private email, I'd like to discuss it in detail.. I have several ideas but
most of them are pretty dirty and slow..

What I need to do is allow the user to assign weights to fields and then
specify a minimum weight which would dictate results..

Example :

A search on two fields, degree and years_experience, location_state.

The user assigns degree a weight of 10, years_experience a weight of 10 and
location_state a weight of 10. Then specifies the minimum weight as 20,
meaning that any results returned would have to have at least two of the
fields an exact match (any two that where the sum of the weight equals 20).
This could be carried out to many, many fields and extremely high weights..

The problem I'm having is figuring out a good way to assign the weights to
individual fields and test to see if an individual field is exactly matched
in the query (without running a single query for each field searched on.

Example:

The SQL query for the search above might be :

SELECT * FROM people WHERE degree='MBA' and years_experience='5' and
location_state='Arizona'

I would want people that have an MBA and 5 years experience but they
wouldn't necessarily have to be in Arizona (because our minimum weight is
20, only two would have to match)..

Hopefully I'm not over-explaining to the point of confusion.. If anyone
would have any ideas, please drop me an email.. Thanks!!!

-Mitch






Re: Weighted Searching

From
Stephan Szabo
Date:
I'm not sure how fast it is, but something like this
would work, right?

WHERE
(CASE WHEN degree='MBA' THEN 10 ELSE 0 END +
CASE WHEN years_experience='5' THEN 10 ELSE 0 END _
CASE WHEN location_state='Arizona' THEN 10 ELSE 0 END)
>=20

Also, wouldn't you usually want to be searching where the
weight was given if you had the years_experience or greater?

Stephan Szabo
sszabo@bigpanda.com

On Tue, 12 Sep 2000, Mitch Vincent wrote:

> I emailed the list a while back about doing some weighted searching, asking
> if anyone had implemented any kind of weighted search in PostgreSQL.. I'm
> still wondering the same thing and if anyone has, I would greatly appreciate
> a private email, I'd like to discuss it in detail.. I have several ideas but
> most of them are pretty dirty and slow..
>
> What I need to do is allow the user to assign weights to fields and then
> specify a minimum weight which would dictate results..
>
> Example :
>
> A search on two fields, degree and years_experience, location_state.
>
> The user assigns degree a weight of 10, years_experience a weight of 10 and
> location_state a weight of 10. Then specifies the minimum weight as 20,
> meaning that any results returned would have to have at least two of the
> fields an exact match (any two that where the sum of the weight equals 20).
> This could be carried out to many, many fields and extremely high weights..
>
> The problem I'm having is figuring out a good way to assign the weights to
> individual fields and test to see if an individual field is exactly matched
> in the query (without running a single query for each field searched on.
>
> Example:
>
> The SQL query for the search above might be :
>
> SELECT * FROM people WHERE degree='MBA' and years_experience='5' and
> location_state='Arizona'
>
> I would want people that have an MBA and 5 years experience but they
> wouldn't necessarily have to be in Arizona (because our minimum weight is
> 20, only two would have to match)..
>
> Hopefully I'm not over-explaining to the point of confusion.. If anyone
> would have any ideas, please drop me an email.. Thanks!!!
>
> -Mitch
>
>
>
>
>


Re: Weighted Searching

From
Josh Berkus
Date:
Mr. Vincent,

> I emailed the list a while back about doing some weighted searching, asking
> if anyone had implemented any kind of weighted search in PostgreSQL.. I'm
> still wondering the same thing and if anyone has, I would greatly appreciate
> a private email, I'd like to discuss it in detail.. I have several ideas but
> most of them are pretty dirty and slow..

You really need to do this in PLSQL, Perl or C because there isn't any
good way to implement weighting in pure SQL -- weighting is a
procedureal thing.

I'm currently in the process of designling an HR app that will use quite
elaborate weighting scheme.  Candidates are compared against job
openings and the matches are weighted according to the degree of
similarity (e.g. Job A requires 5 years of experience and the candidate
has 4 or -20% for a weight of -1 but he is in the right location for a
weight of +2 etc.) with the summary of weights to be fudged by the user
according to what s/he considers most important for the job (location,
skills, etc.

I'm not done but I expect this function to be 4-5 pages of PL-SQL. 
Fuzzy logic eats processing power.  I'll be open-sourcing the program in
December, but you probably can't wait that long.

I'd be interested to hear from anyone who's written a web search
engine.  It seems to me that the algorithm for, say, google should be
fairly similar to what I'm doing for HR.
                -Josh Berkus

-- 
______AGLIO DATABASE SOLUTIONS___________________________                                       Josh Berkus  Complete
informationtechnology      josh@agliodbs.com   and data management solutions       (415) 436-9166  for law firms, small
businesses      fax  436-0137   and non-profit organizations.       pager 338-4078                               San
Francisco


Re: Weighted Searching

From
Oleg Bartunov
Date:
We did fulltext search using relevancy function implemented as SPI
postgres mechanism. We use coordinate information of words in text to
calculate weights. I'm harry right now, because is going to airport,
sorry, couldn't give any details.

    Regards,

        Oleg
On Tue, 12 Sep 2000, Mitch Vincent wrote:

> Date: Tue, 12 Sep 2000 09:22:12 -0700
> From: Mitch Vincent <mitch@venux.net>
> To: pgsql-general@postgresql.org
> Cc: pgsql-sql@postgresql.org
> Subject: [SQL] Weighted Searching
>
> I emailed the list a while back about doing some weighted searching, asking
> if anyone had implemented any kind of weighted search in PostgreSQL.. I'm
> still wondering the same thing and if anyone has, I would greatly appreciate
> a private email, I'd like to discuss it in detail.. I have several ideas but
> most of them are pretty dirty and slow..
>
> What I need to do is allow the user to assign weights to fields and then
> specify a minimum weight which would dictate results..
>
> Example :
>
> A search on two fields, degree and years_experience, location_state.
>
> The user assigns degree a weight of 10, years_experience a weight of 10 and
> location_state a weight of 10. Then specifies the minimum weight as 20,
> meaning that any results returned would have to have at least two of the
> fields an exact match (any two that where the sum of the weight equals 20).
> This could be carried out to many, many fields and extremely high weights..
>
> The problem I'm having is figuring out a good way to assign the weights to
> individual fields and test to see if an individual field is exactly matched
> in the query (without running a single query for each field searched on.
>
> Example:
>
> The SQL query for the search above might be :
>
> SELECT * FROM people WHERE degree='MBA' and years_experience='5' and
> location_state='Arizona'
>
> I would want people that have an MBA and 5 years experience but they
> wouldn't necessarily have to be in Arizona (because our minimum weight is
> 20, only two would have to match)..
>
> Hopefully I'm not over-explaining to the point of confusion.. If anyone
> would have any ideas, please drop me an email.. Thanks!!!
>
> -Mitch
>
>
>
>
>

_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83