Thread: Weighted Searching
I emailed the list a while back about doing some weighted searching, asking if anyone had implemented any kind of weighted search in PostgreSQL.. I'm still wondering the same thing and if anyone has, I would greatly appreciate a private email, I'd like to discuss it in detail.. I have several ideas but most of them are pretty dirty and slow.. What I need to do is allow the user to assign weights to fields and then specify a minimum weight which would dictate results.. Example : A search on two fields, degree and years_experience, location_state. The user assigns degree a weight of 10, years_experience a weight of 10 and location_state a weight of 10. Then specifies the minimum weight as 20, meaning that any results returned would have to have at least two of the fields an exact match (any two that where the sum of the weight equals 20). This could be carried out to many, many fields and extremely high weights.. The problem I'm having is figuring out a good way to assign the weights to individual fields and test to see if an individual field is exactly matched in the query (without running a single query for each field searched on. Example: The SQL query for the search above might be : SELECT * FROM people WHERE degree='MBA' and years_experience='5' and location_state='Arizona' I would want people that have an MBA and 5 years experience but they wouldn't necessarily have to be in Arizona (because our minimum weight is 20, only two would have to match).. Hopefully I'm not over-explaining to the point of confusion.. If anyone would have any ideas, please drop me an email.. Thanks!!! -Mitch
I'm not sure how fast it is, but something like this would work, right? WHERE (CASE WHEN degree='MBA' THEN 10 ELSE 0 END + CASE WHEN years_experience='5' THEN 10 ELSE 0 END _ CASE WHEN location_state='Arizona' THEN 10 ELSE 0 END) >=20 Also, wouldn't you usually want to be searching where the weight was given if you had the years_experience or greater? Stephan Szabo sszabo@bigpanda.com On Tue, 12 Sep 2000, Mitch Vincent wrote: > I emailed the list a while back about doing some weighted searching, asking > if anyone had implemented any kind of weighted search in PostgreSQL.. I'm > still wondering the same thing and if anyone has, I would greatly appreciate > a private email, I'd like to discuss it in detail.. I have several ideas but > most of them are pretty dirty and slow.. > > What I need to do is allow the user to assign weights to fields and then > specify a minimum weight which would dictate results.. > > Example : > > A search on two fields, degree and years_experience, location_state. > > The user assigns degree a weight of 10, years_experience a weight of 10 and > location_state a weight of 10. Then specifies the minimum weight as 20, > meaning that any results returned would have to have at least two of the > fields an exact match (any two that where the sum of the weight equals 20). > This could be carried out to many, many fields and extremely high weights.. > > The problem I'm having is figuring out a good way to assign the weights to > individual fields and test to see if an individual field is exactly matched > in the query (without running a single query for each field searched on. > > Example: > > The SQL query for the search above might be : > > SELECT * FROM people WHERE degree='MBA' and years_experience='5' and > location_state='Arizona' > > I would want people that have an MBA and 5 years experience but they > wouldn't necessarily have to be in Arizona (because our minimum weight is > 20, only two would have to match).. > > Hopefully I'm not over-explaining to the point of confusion.. If anyone > would have any ideas, please drop me an email.. Thanks!!! > > -Mitch > > > > >
Mr. Vincent, > I emailed the list a while back about doing some weighted searching, asking > if anyone had implemented any kind of weighted search in PostgreSQL.. I'm > still wondering the same thing and if anyone has, I would greatly appreciate > a private email, I'd like to discuss it in detail.. I have several ideas but > most of them are pretty dirty and slow.. You really need to do this in PLSQL, Perl or C because there isn't any good way to implement weighting in pure SQL -- weighting is a procedureal thing. I'm currently in the process of designling an HR app that will use quite elaborate weighting scheme. Candidates are compared against job openings and the matches are weighted according to the degree of similarity (e.g. Job A requires 5 years of experience and the candidate has 4 or -20% for a weight of -1 but he is in the right location for a weight of +2 etc.) with the summary of weights to be fudged by the user according to what s/he considers most important for the job (location, skills, etc. I'm not done but I expect this function to be 4-5 pages of PL-SQL. Fuzzy logic eats processing power. I'll be open-sourcing the program in December, but you probably can't wait that long. I'd be interested to hear from anyone who's written a web search engine. It seems to me that the algorithm for, say, google should be fairly similar to what I'm doing for HR. -Josh Berkus -- ______AGLIO DATABASE SOLUTIONS___________________________ Josh Berkus Complete informationtechnology josh@agliodbs.com and data management solutions (415) 436-9166 for law firms, small businesses fax 436-0137 and non-profit organizations. pager 338-4078 San Francisco
We did fulltext search using relevancy function implemented as SPI postgres mechanism. We use coordinate information of words in text to calculate weights. I'm harry right now, because is going to airport, sorry, couldn't give any details. Regards, Oleg On Tue, 12 Sep 2000, Mitch Vincent wrote: > Date: Tue, 12 Sep 2000 09:22:12 -0700 > From: Mitch Vincent <mitch@venux.net> > To: pgsql-general@postgresql.org > Cc: pgsql-sql@postgresql.org > Subject: [SQL] Weighted Searching > > I emailed the list a while back about doing some weighted searching, asking > if anyone had implemented any kind of weighted search in PostgreSQL.. I'm > still wondering the same thing and if anyone has, I would greatly appreciate > a private email, I'd like to discuss it in detail.. I have several ideas but > most of them are pretty dirty and slow.. > > What I need to do is allow the user to assign weights to fields and then > specify a minimum weight which would dictate results.. > > Example : > > A search on two fields, degree and years_experience, location_state. > > The user assigns degree a weight of 10, years_experience a weight of 10 and > location_state a weight of 10. Then specifies the minimum weight as 20, > meaning that any results returned would have to have at least two of the > fields an exact match (any two that where the sum of the weight equals 20). > This could be carried out to many, many fields and extremely high weights.. > > The problem I'm having is figuring out a good way to assign the weights to > individual fields and test to see if an individual field is exactly matched > in the query (without running a single query for each field searched on. > > Example: > > The SQL query for the search above might be : > > SELECT * FROM people WHERE degree='MBA' and years_experience='5' and > location_state='Arizona' > > I would want people that have an MBA and 5 years experience but they > wouldn't necessarily have to be in Arizona (because our minimum weight is > 20, only two would have to match).. > > Hopefully I'm not over-explaining to the point of confusion.. If anyone > would have any ideas, please drop me an email.. Thanks!!! > > -Mitch > > > > > _____________________________________________________________ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83