Thread: relevance

relevance

From
"jtx"
Date:
I'm trying to build a really basic search engine for a site I'm using -
say I'm going to simplify this as much as I can.

Say I have a table with 2 columns: id, message

Person wants to search for the term 'sql'.  So, I'd do a simple search
like:

select id from tablename where message like '%sql%';

If there any way to determine exactly how many times 'sql' is matched in
that search in each particular row, and then sort by the most matches,
or am I going to have to write a script to do the sorting for me?

Thanks!



Re: relevance

From
Oleg Bartunov
Date:
Have you seen contrib/tsearch ?
http://www.sai.msu.su/~megera/postgres/gist/tsearch/

Oleg

On Wed, 16 Jul 2003, jtx wrote:

> I'm trying to build a really basic search engine for a site I'm using -
> say I'm going to simplify this as much as I can.
>
> Say I have a table with 2 columns: id, message
>
> Person wants to search for the term 'sql'.  So, I'd do a simple search
> like:
>
> select id from tablename where message like '%sql%';
>
> If there any way to determine exactly how many times 'sql' is matched in
> that search in each particular row, and then sort by the most matches,
> or am I going to have to write a script to do the sorting for me?
>
> Thanks!
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 3: if posting/reading through Usenet, please send an appropriate
>       subscribe-nomail command to majordomo@postgresql.org so that your
>       message can get through to the mailing list cleanly
>
Regards,    Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83


Re: relevance

From
Terence Kearns
Date:
>>select id from tablename where message like '%sql%';
>>
>>If there any way to determine exactly how many times 'sql' is matched in
>>that search in each particular row, and then sort by the most matches,
>>or am I going to have to write a script to do the sorting for me?
>>

You could probably write a function in postgres (say, "matchcount()") 
which returns the match count (possibly using perl and a regex).

SELECT matchcount(message,'sql') AS matchcount, id
FROM tablename
WHERE message LIKE '%sql%'
ORDER BY matchcount(message,'sql') DESC

The ORDER BY will probably fail, but you can try :)


-- 
Terence Kearns ~ ph: +61 2 6201 5516
IT Database/Applications Developer
Enterprise Information Systems
Client Services Division
University of Canberra
www.canberra.edu.au



Re: relevance

From
Rajesh Kumar Mallah
Date:


On Thursday 17 Jul 2003 8:21 am, Terence Kearns wrote:
> >>select id from tablename where message like '%sql%';
> >>
> >>If there any way to determine exactly how many times 'sql' is matched in
> >>that search in each particular row, and then sort by the most matches,
> >>or am I going to have to write a script to do the sorting for me?
>
> You could probably write a function in postgres (say, "matchcount()")
> which returns the match count (possibly using perl and a regex).


Why reinvent the wheel when tsearch already does the job perfectly 
and is PostgreSQL compaitable.

Regds
Mallah.


>
> SELECT matchcount(message,'sql') AS matchcount, id
> FROM tablename
> WHERE message LIKE '%sql%'
> ORDER BY matchcount(message,'sql') DESC
>
> The ORDER BY will probably fail, but you can try :)



Re: relevance

From
Terence Kearns
Date:
Rajesh Kumar Mallah wrote:
> 
> 
> On Thursday 17 Jul 2003 8:21 am, Terence Kearns wrote:
> 
>>>>select id from tablename where message like '%sql%';
>>>>
>>>>If there any way to determine exactly how many times 'sql' is matched in
>>>>that search in each particular row, and then sort by the most matches,
>>>>or am I going to have to write a script to do the sorting for me?
>>
>>You could probably write a function in postgres (say, "matchcount()")
>>which returns the match count (possibly using perl and a regex).
> 
> 
> 
> Why reinvent the wheel when tsearch already does the job perfectly 
> and is PostgreSQL compaitable.
> 

Tsearch2 looks like it would be very useful but it's a lot of work to 
implement IMHO. At least it will probably (assumption) be a lot more 
efficient for massive amounts of data.

Remember the original question was regarding how to list results in 
order of how many hits were returned for each matched record.


-- 
Terence Kearns ~ ph: +61 2 6201 5516
IT Database/Applications Developer
Enterprise Information Systems
Client Services Division
University of Canberra
www.canberra.edu.au