Re: Ranking search results using multiple fields in PostgreSQL fulltext search - Mailing list pgsql-general

From Gaini Rajeshwar
Subject Re: Ranking search results using multiple fields in PostgreSQL fulltext search
Date
Msg-id 56b36eb60910122312o1684d6dcrbd1f2e966e47ba0d@mail.gmail.com
In response to Re: Ranking search results using multiple fields in PostgreSQL fulltext search  (Ivan Sergio Borgonovo <mail@webthatworks.it>)
List pgsql-general


On Mon, Oct 12, 2009 at 8:02 PM, Ivan Sergio Borgonovo <mail@webthatworks.it> wrote:
On Mon, 12 Oct 2009 19:26:55 +0530
Gaini Rajeshwar <raja.rajeshwar2006@gmail.com> wrote:

> Ivan,
> If I create a tsvector with the concatenation operator as you
> suggested, my search query will match against every field that is
> concatenated into that tsvector.
> For example, if I create the tsvector like this:
> UPDATE document_table SET search_col =
>   setweight(to_tsvector(coalesce(title,'')), 'A') ||
>   setweight(to_tsvector(coalesce(summary,'')), 'B');
>
> and run a query like this:
> SELECT title, ts_rank(search_col,
>   to_tsquery('this & is & my & text & search')) AS rank
> FROM document_table
> WHERE search_col @@ to_tsquery('this & is & my & text & search')
> ORDER BY rank DESC;
> the query searches both title and summary and returns the matching
> rows. But that is not what I want. When a user wants to search in
> title, it should search only in title, while the results should
> still be ranked on both the *title* and *summary* fields.

Search *just* in title specifying the weight in the input query and
rank on title and summary.

/*
-- somewhere else in your code...
search_col := setweight(to_tsvector(cfg, title), 'A');
search_col := search_col || setweight(to_tsvector(cfg, summary), 'B');
*/


select ts_rank(search_col, to_tsquery(inputtitle)) as rank
-- rank on both if search_col just contains title and summary
...
where search_col @@ to_tsquery(cfg, inputtitle_a)
-- return just matching title; inputtitle_a stands for inputtitle with
-- every term labelled :A and joined by &, e.g. 'foo:A & bar:A'
order by ts_rank(...) desc
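
A minimal sketch of the query above using only built-in functions. It assumes a table document_table(title, summary, search_col) in which search_col holds the title lexemes with weight A and the summary lexemes with weight B. The 'english' configuration and the terms 'text' and 'search' are illustrative assumptions, not taken from the thread. The :A labels restrict matching to title lexemes, while ts_rank receives the unlabelled query and scores the matching positions from both fields:

```sql
-- Assumed schema (hypothetical): document_table(title text, summary text,
-- search_col tsvector), where search_col was built as
--   setweight(to_tsvector('english', coalesce(title,'')),   'A') ||
--   setweight(to_tsvector('english', coalesce(summary,'')), 'B')
SELECT title,
       -- rank with the unlabelled query: title and summary positions count
       ts_rank(search_col, to_tsquery('english', 'text & search')) AS rank
FROM document_table
-- match only lexemes carrying weight A, i.e. those that came from title
WHERE search_col @@ to_tsquery('english', 'text:A & search:A')
ORDER BY rank DESC;
```

Rows whose summary also contains the terms should score higher than rows that match in the title alone, but a row is returned only if its title matches.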
Yes, that is true, but this method is difficult to use in my application. I want to rank results based on many fields, and concatenating all of them into search_col can be problematic: PostgreSQL supports at most 256 positions per lexeme and a tsvector must be smaller than 1 MB. Concatenating many fields this way can therefore be very lossy, and my ranking may not be accurate.
Instead, I am wondering whether I can specify more than one field in the ts_rank() function itself, rather than passing a search_col that was prepared by concatenating the other fields.
I hope I have been clear. Let me know if I am not making sense.
 
Is this what you need?
 

This is just one of the possible ways to rank something...

Otherwise: really understand how the rank is computed, keep the
columns/tsvectors separate, compute the rank for each column and pass
the results to some magic function that will compute a "cumulative"
ranking...
Or you could write your own ts_rank... but I tend to trust Oleg and
common practice with pg rather than inventing my own ranking
function.
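
The "keep the columns separate" option could look roughly like the sketch below. The columns title_tsv and summary_tsv, the 'english' configuration, the search terms and the 2:1 weighting are all assumptions made for illustration. The weighted sum is only one possible "magic function", not a recommended formula:

```sql
-- Assumed schema (hypothetical): document_table(title text,
-- title_tsv tsvector, summary_tsv tsvector), one tsvector per field,
-- so each field keeps its own size and position limits.
SELECT title,
       -- combine the per-column ranks; the factors 2.0 and 1.0 are arbitrary
       2.0 * ts_rank(title_tsv, q) + 1.0 * ts_rank(summary_tsv, q) AS rank
FROM document_table,
     to_tsquery('english', 'text & search') AS q
-- filter on the title column only, as the original question asks
WHERE title_tsv @@ q
ORDER BY rank DESC;
```

Because every field has its own tsvector, adding another field to the ranking means adding one more ts_rank term instead of growing a single concatenated column.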

Right now ts_rank* are black boxes for me. I envisioned I may enjoy
some finer tuning on ranking... but currently they really do a good
job.

--
Ivan Sergio Borgonovo
http://www.webthatworks.it

