Thread: Improve Full text rank in a query

From

"b wragg"

Date:

07 March 2008, 00:31:55

Hi all,

I'm running the following query to match a supplied text string to an actual
place name which is recorded in a table with extra info like coordinates,
etc.

SELECT ts_rank_cd(textsearchable_index_col, query, 32 /* rank/(rank+1) */) AS rank, *
FROM gazetteer, to_tsquery('Gunbower|Island|Vic') query
WHERE query @@ textsearchable_index_col
ORDER BY rank DESC, concise_ga DESC, auda_alloc DESC
LIMIT 10;

When I run this I get the following top two results:

Pos  Rank     Name                            State
1    0.23769  Gunbower Island Primary School  Vic
2    0.23769  Gunbower Island                 Vic

The textsearchable_index_col for each of these looks like this:

'vic':6 '9999':5 'gunbow':1 'island':2 'school':4 'primari':3 'victoria':7
'vic':4 '9999':3 'gunbow':1 'island':2 'victoria':5

I'm new to this, but I can't figure out why the "Gunbower Island Primary
School" is getting top place. How do I get the query to improve the ranking
so that an exact match (like "Gunbower|Island|Vic") gets a higher position?

Thanks,

bw

Re: Improve Full text rank in a query

From

Tom Lane

Date:

07 March 2008, 01:40:22

"b wragg" <bwragg@tpg.com.au> writes:
> I'm new to this, but I can't figure out why the "Gunbower Island Primary
> School" is getting top place. How do I get the query to improve the ranking
> so that an exact match (like "Gunbower|Island|Vic") gets a higher position?

I'm new at this too, but AFAICS these are both exact matches: they have
the same matching lexemes at the same positions, so the basic rank
calculation is going to come out exactly the same.  Normalization option
32 doesn't help (as the manual notes, it's purely cosmetic).  So it's
random chance which one comes out first.

What I think you might want is one of the other normalization options,
so that shorter documents are preferred.  Either 1, 2, 8, or 16 would
do fine for this simple example --- which one you want depends on just
how heavily you want to favor shorter documents.
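
As a rough sketch of the suggestion above, the query below ranks the two
tsvector values quoted in the original post using normalization option 2
(divide the rank by the document length). The inline VALUES list, the column
names "name" and "doc", and the hand-stemmed tsquery literal are assumptions
made so that the example runs without the gazetteer table; it has not been
checked against the poster's data.

```sql
-- Assumption: the two rows stand in for the gazetteer table, using the
-- tsvector values shown in the original post.
SELECT name,
       ts_rank_cd(doc, query, 2 /* rank / document length */) AS rank
FROM (VALUES
        ('Gunbower Island Primary School',
         $$'vic':6 '9999':5 'gunbow':1 'island':2 'school':4 'primari':3 'victoria':7$$::tsvector),
        ('Gunbower Island',
         $$'vic':4 '9999':3 'gunbow':1 'island':2 'victoria':5$$::tsvector)
     ) AS t(name, doc),
     -- Assumption: lexemes are written pre-stemmed so the result does not
     -- depend on the default text search configuration.
     CAST('gunbow | island | vic' AS tsquery) AS query
WHERE query @@ doc
ORDER BY rank DESC;
```

Because the second document has fewer positions than the first (5 versus 7),
dividing by length should move "Gunbower Island" ahead of "Gunbower Island
Primary School". Options 1, 8, and 16 apply gentler or differently based
penalties, as described in the manual.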

            regards, tom lane

Re: Improve Full text rank in a query

From

Oleg Bartunov

Date:

07 March 2008, 04:44:30

On Fri, 7 Mar 2008, b wragg wrote:

> I'm new to this, but I can't figure out why the "Gunbower Island Primary
> School" is getting top place. How do I get the query to improve the ranking
> so that an exact match (like "Gunbower|Island|Vic") gets a higher position?

You can read the documentation and use a document length normalization flag,
or write your own ranking function.
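
One possible reading of the two options, sketched under assumptions: the
normalization flags can be combined with | (here 2|32, length normalization
plus the cosmetic 0..1 scaling), and a simple stand-in for a custom ranking is
an extra sort key such as length(tsvector), which returns the number of
lexemes. The inline rows and the column names "name" and "doc" are
hypothetical placeholders for the gazetteer table, and this is not necessarily
the ranking function the reply has in mind.

```sql
-- Assumption: inline rows replace the gazetteer table for illustration.
SELECT name,
       ts_rank_cd(doc, query, 2|32) AS rank,  -- length-normalized, then scaled
       length(doc)                  AS lexeme_count
FROM (VALUES
        ('Gunbower Island Primary School',
         $$'vic':6 '9999':5 'gunbow':1 'island':2 'school':4 'primari':3 'victoria':7$$::tsvector),
        ('Gunbower Island',
         $$'vic':4 '9999':3 'gunbow':1 'island':2 'victoria':5$$::tsvector)
     ) AS t(name, doc),
     CAST('gunbow | island | vic' AS tsquery) AS query
WHERE query @@ doc
-- Ties on rank fall back to the shorter document first.
ORDER BY rank DESC, lexeme_count ASC;
```

The explicit tiebreaker matters mostly when the rank is left unnormalized (or
only scaled with 32), since equal ranks would otherwise be ordered by whatever
sort keys follow.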


     Regards,
         Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83