Thread: Improve Full text rank in a query
Hi all,

I'm running the following query to match a supplied text string to an actual place name, which is recorded in a table with extra info like coordinates, etc.

SELECT ts_rank_cd(textsearchable_index_col, query, 32 /* rank/(rank+1) */) AS rank, *
FROM gazetteer, to_tsquery('Gunbower|Island|Vic') query
WHERE query @@ textsearchable_index_col
ORDER BY rank DESC, concise_ga DESC, auda_alloc DESC
LIMIT 10

When I run this I get the following top two results:

Pos  Rank     Name                            State
1    0.23769  Gunbower Island Primary School  Vic
2    0.23769  Gunbower Island                 Vic

The textsearchable_index_col for each of these looks like this:

'vic':6 '9999':5 'gunbow':1 'island':2 'school':4 'primari':3 'victoria':7
'vic':4 '9999':3 'gunbow':1 'island':2 'victoria':5

I'm new to this, but I can't figure out why "Gunbower Island Primary School" is getting top place. How do I get the query to improve the ranking so that an exact match (like "Gunbower|Island|Vic") gets a higher position?

Thanks,

bw
"b wragg" <bwragg@tpg.com.au> writes: > I'm new to this, but I can't figure out why the "Gunbower Island Primary > School" is getting top place. How do I get the query to improve the ranking > so that an exact match (like "Gunbower|Island|Vic") gets a higher position? I'm new at this too, but AFAICS these are both exact matches: they have the same matching lexemes at the same positions, so the basic rank calculation is going to come out exactly the same. Normalization option 32 doesn't help (as the manual notes, it's purely cosmetic). So it's random chance which one comes out first. What I think you might want is one of the other normalization options, so that shorter documents are preferred. Either 1, 2, 8, or 16 would do fine for this simple example --- which one you want depends on just how heavily you want to favor shorter documents. regards, tom lane
On Fri, 7 Mar 2008, b wragg wrote:

> I'm new to this, but I can't figure out why the "Gunbower Island Primary
> School" is getting top place. How do I get the query to improve the ranking
> so that an exact match (like "Gunbower|Island|Vic") gets a higher position?

You can read the documentation and use a document length normalization flag, or write your own ranking function.

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83