Re: Full text search ordering question - Mailing list pgsql-sql

From John Lister
Subject Re: Full text search ordering question
Date
Msg-id 57A8B1F6E8C948359ADF3D141B78C8B5@squarepi.com
Whole thread Raw
In response to Full text search ordering question  ("John Lister" <john.lister-ps@kickstone.com>)
List pgsql-sql
Thanks for the reply and apologies, it was my first post.

I'm running on PG 8.3.3 (ubuntu), i haven't tried gin as that doesn't 
support multi-column. I haven't used 8.4 as it is still in development?

a sample query is as follows

select node_id from breadcrumbs where textsearchable @@ to_tsquery('book') 
order by views desc limit 100;

explain analyze results in this:

Limit  (cost=10300.58..10300.83 rows=100 width=381) (actual 
time=69887.851..69887.880 rows=100 loops=1) ->  Sort  (cost=10300.58..10307.61 rows=2812 width=381) (actual 
time=69887.849..69887.862 rows=100 loops=1)       Sort Key: views       Sort Method:  top-N heapsort  Memory: 84kB
->  Bitmap Heap Scan on breadcrumbs  (cost=171.49..10193.10 
 
rows=2812 width=381) (actual time=60311.197..69574.742 rows=569519 loops=1)             Filter: (textsearchable@@
to_tsquery('book'::text))"            ->  Bitmap Index Scan on idx_breadcr  (cost=0.00..170.79 
 
rows=2812 width=0) (actual time=60261.959..60261.959 rows=569519 loops=1)                   Index Cond: (textsearchable
@@to_tsquery('book'::text))
 
Total runtime: 69896.896 ms

As you can see it sorts the full result set from the search. Ideally i'd 
like to use an index on the views.

How stable is 8.4? Is it worth trying that or is the multi-column gin likely 
to be back-ported?

Thanks


> John,
>
> it's a good tradition to include query and their EXPLAIN ANALYZE. Pg 
> version is also useful.
> Did you try GIN index ?
> In 8.4 you can use gin index on (views,tsvector)
>
> Oleg
>
> On Tue, 25 Nov 2008, John Lister wrote:
>
>> Hi, is it possible to order the results of a full text search using 
>> another field?
>>
>> for example with the following table:
>>
>> CREATE TABLE breadcrumbs (
>> node_id integer NOT NULL,
>> breadcrumb character varying,
>> textsearchable tsvector,
>> views integer,
>> CONSTRAINT pk_breadcrumbs PRIMARY KEY (node_id)
>> )
>>
>> I'd like to do something like this
>>
>> select node_id, views from breadcrumbs  where textsearchable @@ 
>> to_tsquery('word') order by views desc limit 100;
>>
>> As such I'd like to create a fts index on the textsearchable field and 
>> views field such that it orders the results by the views column.
>>
>> atm, this table has over 3M rows (and is likely to b magnitudes bigger) 
>> and some words match hundreds of thousands of rows, The best i've got so 
>> far is to create a fts index which is used and then the resulting rows 
>> are sorted in memory. Unfortunately because of the number of rows 
>> returned this takes a few seconds.
>>
>> With a btree index i could index on the 2 columns and it would only hit 
>> the index and take a fraction of a second.
>>
>> I've tried the btree_gist module, but it doesn't make any difference 
>> (except in letting me use an int in the gist index)
>>
>> Any ideas or is this simply not possible?
>>
>> Thanks
>>
>>
>> --
>>
>> Got needs? Get Goblin'! - http://www.pricegoblin.co.uk/
>>
>
>  Regards,
>  Oleg
> _____________________________________________________________
> Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
> Sternberg Astronomical Institute, Moscow University, Russia
> Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
> phone: +007(495)939-16-83, +007(495)939-23-83
> 



pgsql-sql by date:

Previous
From: Oleg Bartunov
Date:
Subject: Re: Full text search ordering question
Next
From: Tk421
Date:
Subject: EXECUTE query INTO problem