Re: Full text search with ORDER BY performance issue - Mailing list pgsql-performance

From Oleg Bartunov
Subject Re: Full text search with ORDER BY performance issue
Date
Msg-id Pine.LNX.4.64.0907201604000.8065@sn.sai.msu.ru
Whole thread Raw
In response to Full text search with ORDER BY performance issue  (Krade <krade@krade.com>)
Responses Re: Full text search with ORDER BY performance issue  (Krade <krade@krade.com>)
Re: Full text search with ORDER BY performance issue  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-performance
Krade,

On Sat, 18 Jul 2009, Krade wrote:

> Here's a couple of queries:
>
> archive=> explain analyze select * from a where  comment_tsv @@
> plainto_tsquery('love') order by timestamp desc limit 24 offset 0;
>
> QUERY PLAN
> ----------
> Limit  (cost=453248.73..453248.79 rows=24 width=281) (actual
> time=188441.047..188441.148 rows=24 loops=1)
>   ->  Sort  (cost=453248.73..453882.82 rows=253635 width=281) (actual
> time=188441.043..188441.079 rows=24 loops=1)
>         Sort Key: "timestamp"
>         Sort Method:  top-N heapsort  Memory: 42kB
>         ->  Bitmap Heap Scan on a  (cost=17782.16..446166.02 rows=253635
> width=281) (actual time=2198.930..187948.050 rows=256378 loops=1)
>               Recheck Cond: (comment_tsv @@ plainto_tsquery('love'::text))
>               ->  Bitmap Index Scan on timestamp_comment_gin
> (cost=0.00..17718.75 rows=253635 width=0) (actual time=2113.664..2113.664
> rows=259828 loops=1)
>                     Index Cond: (comment_tsv @@
> plainto_tsquery('love'::text))
> Total runtime: 188442.617 ms
> (9 rows)
>
> archive=> explain analyze select * from a where  comment_tsv @@
> plainto_tsquery('love') limit 24 offset 0;
>
> QUERY PLAN
> ----------
> Limit  (cost=0.00..66.34 rows=24 width=281) (actual time=14.632..53.647
> rows=24 loops=1)
>   ->  Seq Scan on a  (cost=0.00..701071.49 rows=253635 width=281) (actual
> time=14.629..53.588 rows=24 loops=1)
>         Filter: (comment_tsv @@ plainto_tsquery('love'::text))
> Total runtime: 53.731 ms
> (4 rows)
>
> First one runs painfully slow.

Hmm, everything is already written in explain :) In the first query
253635 rows should be readed from disk and sorted, while in the
second query only 24 (random) rows readed from disk, so there is 4 magnitudes
difference and in the worst case you should expected time for the 1st query
about 53*10^4 ms.

>
> Is there really no way to have efficient full text search results ordered by
> a separate field? I'm really open to all possibilities, at this point.
>
> Thanks.
>
>

     Regards,
         Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83

pgsql-performance by date:

Previous
From: Matthew Wakeling
Date:
Subject: Re: Calling conventions
Next
From: Marcin Stępnicki
Date:
Subject: Re: Full text search with ORDER BY performance issue