On Mon, Jul 20, 2009 at 8:12 AM, Oleg Bartunov<oleg@sai.msu.su> wrote:
>> Here's a couple of queries:
>>
>> archive=> explain analyze select * from a where comment_tsv @@
>> plainto_tsquery('love') order by timestamp desc limit 24 offset 0;
>>
>> QUERY PLAN
>> ----------
>> Limit (cost=453248.73..453248.79 rows=24 width=281) (actual
>> time=188441.047..188441.148 rows=24 loops=1)
>> -> Sort (cost=453248.73..453882.82 rows=253635 width=281) (actual
>> time=188441.043..188441.079 rows=24 loops=1)
>> Sort Key: "timestamp"
>> Sort Method: top-N heapsort Memory: 42kB
>> -> Bitmap Heap Scan on a (cost=17782.16..446166.02 rows=253635
>> width=281) (actual time=2198.930..187948.050 rows=256378 loops=1)
>> Recheck Cond: (comment_tsv @@ plainto_tsquery('love'::text))
>> -> Bitmap Index Scan on timestamp_comment_gin
>> (cost=0.00..17718.75 rows=253635 width=0) (actual time=2113.664..2113.664
>> rows=259828 loops=1)
>> Index Cond: (comment_tsv @@
>> plainto_tsquery('love'::text))
>> Total runtime: 188442.617 ms
>> (9 rows)
>>
>> archive=> explain analyze select * from a where comment_tsv @@
>> plainto_tsquery('love') limit 24 offset 0;
>>
>> QUERY PLAN
>> ----------
>> Limit (cost=0.00..66.34 rows=24 width=281) (actual time=14.632..53.647
>> rows=24 loops=1)
>> -> Seq Scan on a (cost=0.00..701071.49 rows=253635 width=281) (actual
>> time=14.629..53.588 rows=24 loops=1)
>> Filter: (comment_tsv @@ plainto_tsquery('love'::text))
>> Total runtime: 53.731 ms
>> (4 rows)
>>
>> First one runs painfully slow.
>
> Hmm, everything is already written in explain :) In the first query 253635
> rows should be readed from disk and sorted, while in the
> second query only 24 (random) rows readed from disk, so there is 4
> magnitudes
> difference and in the worst case you should expected time for the 1st query
> about 53*10^4 ms.
If love is an uncommon word, there's no help for queries of this type
being slow unless the GIN index can return the results in order. But
if love is a common word, then it would be faster to do an index scan
by timestamp on the baserel and then treat comment_tsv @@
plainto_tsquery('love') as a filter condition. Is this a selectivity
estimation bug?
...Robert