Re: Queryplan within FTS/GIN index -search. - Mailing list pgsql-performance

From Jesper Krogh
Subject Re: Queryplan within FTS/GIN index -search.
Date
Msg-id 4AEB429D.5000809@krogh.cc
Whole thread Raw
In response to Queryplan within FTS/GIN index -search.  (Jesper Krogh <jesper@krogh.cc>)
Responses Re: Queryplan within FTS/GIN index -search.
List pgsql-performance
Hi.

I've now got a test-set that can reproduce the problem where the two
fully equivalent queries (
body_fts @@ to_tsquery("commonterm & nonexistingterm")
 and
body_fts @@ to_tsquery("coomonterm") AND body_fts @@
to_tsquery("nonexistingterm")

give a difference of x300 in execution time. (grows with
document-base-size).

this can now be reproduced using:

* http://krogh.cc/~jesper/fts-queryplan.pl  and
http://krogh.cc/~jesper/words.txt

It build up a table with 200.000 documents where "commonterm" exists in
all of them. "nonexistingterm" is in 0.

To get the query-planner get a "sane" query I need to do a:
ftstest# set enable_seqscan=off

Then:
 ftstest=# explain analyze select id from ftstest where body_fts @@
to_tsquery('nonexistingterm & commonterm');
                                                           QUERY PLAN


--------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on ftstest  (cost=5563.09..7230.93 rows=1000 width=4)
(actual time=30.861..30.861 rows=0 loops=1)
   Recheck Cond: (body_fts @@ to_tsquery('nonexistingterm &
commonterm'::text))
   ->  Bitmap Index Scan on ftstest_gin_idx  (cost=0.00..5562.84
rows=1000 width=0) (actual time=30.856..30.856 rows=0 loops=1)
         Index Cond: (body_fts @@ to_tsquery('nonexistingterm &
commonterm'::text))
 Total runtime: 30.907 ms
(5 rows)

ftstest=# explain analyze select id from ftstest where body_fts @@
to_tsquery('nonexistingterm') and body_fts @@ to_tsquery('commonterm');
                                                          QUERY PLAN


------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on ftstest  (cost=5565.59..7238.43 rows=1000 width=4)
(actual time=0.059..0.059 rows=0 loops=1)
   Recheck Cond: ((body_fts @@ to_tsquery('nonexistingterm'::text)) AND
(body_fts @@ to_tsquery('commonterm'::text)))
   ->  Bitmap Index Scan on ftstest_gin_idx  (cost=0.00..5565.34
rows=1000 width=0) (actual time=0.057..0.057 rows=0 loops=1)
         Index Cond: ((body_fts @@ to_tsquery('nonexistingterm'::text))
AND (body_fts @@ to_tsquery('commonterm'::text)))
 Total runtime: 0.111 ms
(5 rows)


Run repeatedly to get a full memory recident dataset.

In this situation the former query end up being 300x slower than the
latter allthough they are fully equivalent.



--
Jesper

pgsql-performance by date:

Previous
From: Jeremy Harris
Date:
Subject: Re: database size growing continously
Next
From: Anj Adu
Date:
Subject: Re: database size growing continously