Re: Full text search - query plan? PG 8.4.1 - Mailing list pgsql-performance

From Tom Lane
Subject Re: Full text search - query plan? PG 8.4.1
Date
Msg-id 8636.1255890019@sss.pgh.pa.us
Whole thread Raw
In response to Full text search - query plan? PG 8.4.1  (Jesper Krogh <jesper@krogh.cc>)
Responses Re: Full text search - query plan? PG 8.4.1  (Jesper Krogh <jesper@krogh.cc>)
List pgsql-performance
Jesper Krogh <jesper@krogh.cc> writes:
> "commonterm" matches 37K of the 50K documents (majority), but the query
> plan is "odd" in my eyes.

> * Why does it mis-guess the cost of a Seq Scan on textbody so much?

The cost looks about right to me.  The cost units are not milliseconds.

> * Why doesn't it use the index in "id" to fetch the 10 records?

You haven't *got* an index on id, according to the \d output.

The only part of your results that looks odd to me is the very high cost
estimate for the bitmapscan:

>          ->  Bitmap Heap Scan on textbody  (cost=267377.23..269147.80
> rows=36771 width=5) (actual time=15.763..30.576 rows=37133 loops=1)
>                Recheck Cond: (textbody_body_fts @@
> to_tsquery('commonterm'::text))
>                ->  Bitmap Index Scan on textbody_tfs_idx
> (cost=0.00..267368.04 rows=36771 width=0) (actual time=15.419..15.419
> rows=37134 loops=1)
>                      Index Cond: (textbody_body_fts @@
> to_tsquery('commonterm'::text))

When I try this with a 64K-row table having 'commonterm' in half of the
rows, what I get is estimates of 1530 cost units for the seqscan and
1405 for the bitmapscan (so it prefers the latter).  It will switch over
to using an index on id if I add one, but that's not the point at the
moment.  There's something strange about your tsvector index.  Maybe
it's really huge because the documents are huge?

            regards, tom lane

pgsql-performance by date:

Previous
From: Jeff Janes
Date:
Subject: Re: Calculation of unused columns
Next
From: Jesper Krogh
Date:
Subject: Re: Full text search - query plan? PG 8.4.1