Hi, I am looking to improve the initial query speed for the following query:
select email_id from email, to_tsquery('default','example') as q where
q@@fts;
This is running on 8.2.4 on Windows Server 2K3.
The initial output from explain analyse is as follows.
"Nested Loop (cost=8.45..76.70 rows=18 width=8) (actual
time=5776.347..27364.248 rows=14938 loops=1)"
" -> Function Scan on q (cost=0.00..0.01 rows=1 width=32) (actual
time=0.023..0.024 rows=1 loops=1)"
" -> Bitmap Heap Scan on email (cost=8.45..76.46 rows=18 width=322)
(actual time=5776.314..27353.344 rows=14938 loops=1)"
" Filter: (q.q @@ email.fts)"
" -> Bitmap Index Scan on email_fts_index (cost=0.00..8.44
rows=18 width=0) (actual time=5763.355..5763.355 rows=15118 loops=1)"
" Index Cond: (q.q @@ email.fts)"
"Total runtime: 27369.091 ms"
Subsequent output is considerably faster. (I am guessing that is because
email_fts_index is cached.
"Nested Loop (cost=8.45..76.70 rows=18 width=8) (actual
time=29.241..264.712 rows=14938 loops=1)"
" -> Function Scan on q (cost=0.00..0.01 rows=1 width=32) (actual
time=0.008..0.010 rows=1 loops=1)"
" -> Bitmap Heap Scan on email (cost=8.45..76.46 rows=18 width=322)
(actual time=29.224..256.135 rows=14938 loops=1)"
" Filter: (q.q @@ email.fts)"
" -> Bitmap Index Scan on email_fts_index (cost=0.00..8.44
rows=18 width=0) (actual time=28.344..28.344 rows=15118 loops=1)"
" Index Cond: (q.q @@ email.fts)"
"Total runtime: 268.663 ms"
The table contains text derived from emails and therefore its contents
and the searches can vary wildly.
Table construction as follows:
CREATE TABLE email
(
email_id bigint NOT NULL DEFAULT
nextval(('public.email_email_id_seq'::text)::regclass),
send_to text NOT NULL DEFAULT ''::text,
reply_from character varying(100) NOT NULL DEFAULT ''::character varying,
cc text NOT NULL DEFAULT ''::text,
bcc text NOT NULL DEFAULT ''::text,
subject text NOT NULL DEFAULT ''::text,
"content" text NOT NULL DEFAULT ''::text,
time_tx_rx timestamp without time zone NOT NULL DEFAULT now(),
fts tsvector,
CONSTRAINT email_pkey PRIMARY KEY (email_id),
)
WITH (OIDS=FALSE);
-- Index: email_fts_index
CREATE INDEX email_fts_index
ON email
USING gist
(fts);
CREATE INDEX email_mailbox_id_idx
ON email
USING btree
(mailbox_id);
-- Trigger: fts_trigger on email
CREATE TRIGGER fts_trigger
BEFORE INSERT OR UPDATE
ON email
FOR EACH ROW
EXECUTE PROCEDURE tsearch2('fts', 'send_to', 'reply_from', 'cc',
'content', 'subject');