Tsearch2 Initial Search Speed - Mailing list pgsql-performance

From Howard Cole
Subject Tsearch2 Initial Search Speed
Date
Msg-id 4856A91F.1020001@selestial.com
Whole thread Raw
Responses Re: Tsearch2 Initial Search Speed
List pgsql-performance
Hi, I am looking to improve the initial query speed for the following query:

select email_id from email, to_tsquery('default','example') as q where
q@@fts;

This is running on 8.2.4 on Windows Server 2K3.

The initial output from explain analyse is as follows.

"Nested Loop  (cost=8.45..76.70 rows=18 width=8) (actual
time=5776.347..27364.248 rows=14938 loops=1)"
"  ->  Function Scan on q  (cost=0.00..0.01 rows=1 width=32) (actual
time=0.023..0.024 rows=1 loops=1)"
"  ->  Bitmap Heap Scan on email  (cost=8.45..76.46 rows=18 width=322)
(actual time=5776.314..27353.344 rows=14938 loops=1)"
"        Filter: (q.q @@ email.fts)"
"        ->  Bitmap Index Scan on email_fts_index  (cost=0.00..8.44
rows=18 width=0) (actual time=5763.355..5763.355 rows=15118 loops=1)"
"              Index Cond: (q.q @@ email.fts)"
"Total runtime: 27369.091 ms"

Subsequent output is considerably faster. (I am guessing that is because
email_fts_index is cached.

"Nested Loop  (cost=8.45..76.70 rows=18 width=8) (actual
time=29.241..264.712 rows=14938 loops=1)"
"  ->  Function Scan on q  (cost=0.00..0.01 rows=1 width=32) (actual
time=0.008..0.010 rows=1 loops=1)"
"  ->  Bitmap Heap Scan on email  (cost=8.45..76.46 rows=18 width=322)
(actual time=29.224..256.135 rows=14938 loops=1)"
"        Filter: (q.q @@ email.fts)"
"        ->  Bitmap Index Scan on email_fts_index  (cost=0.00..8.44
rows=18 width=0) (actual time=28.344..28.344 rows=15118 loops=1)"
"              Index Cond: (q.q @@ email.fts)"
"Total runtime: 268.663 ms"

The table contains text derived from emails and therefore its contents
and the searches can vary wildly.

Table construction as follows:

CREATE TABLE email
(
  email_id bigint NOT NULL DEFAULT
nextval(('public.email_email_id_seq'::text)::regclass),
  send_to text NOT NULL DEFAULT ''::text,
  reply_from character varying(100) NOT NULL DEFAULT ''::character varying,
  cc text NOT NULL DEFAULT ''::text,
  bcc text NOT NULL DEFAULT ''::text,
  subject text NOT NULL DEFAULT ''::text,
  "content" text NOT NULL DEFAULT ''::text,
  time_tx_rx timestamp without time zone NOT NULL DEFAULT now(),
  fts tsvector,
  CONSTRAINT email_pkey PRIMARY KEY (email_id),
)
WITH (OIDS=FALSE);

-- Index: email_fts_index

CREATE INDEX email_fts_index
  ON email
  USING gist
  (fts);

CREATE INDEX email_mailbox_id_idx
  ON email
  USING btree
  (mailbox_id);


-- Trigger: fts_trigger on email
CREATE TRIGGER fts_trigger
  BEFORE INSERT OR UPDATE
  ON email
  FOR EACH ROW
  EXECUTE PROCEDURE tsearch2('fts', 'send_to', 'reply_from', 'cc',
'content', 'subject');


pgsql-performance by date:

Previous
From: "luke.78@libero.it"
Date:
Subject: function difference(geometry,geometry) is SLOW!
Next
From: Alan Hodgson
Date:
Subject: Re: Tsearch2 Initial Search Speed