Creation of tsearch2 index is very slow - Mailing list pgsql-general

From Stephan Vollmer
Subject Creation of tsearch2 index is very slow
Date
Msg-id 43D0ED57.1020608@gmx.de
Whole thread Raw
Responses Re: Creation of tsearch2 index is very slow
Re: Creation of tsearch2 index is very slow
List pgsql-general
Hello!

I noticed that the creation of a GIST index for tsearch2 takes very
long - about 20 minutes. CPU utilization is 100 %, the resulting
index file size is ~25 MB. Is this behaviour normal?

Full text columns:   title          author_list
tsearch2 word lists: fti_title      fti_author_list
tsearch2 indexes:    idx_fti_title  idx_fti_author_list

The table has 700,000 records. When I create a normal B-Tree index
on the same column for testing purposes, it works quite fast
(approx. 30 seconds).

The columns that should be indexed are small, only about 10 words on
average.

System specs:
Athlon64 X2 3800+, 2 GB RAM
PostgreSQL 8.1.2, Windows XP SP2

I've never noticed this problem before, so could it probably be
related to v8.1.2? Or is something in my configuration or table
definition that causes this sluggishness?

Thanks very much in advance for your help!

- Stephan



This is the table definition:
-----------------------------------------------------------------
CREATE TABLE publications
(
  id int4 NOT NULL DEFAULT nextval('publications_id_seq'::regclass),
  publication_type_id int4 NOT NULL DEFAULT 0,
  keyword text NOT NULL,
  mdate date,
  "year" date,
  title text,
  fti_title tsvector,
  author_list text,
  fti_author_list tsvector,
  overview_timestamp timestamp,
  overview_xml text,
  CONSTRAINT publications_pkey PRIMARY KEY (keyword) USING INDEX
      TABLESPACE dblp_index,
  CONSTRAINT publications_publication_type_id_fkey FOREIGN KEY
     (publication_type_id)
      REFERENCES publication_types (id) MATCH SIMPLE
      ON UPDATE RESTRICT ON DELETE RESTRICT,
  CONSTRAINT publications_year_check CHECK (date_part('month'::text,
"year") = 1::double precision AND date_part('day'::text, "year") =
1::double precision)
)
WITHOUT OIDS;

CREATE INDEX fki_publications_publication_type_id
  ON publications
  USING btree
  (publication_type_id)
  TABLESPACE dblp_index;

CREATE INDEX idx_fti_author_list
  ON publications
  USING gist
  (fti_author_list)
  TABLESPACE dblp_index;

CREATE INDEX idx_fti_title
  ON publications
  USING gist
  (fti_title)
  TABLESPACE dblp_index;

CREATE INDEX idx_publications_year
  ON publications
  USING btree
  ("year")
  TABLESPACE dblp_index;

CREATE INDEX idx_publications_year_part
  ON publications
  USING btree
  (date_part('year'::text, "year"))
  TABLESPACE dblp_index;


CREATE TRIGGER tsvectorupdate_all
  BEFORE INSERT OR UPDATE
  ON publications
  FOR EACH ROW
  EXECUTE PROCEDURE multi_tsearch2();



Attachment

pgsql-general by date:

Previous
From: Martijn van Oosterhout
Date:
Subject: Re: How to convert Big5 to UTF8
Next
From: Stephan Vollmer
Date:
Subject: Re: Creation of tsearch2 index is very slow