tsearch2 and gist index bloat - Mailing list pgsql-general

From George Essig
Subject tsearch2 and gist index bloat
Date
Msg-id 20031105202705.86860.qmail@web80204.mail.yahoo.com
Whole thread Raw
List pgsql-general
I have installed tsearch2 and have noticed that the gist index used to do searches grows and grows
as I update rows, delete rows, or run VACUUM FULL ANALYZE.  Below are some details:

PostgreSQL 7.4RC1
Red Hat 9

                                   Table "public.series"
    Column     |       Type        |                       Modifiers
---------------+-------------------+--------------------------------------------------------
 id            | integer           | not null default nextval('public.series_id_seq'::text)
 database      | character varying | not null
 name          | character varying | not null
 description   | character varying |
 documentation | character varying |
 ts_vec        | tsvector          |
Indexes:
    "series_pkey" primary key, btree (id)
    "series_db_name_un" unique, btree ("database", name)
    "ts_in" gist (ts_vec)
Triggers:
    ts_update BEFORE INSERT OR UPDATE ON series FOR EACH ROW EXECUTE PROCEDURE tsearch2('ts_vec',
'description', 'documentation')

There are 110,873 rows in this table and 13398 unique words indexed by ts_in.  Using oid2name, I
monitored the size of the index ts_in as I performed different operations:

154 MB After the index was created.
190 MB After updating 40,422 rows.
243 MB After VACUUM FULL
275 MB After deleting 40,422 rows & again VACUUM FULL

Below is some output from VACUUM FULL ANALYZE VERBOSE after the 40,422 rows were deleted.

INFO:  index "ts_in" now contains 70451 row versions in 2969 pages
DETAIL:  40422 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.27s/0.44u sec elapsed 1.03 sec.

As the index size grows, search performance slows to a crawl because it's too to fit in RAM.  Is
the only solution to drop and recreate the index after large updates?

Thanks,

George Essig

pgsql-general by date:

Previous
From: Alejandro Javier Pomeraniec
Date:
Subject: Getting time from date field
Next
From: Bruno Wolff III
Date:
Subject: Re: Getting time from date field