Tsearch2 performance on big database - Mailing list pgsql-performance
From      | Rick Jansen
Subject   | Tsearch2 performance on big database
Date      |
Msg-id    | 42400F57.3040604@rockingstone.nl
Responses | Re: Tsearch2 performance on big database
List      | pgsql-performance
Hi,

I'm looking for a *fast* solution to search thru ~ 4 million records of book descriptions. I've installed PostgreSQL 8.0.1 on a dual Opteron server with 8G of memory, running Linux 2.6. I haven't done a lot of tuning on PostgreSQL itself, but here are the settings I have changed so far:

shared_buffers = 2000  (anything much bigger and the kernel doesn't allow it; I still have to look into that)
effective_cache_size = 32768

Here's my table:

ilab=# \d books
                                         Table "public.books"
    Column     |          Type          |                         Modifiers
---------------+------------------------+-----------------------------------------------------------
 recordnumber  | integer                | not null default nextval('books_recordnumber_seq'::text)
 membernumber  | integer                | not null default 0
 booknumber    | character varying(20)  | not null default ''::character varying
 author        | character varying(60)  | not null default ''::character varying
 titel         | text                   | not null
 description   | character varying(100) | not null default ''::character varying
 descriprest   | text                   | not null
 price         | bigint                 | not null default 0::bigint
 keywords      | character varying(100) | not null default ''::character varying
 dollarprice   | bigint                 | not null default 0::bigint
 countrynumber | smallint               | not null default 0::smallint
 entrydate     | date                   | not null
 status        | smallint               | not null default 0::smallint
 recordtype    | smallint               | not null default 0::smallint
 bookflags     | smallint               | not null default 0::smallint
 year          | smallint               | not null default 0::smallint
 firstedition  | smallint               | not null default 0::smallint
 dustwrapper   | smallint               | not null default 0::smallint
 signed        | smallint               | not null default 0::smallint
 cover         | smallint               | not null default 0::smallint
 specialfield  | smallint               | not null default 0::smallint
 idxfti        | tsvector               |
Indexes:
    "recordnumber_idx" unique, btree (recordnumber)
    "idxfti_idx" gist (idxfti)

idxfti is a tsvector of the concatenated description and descriprest fields.

ilab=# select avg(character_length(description)), avg(character_length(descriprest)) from books;
         avg         |         avg
---------------------+----------------------
 89.1596992873947218 | 133.0468689304200538

Queries take forever to run. Right now we run a MySQL server, on which we maintain our own indices: we split the description fields by word and keep separate tables for words and the book descriptions they appear in. For example, a query for the word 'terminology' on our MySQL setup takes 5.8 seconds and returns 375 results. The same query on PostgreSQL, using the tsearch2 index, takes 30802.105 ms and returns 298 results.

How do I speed this up? Should I change settings, add or change indexes, or.. what?

Rick Jansen
--
Systems Administrator for Rockingstone IT
http://www.rockingstone.com
http://www.megabooksearch.com - Search many book listing sites at once
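[Editor's note: the post does not show the exact statement that was timed. A typical tsearch2 query of the kind described, against the idxfti column and its GiST index, would look roughly like the sketch below; the 'default' configuration name and the selected columns are assumptions for illustration, and EXPLAIN ANALYZE is included only to show how one would check whether idxfti_idx is actually chosen.]

-- Sketch of a tsearch2 full-text match on the schema above (not the
-- poster's exact query; the 'default' config name is an assumption).
EXPLAIN ANALYZE
SELECT recordnumber, author, titel
  FROM books
 WHERE idxfti @@ to_tsquery('default', 'terminology');

If the resulting plan shows a sequential scan on books rather than an index scan on idxfti_idx, the GiST index is not being used for the match.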