Periodically slow inserts - Mailing list pgsql-performance
From | Gael Le Mignot |
---|---|
Subject | Periodically slow inserts |
Date | |
Msg-id | plop8762wvspfb.fsf@aoskar.kilobug.org Whole thread Raw |
Responses |
Re: Periodically slow inserts
Re: Periodically slow inserts Re: Periodically slow inserts Re: Periodically slow inserts Re: Periodically slow inserts |
List | pgsql-performance |
Hello, We are using PostgreSQL for storing data and full-text search indexes for the webiste of a daily newspaper. We are very happy overall with the results, but we have one "weird" behaviour that we would like to solve. The problem is when we index objects into the full-text search part of the database (which a DELETE and then an INSERT into a specific table), the INSERT sometimes take a long time (from 10s to 20s), but the same insert (and many other similar ones) are fast (below 0.2s). This slowness comes regularly, about every 200 objects indexed, regardless of the frequency of the inserts. If I reindex one object every 5 seconds for one hour, or one object every second for 10 minutes, I've the same kind of results : around 0.5% of the time, indexing took more than 10s. The positive point is that this slowness doesn't block the rest of queries to the database, but it's still painful to have to wait (even if only once in a while) for 10s or 20s when the end-user hits the "save" button. This slowness is associated with very high IO load on the operating system. I tried playing with checkpoint parameters (making them more frequent or less frequent, but I didn't notice any siginificant difference). Do you have any hint on how to smooth the process, so we don't have this regular huge slowdown ? If you want more details about the setup : - server is a Xen virtual machine with 8Gb of memory, disks being 15000 rpm SAS disks on RAID 1, and CPU being one core of a Nehalem processor (but CPU load is low anyway). - the database schema is like : CREATE TABLE sesql_index ( classname varchar(255), id integer, created_at timestamp, modified_at timestamp, created_by integer, modified_by integer, workflow_state integer, site_id integer, title_text text, title_tsv tsvector, subtitle_text text, subtitle_tsv tsvector, fulltext_text text, fulltext_tsv tsvector, authors integer[], folders integer[], [...] indexed_at timestamp DEFAULT NOW(), PRIMARY KEY (classname, id) ); CREATE TABLE sesql_author (CHECK (classname = 'Author'), PRIMARY KEY (classname, id)) INHERITS (sesql_index); CREATE TABLE sesql_program (CHECK (classname = 'Program'), PRIMARY KEY (classname, id)) INHERITS (sesql_index); CREATE TABLE sesql_default (CHECK (classname = 'Slideshow' OR classname = 'Book' OR classname = 'Article' OR classname = 'Publication' OR classname = 'Forum'), PRIMARY KEY (classname, id)) INHERITS (sesql_index); (with a few other similar tables for different objects). Inserts/deletes are done directly into the child tables, searches are done either on the master table (sesql_index) or on the child tables depending of the use case (but search works fine anyway). In addition to that we have several indexes, created on each child tables : CREATE INDEX sesql_default_classname_index ON sesql_default (classname); CREATE INDEX sesql_default_id_index ON sesql_default (id); CREATE INDEX sesql_default_created_at_index ON sesql_default (created_at); CREATE INDEX sesql_default_modified_at_index ON sesql_default (modified_at); CREATE INDEX sesql_default_created_by_index ON sesql_default (created_by); CREATE INDEX sesql_default_modified_by_index ON sesql_default (modified_by); CREATE INDEX sesql_default_workflow_state_index ON sesql_default (workflow_state); CREATE INDEX sesql_default_site_id_index ON sesql_default (site_id); CREATE INDEX sesql_default_publication_date_index ON sesql_default (publication_date); CREATE INDEX sesql_default_authors_index ON sesql_default USING GIN (authors); CREATE INDEX sesql_default_folders_index ON sesql_default USING GIN (folders); And the heavy ones, for each fulltext field, we have two columns, the text and the tsv, with an index on the tsv, and the tsv itself is updated via a trigger : CREATE INDEX sesql_default_fulltext_index ON sesql_default USING GIN (fulltext_tsv); CREATE TRIGGER sesql_default_fulltext_update BEFORE INSERT OR UPDATE ON sesql_default FOR EACH ROW EXECUTE PROCEDURE tsvector_update_trigger(fulltext_tsv, 'public.lem_french', fulltext_text); Thanks a lot for reading me until here ;) Regards, -- Gaël Le Mignot - gael@pilotsystems.net Pilot Systems - 9, rue Desargues - 75011 Paris Tel : +33 1 44 53 05 55 - www.pilotsystems.net Gérez vos contacts et vos newsletters : www.cockpit-mailing.com
pgsql-performance by date: