Re: CREATE INDEX rather sluggish - Mailing list pgsql-performance
From | Simon Riggs |
---|---|
Subject | Re: CREATE INDEX rather sluggish |
Date | |
Msg-id | 1143738524.13549.103.camel@localhost.localdomain Whole thread Raw |
In response to | CREATE INDEX rather sluggish (Gavin Hamill <gdh@laterooms.com>) |
Responses |
Re: CREATE INDEX rather sluggish
|
List | pgsql-performance |
On Thu, 2006-03-30 at 09:26 +0100, Gavin Hamill wrote: > Hullo, I have pg 8.1.3 on an 8-CPU AIX 5.3 box with 16GB of RAM, and I'm > finding that it's taking an age to CREATE INDEX on a large table: > > Column | Type | Modifiers > ----------------+------------------------+--------------------------------------------------------------------- > ID | integer | not null default nextval(('public.keyword_id_seq'::text)::regclass) > Text | character varying(200) | > Longitude | numeric(16,5) | > Latitude | numeric(16,5) | > AreaID | integer | > SearchCount | integer | not null default 0 > Radius | integer | > LanguageID | integer | > KeywordType | character varying(20) | > LowerText | character varying(200) | > NumberOfHotels | integer | > CountryID | integer | > FriendlyText | character varying(200) | > Indexes: > > > 2006-03-29 21:39:38 BST LOG: duration: 41411.625 ms statement: CREATE INDEX ix_keyword_areaid ON "Keyword" USING btree("AreaID"); > 2006-03-29 21:42:46 BST LOG: duration: 188550.644 ms statement: CREATE INDEX ix_keyword_lonlat ON "Keyword" USING btree("Longitude", "Latitude"); > 2006-03-29 21:46:41 BST LOG: duration: 234864.571 ms statement: CREATE INDEX ix_keyword_lowertext ON "Keyword" USINGbtree ("LowerText"); > 2006-03-29 21:52:32 BST LOG: duration: 350757.565 ms statement: CREATE INDEX ix_keyword_type ON "Keyword" USING btree("KeywordType"); > > The table has just under six million rows - should it really be taking > nearly six minutes to add an index? These log snippets were taking > during a pg_restore on a newly created db, so there should be no issues > with the table needing vacuuming. The index build time varies according to the number and type of the datatypes, as well as the distribution of values in the table. As well as the number of rows in the table. Note the x10 factor to index AreaID (integer) v KeywordType (vchar(20)) > What parameters in the postgresql.conf are pertinent here? I have > > shared_buffers 120000 > work_mem 16384 > maintenance_work_mem = 262144 Try trace_sort = on and then rerun the index builds to see what's happening there. We've speeded sort up by about 2.5 times in the current development version, but it does just run in single threaded mode so your 8 CPUs aren't helping there. Looks like you might be just over the maintenance_work_mem limit for the last index builds. You can try doubling maintenance_work_mem. The extended runtime for KeywordType is interesting in comparison to LowerText, which on the face of it is a longer column. My guess would be that LowerText is fairly unique and sorts quickly, whereas KeywordType is fairly non-unique with a high average row length that require complete string comparison before deciding it is actually the same value. You might want to try using codes rather than textual KeywordTypes. You might try using partial indexes also, along the lines of CREATE INDEX ix_keyword_type ON "Keyword" USING btree ("KeywordType") WHERE KeywordType IS NOT NULL; Best Regards, Simon Riggs
pgsql-performance by date: