Re: GIN index creation extremely slow ? - Mailing list pgsql-hackers

From Teodor Sigaev
Subject Re: GIN index creation extremely slow ?
Date
Msg-id 44B40322.60704@sigaev.ru
Whole thread Raw
In response to GIN index creation extremely slow ?  (Stefan Kaltenbrunner <stefan@kaltenbrunner.cc>)
List pgsql-hackers
Try again, today's patch solves the problem.

Stefan Kaltenbrunner wrote:
> on IRC somebody mentioned that it took >34h to greate a GIN index (on a
> tsvector) on a ~3 Million column table (wikipedia dump) with a
> reasonable speced box (AMD 3400+).
> After getting hold of a dump of said table (around 4,1GB in size) I
> managed to get the following timings:
> 
> test=# CREATE INDEX idxFTI_idx ON wikipedia USING gist(vector);
> CREATE INDEX
> Time: 416122.896 ms
> 
> so about 7 minutes - sounds very reasonable
> 
> test=# CREATE INDEX idxFTI2_idx ON wikipedia USING gin(vector);
> CREATE INDEX
> Time: 52681605.101 ms
> 
> ouch - that makes for a whoppy 14,6hours(!). During that time the box is
> completely CPU bottlenecked and during virtually no IO at all - (varing
> maintainance_work_mem does not seem to make any noticable difference).
> 
> That box is a fast Dual Opteron 2.6Ghz with 8GB RAM and a 4 disk RAID10
> for the WAL and 12 disks for the data running a very recent -HEAD
> checkout ...
> 
> It looks like we still don't have any docs for GIN in the tree so I
> don't know if those timings are expected or not ...
> 
> 
> Stefan
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 2: Don't 'kill -9' the postmaster

-- 
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
  WWW: http://www.sigaev.ru/
 


pgsql-hackers by date:

Previous
From: David Fetter
Date:
Subject: Re: pgsql-patches considered harmful
Next
From: Hannu Krosing
Date:
Subject: Re: Max size of a btree index entry