Re: GIN index build speed - Mailing list pgsql-hackers

From Jeff Davis
Subject Re: GIN index build speed
Date
Msg-id 1229905340.2285.39.camel@jdavis
In response to GIN index build speed  (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>)
List pgsql-hackers
On Tue, 2008-12-02 at 12:12 +0200, Heikki Linnakangas wrote:
> CREATE TABLE foo (bar tsvector);
> INSERT INTO foo SELECT to_tsvector('foo' || a) FROM generate_series(1, 
> 200000) a;
> CREATE INDEX foogin ON foo USING gin (bar);
> 
> The CREATE INDEX step takes about 40 seconds on my laptop, which seems 
> excessive.
> 

There seems to be a performance cliff right around the value you chose.
On my system:

  rows    build time
100000     2 s
125000     9 s
135000    22 s
150000    56 s

I suppose that makes sense, but I was a little surprised the drop-off
was so sharp.
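
A minimal sketch of how figures like the ones above could be reproduced, assuming the first column is the upper bound passed to generate_series and that the script is run in psql. The table and index names come from the quoted example; the DROP TABLE line and the use of \timing are additions for illustration, not necessarily how the numbers above were measured.

```sql
-- Sketch only: rerun with 100000, 125000, 135000 and 150000 in turn
-- and compare the time psql reports for CREATE INDEX.
\timing

DROP TABLE IF EXISTS foo;
CREATE TABLE foo (bar tsvector);

-- Each row holds one distinct lexeme ('foo1', 'foo2', ...), in increasing order.
INSERT INTO foo
SELECT to_tsvector('foo' || a) FROM generate_series(1, 100000) a;

-- This is the statement whose elapsed time is being measured.
CREATE INDEX foogin ON foo USING gin (bar);
```

Absolute times will depend on hardware and settings such as maintenance_work_mem, so only the shape of the curve is comparable between systems.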

Seems like it would be a useful patch for the next version. It may not be
useful for text search in normal situations (as Teodor mentioned), but
it may be useful for indexing arrays, which might be more likely to be
inserted in order.
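
For illustration, the array case might look like the sketch below. The table foo_arr, the index name, and the two-element arrays are hypothetical and chosen only to mimic keys arriving in increasing order; no timings are claimed for it.

```sql
-- Hypothetical array analogue of the tsvector test: every row adds
-- keys larger than those of the rows before it.
DROP TABLE IF EXISTS foo_arr;
CREATE TABLE foo_arr (bar integer[]);

INSERT INTO foo_arr
SELECT ARRAY[a, a + 1] FROM generate_series(1, 100000) a;

-- GIN has a built-in default operator class for integer arrays.
CREATE INDEX foo_arr_gin ON foo_arr USING gin (bar);
```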

Regards,
	Jeff Davis


