Re: WIP: Fast GiST index build - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: WIP: Fast GiST index build
Msg-id 4DEA4B86.9080900@enterprisedb.com
In response to WIP: Fast GiST index build  (Alexander Korotkov <aekorotkov@gmail.com>)
List pgsql-hackers
On 03.06.2011 14:02, Alexander Korotkov wrote:
> Hackers,
>
> WIP patch of fast GiST index build is attached. Code is dirty and comments
> are lacking, but it works. Now it is ready for first benchmarks, which
> should prove efficiency of selected technique. It's time to compare fast
> GiST index build with repeat insert build on large enough datasets (datasets
> which don't fit to cache). There are following aims of testing:
> 1) Measure acceleration of index build.
> 2) Measure change in index quality.
> I'm going to do first testing using synthetic datasets. Everybody who have
> interesting real-life datasets for testing are welcome.

I did some quick performance testing of this. I installed PostGIS 1.5
and loaded an extract of the OpenStreetMap data covering Finland. The
biggest GiST index in that data set is the idx_nodes_geom index on the
nodes table. I have maintenance_work_mem and shared_buffers both set to
512 MB, and this laptop has 4 GB of RAM.

Without the patch, reindexing the index takes about 170 seconds and the 
index size is 321 MB. And with the patch, it takes about 150 seconds, 
and the resulting index size is 319 MB.
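
To put those figures in relative terms, here is a minimal sketch of the
arithmetic. The helper name relative_change is made up for illustration,
and the inputs are just the approximate timings and sizes quoted above,
not new measurements.

```python
def relative_change(before: float, after: float) -> float:
    """Return the change from `before` to `after` as a fraction of `before`."""
    return (after - before) / before


# Approximate figures from the reindex runs described above.
build_time = relative_change(170.0, 150.0)  # seconds, without vs. with patch
index_size = relative_change(321.0, 319.0)  # MB, without vs. with patch

print(f"build time: {build_time:+.1%}")  # prints "build time: -11.8%"
print(f"index size: {index_size:+.1%}")  # prints "index size: -0.6%"
```

So on this data set the build is roughly 12% faster, and the index is
less than 1% smaller.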

The nodes table is 618 MB in size, so it fits in RAM. I presume the gain
would be bigger if it didn't, as the random I/O to update the index
would start to hurt more. But this shows that even when the table does
fit in RAM, this patch helps a little bit, and the resulting index size
is comparable.

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com

