GSoC 2011: Fast GiST index build - Mailing list pgsql-hackers

From Alexander Korotkov
Subject GSoC 2011: Fast GiST index build
Date
Msg-id BANLkTi=kUBOX2e9TP6BmmPRDso70vn8KEw@mail.gmail.com
Responses Re: GSoC 2011: Fast GiST index build  (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>)
List pgsql-hackers
Hackers!

I was happy to learn that my proposal "Fast GiST index build" was accepted for GSoC 2011! Thank you very much for your support! Special thanks to Heikki Linnakangas for agreeing to be my mentor!

The first question I would like to discuss is node buffer storage. During the index build, each index page (except leaf pages) should have several pages of buffer. So my question is: where should the buffers be stored, and how should we operate on them? This is somewhat similar to the GIN fastupdate buffer, but there are differences. First, we have to manage many buffers instead of only one. Second, I believe we don't need to worry much about concurrency, because the algorithm is supposed to perform relatively large operations in memory (relocating entries between several buffers), which requires locking all of the buffers currently being operated on. I'm going to store the buffers separately from the index itself, because all of them should be freed once the index is built.
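
As a rough illustration of the storage idea above (many per-node buffers, several pages each, kept outside the index and dropped together at the end), here is a minimal Python sketch. It is not the proposed implementation: the class `NodeBufferStore`, its methods, the 2-byte length prefix, and the 8192-byte page size are all hypothetical choices made for the example, and a temporary file merely stands in for whatever storage the real patch would use.

```python
import tempfile

PAGE_SIZE = 8192  # assumed page size in bytes (hypothetical for this sketch)


class NodeBufferStore:
    """Hypothetical per-node buffers kept in a temp file, outside the index."""

    def __init__(self):
        self._file = tempfile.TemporaryFile()
        self._pages = {}       # index block number -> list of [file offset, bytes used]
        self._free = []        # offsets of released pages, available for reuse
        self._next_offset = 0  # where the next brand-new page would start

    def _new_page(self, node):
        # Reuse a released page if possible, otherwise extend the file.
        if self._free:
            offset = self._free.pop()
        else:
            offset = self._next_offset
            self._next_offset += PAGE_SIZE
        self._pages.setdefault(node, []).append([offset, 0])

    def push(self, node, entry):
        """Append one entry (bytes) to the buffer of the given index page."""
        record = len(entry).to_bytes(2, "little") + entry
        if len(record) > PAGE_SIZE:
            raise ValueError("entry does not fit on one buffer page")
        pages = self._pages.get(node)
        if not pages or pages[-1][1] + len(record) > PAGE_SIZE:
            self._new_page(node)
        page = self._pages[node][-1]
        self._file.seek(page[0] + page[1])
        self._file.write(record)
        page[1] += len(record)

    def page_count(self, node):
        return len(self._pages.get(node, []))

    def pop_all(self, node):
        """Empty a node's buffer and return its entries in insertion order,
        for example to relocate them into the buffers of child pages."""
        entries = []
        for offset, used in self._pages.pop(node, []):
            self._file.seek(offset)
            data = self._file.read(used)
            pos = 0
            while pos < used:
                size = int.from_bytes(data[pos:pos + 2], "little")
                entries.append(data[pos + 2:pos + 2 + size])
                pos += 2 + size
            self._free.append(offset)
        return entries

    def close(self):
        """Drop every buffer at once, as would happen when the build finishes."""
        self._pages.clear()
        self._free.clear()
        self._file.close()
```

In this sketch a caller would `push()` entries into the buffer of an inner page, call `pop_all()` when that buffer needs to be emptied into lower levels, and call `close()` once the index is built. No locking is shown, in line with the remark above that concurrency is not the main concern here.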

I found a very simple solution for dealing with varlena keys. The greatest buffer size and the minimal level step are reached when the key size is minimal, so the minimal key size is the worst case. Since the minimal varlena size is 4 bytes, we can use that in the initial calculations. I'm going to stick with this assumption in the first implementation.
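
One way to picture why the smallest key is the bounding case: the smaller the key, the more entries fit on a single index page. The Python sketch below estimates that per-page capacity for a given key size. Every layout constant in it (page size, header sizes, alignment) is an assumption chosen for illustration and not taken from this message, and the function `max_entries_per_page` is hypothetical. How exactly buffer size and level step are derived from this capacity is not spelled out here, so the sketch stops at the capacity itself.

```python
# All constants below are assumptions for illustration only.
PAGE_SIZE = 8192         # assumed page size in bytes
PAGE_HEADER = 24         # assumed fixed page header size
SPECIAL_SPACE = 16       # assumed per-page GiST-specific area
LINE_POINTER = 4         # assumed per-entry line pointer size
TUPLE_HEADER = 8         # assumed per-entry index tuple header size
ALIGNMENT = 8            # assumed tuple alignment
MIN_VARLENA_SIZE = 4     # minimal varlena size stated in the message


def align_up(size, alignment=ALIGNMENT):
    """Round size up to the next multiple of alignment."""
    return (size + alignment - 1) // alignment * alignment


def max_entries_per_page(key_size):
    """Estimate how many index entries with the given key size fit on a page."""
    usable = PAGE_SIZE - PAGE_HEADER - SPECIAL_SPACE
    per_entry = align_up(TUPLE_HEADER + key_size) + LINE_POINTER
    return usable // per_entry


if __name__ == "__main__":
    # Capacity only shrinks as keys grow, so the minimal key gives the bound.
    for key_size in (MIN_VARLENA_SIZE, 16, 64, 256):
        print(key_size, max_entries_per_page(key_size))
```

Under these assumed constants the capacity never increases as the key grows, which is the property that lets the initial calculations use the 4-byte minimum as the worst case.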

----
With best regards,
Alexander Korotkov.
