Thread: GSoC proposal: Fast GiST index build
Hello!
Here is the text of the proposal which I've recently submitted to GSoC and have mentioned here before.
Any comments are welcome.
Fast GiST index build
Synopsis
Currently, GiST indexes don't have any bulk load functionality; a new index has to be built by inserting entries one by one. This makes new index creation relatively slow in comparison with btree and even GIN. There are various works in computer science about bulk operations on R-trees. Since GiST is basically a generalization of the R-tree, some of them seem to be applicable to GiST. Paper [2] presents R-tree bulk operation techniques. Despite some issues related to the differences between GiST and the R-tree, the techniques of this paper seem to be applicable to GiST.
Benefits to the PostgreSQL Community
A faster GiST index build would be valuable for many PostgreSQL applications. For example, some GIS systems suffer from index builds that take many hours or even days.
Quantifiable results
Acceleration of GiST index build.
Project Details
Paper [2] presents R-tree bulk operation techniques which are, in general, an application of the buffer tree [1] to the R-tree. The technique proposed in [2] can be very briefly described as follows. Additional buffers are attached to nodes at certain levels (levels are selected with some step). When an entry reaches a node with a buffer, it is placed into the buffer. When a buffer overflows, its contents run down into lower buffers or leaf pages.
The results of paper [2] can have the following applications to PostgreSQL GiST:
1) Fast GiST index creation. The proposed bulk insert technique can be applied most straightforwardly to GiST index creation. Since the paper considers R-trees, key length there is fixed. In GiST we can deal with varlena keys, which leads to complications. For example, a page split can occur while an index entry is being placed into the appropriate buffer. Another difficulty with varlena keys is that the buffer size and the level step of the buffers are calculated from the number of entries per page, and with varlena keys this number is not constant. Since these calculations are used only to guarantee operation performance, we can estimate the number of entries per page (using the average key length of a sample of keys); the performance guarantees just won't be as strong. A rough sketch of such an estimation is given after this list.
2) Bulk insert. Since the access method interface in PostgreSQL doesn't contain a bulk insert function, bulk insert operations are performed by inserting entries one by one, and the access method doesn't know how many entries to expect. This circumstance makes applying the bulk insert technique to PostgreSQL GiST difficult. With the current access method interface, bulk insert could be implemented using a technique like fastupdate in GIN. But note that such a technique forces concurrent index searches to scan the buffers. Since the buffer size is significant (the size of the lowest buffers is comparable with the size of the whole subtree), this can lead to a significant slowdown. Finding a compromise between index build acceleration and concurrent search slowdown is a separate research topic, and it doesn't seem feasible to fit it into GSoC.
3) Bulk delete. The current bulk delete operation in PostgreSQL GiST is fast, but it doesn't perform index restructuring, which can degrade the index during bulk delete. The bulk delete technique in paper [2] focuses on accelerating a bulk delete procedure that does perform index restructuring. The problem with applying this approach to GiST is that the GiST implementation doesn't provide any explicit guarantees about storage utilization; if such guarantees exist, they are hidden in the picksplit access method. Applying the bulk delete technique requires the GiST access method to know the storage utilization guarantees of the operator class. In turn, that requires GiST operator class refactoring, and it doesn't seem feasible to fit it into GSoC.
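The estimation of entries per page for varlena keys mentioned in item 1 could look roughly like the following minimal sketch in C. This is only an illustration under assumptions, not PostgreSQL code: the constants and function names (PAGE_OVERHEAD, estimate_entries_per_page, and so on) are hypothetical, and collecting the sample of key sizes is left to the caller.

/*
 * Hypothetical sketch: estimate how many index entries fit on one page when
 * the key length is not fixed (varlena keys), using the average size of a
 * sample of keys.  Constants and names are illustrative only.
 */
#include <stddef.h>

#define PAGE_SIZE       8192    /* assumed page size in bytes */
#define PAGE_OVERHEAD    100    /* assumed per-page header/special space */
#define ENTRY_OVERHEAD    16    /* assumed per-entry overhead */

/* Average key size over a sample of already-seen keys. */
double
avg_key_size(const size_t *key_sizes, int nkeys)
{
    double  sum = 0.0;
    int     i;

    for (i = 0; i < nkeys; i++)
        sum += (double) key_sizes[i];
    return (nkeys > 0) ? sum / nkeys : 0.0;
}

/*
 * Estimate of the number of entries per page.  Since this estimate is used
 * only to size the buffers and to choose the level step, an approximation is
 * enough; the performance guarantees just become approximate as well.
 */
int
estimate_entries_per_page(const size_t *key_sizes, int nkeys)
{
    double  avg = avg_key_size(key_sizes, nkeys);
    int     n;

    if (avg <= 0.0)
        return 1;
    n = (int) ((PAGE_SIZE - PAGE_OVERHEAD) / (avg + ENTRY_OVERHEAD));
    return (n > 0) ? n : 1;
}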
Accordingly, in this GSoC project fast GiST index creation will be implemented. The results of this implementation can help to find a way to implement the additional applications noted above.
Inch-stones
1) Solve architectural questions with the help of the community. In this step I'm going to work out an approach to varlena key handling and discuss buffer handling issues.
2) First, approximate implementation. At this step the implementation might not always work properly with varlena keys, and significant slowdown is possible in some corner cases.
3) Accurate implementation of varlena key handling and detailed testing of corner cases. This step involves detailed consideration of particular corner-case test cases which might cause a performance dip, and implementing solutions for them.
4) Final refactoring, documentation, testing and commit.
Project Schedule
Until May 31:
Solve architectural questions with the help of the community.
June 1 - June 30:
First, approximate implementation.
July 1 - July 31:
Implementation of varlena key handling and detailed testing of corner cases.
August 1 - August 16:
Final refactoring, testing and commit.
Completeness Criteria
Fast GiST index creation is implemented, working and committed.
Links
1) L. Arge. The Buffer Tree: A New Technique for Optimal I/O-Algorithms. In S. G. Akl, F. Dehne, J.-R. Sack, and N. Santoro, editors, Algorithms and Data Structures, 4th International Workshop, WADS '95, volume 955 of Lecture Notes in Computer Science, pages 334-345. Springer, 1995.
2) L. Arge, K. Hinrichs, J. Vahrenhold, and J. S. Vitter. Efficient Bulk Operations on Dynamic R-trees. In ALENEX '99: Selected Papers from the International Workshop on Algorithm Engineering and Experimentation.
With best regards,
Alexander Korotkov.
On Mon, Apr 4, 2011 at 7:16 AM, Alexander Korotkov <aekorotkov@gmail.com> wrote:
> Project name
> Fast GiST index build

Would/could/should this be implemented in a manner similar to the existing "GIN fast update" feature? It's occurred to me to wonder whether even btree indexes would benefit from this type of optimization.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes:
> On Mon, Apr 4, 2011 at 7:16 AM, Alexander Korotkov <aekorotkov@gmail.com> wrote:
>> Project name
>> Fast GiST index build

> Would/could/should this be implemented in a manner similar to the
> existing "GIN fast update" feature?

Fast build and fast update tend to be two different problems ...

> It's occurred to me to wonder whether even btree indexes would benefit
> from this type of optimization.

GIN fast update is a win when you can optimize the insertion of multiple occurrences of the same key. There isn't really any corresponding optimization possible in btree, AFAICS. (Heikki did some work awhile back on btrees with multiple TIDs per key, for low-cardinality tables, which might conceivably admit of a similar optimization. But I haven't heard anything about that in a long time. It wasn't real clear to me where the win over GIN would be for that.)

regards, tom lane
On Mon, Apr 4, 2011 at 7:04 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Mon, Apr 4, 2011 at 7:16 AM, Alexander Korotkov <aekorotkov@gmail.com> wrote:
>> Project name
>> Fast GiST index build
>
> Would/could/should this be implemented in a manner similar to the
> existing "GIN fast update" feature?
I've mentioned this problem in item #2 of the project details. In short: the problem is with concurrent selects. The buffer size is significant, and scanning the buffers during a concurrent select can cause a significant slowdown. Probably a compromise can be achieved by using smaller buffers or something like that, but that's a topic for separate research. It doesn't seem feasible for me to produce a production-ready solution to this problem during GSoC.
With best regards,
Alexander Korotkov.
On Mon, Apr 4, 2011 at 12:46 PM, Alexander Korotkov <aekorotkov@gmail.com> wrote:
> On Mon, Apr 4, 2011 at 7:04 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Mon, Apr 4, 2011 at 7:16 AM, Alexander Korotkov <aekorotkov@gmail.com> wrote:
>>> Project name
>>> Fast GiST index build
>>
>> Would/could/should this be implemented in a manner similar to the
>> existing "GIN fast update" feature?
>
> I've mentioned this problem in item #2 of the project details. In short: the
> problem is with concurrent selects. The buffer size is significant, and
> scanning the buffers during a concurrent select can cause a significant
> slowdown. Probably a compromise can be achieved by using smaller buffers or
> something like that, but that's a topic for separate research. It doesn't
> seem feasible for me to produce a production-ready solution to this problem
> during GSoC.

OK. Could you briefly describe the algorithm you propose to implement, bearing in mind that I haven't read the paper?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Mon, Apr 4, 2011 at 8:50 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> OK. Could you briefly describe the algorithm you propose to
> implement, bearing in mind that I haven't read the paper?
The technique can be very briefly described by the following rules (an illustrative sketch in C follows below).
M = number of index keys fitting in RAM;
B = number of index keys in one page;
1) Additional buffers of M/(2*B) pages each are attached to all nodes at certain levels. The levels are selected with step floor(log_B(M/(4*B))); leaf nodes don't contain buffers. I.e. nodes at levels i*floor(log_B(M/(4*B))), i = 1, 2, 3, ... contain buffers (numbering goes from bottom to top, level 0 contains the leaf nodes).
2) When an entry reaches a node with a buffer, it is placed into that buffer.
3) When a buffer overflows, its contents run down into lower buffers or leaf pages.
4) When a split occurs in a node with a buffer, that buffer is split into two buffers using the penalty function.
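To make the rules above a bit more concrete, here is a minimal sketch in C of how they could be expressed. This is only an illustration under assumptions, not PostgreSQL code: all types and helper functions (gist_node, buffer_add, choose_subtree_by_penalty and so on) are hypothetical placeholders, and details such as locking, WAL and on-page layout are ignored.

/*
 * Illustrative sketch of the buffered insertion rules above.
 * All names are hypothetical; this is not PostgreSQL source code.
 */
#include <math.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct gist_node   gist_node;    /* hypothetical node handle */
typedef struct index_entry index_entry;  /* hypothetical index entry */

/* Hypothetical helpers, assumed to exist elsewhere. */
extern int          node_level(const gist_node *node);        /* 0 for leaves */
extern bool         node_is_leaf(const gist_node *node);
extern void         leaf_add(gist_node *node, index_entry *entry);
extern void         buffer_add(gist_node *node, index_entry *entry);
extern bool         buffer_is_overflowed(const gist_node *node);
extern index_entry *buffer_pop(gist_node *node);               /* NULL when empty */
extern gist_node   *choose_subtree_by_penalty(gist_node *node, index_entry *e);

/* Rule 1: buffers sit on every level_step-th level above the leaves. */
int
buffer_level_step(double M, double B)
{
    int step = (int) floor(log(M / (4.0 * B)) / log(B));

    return (step < 1) ? 1 : step;
}

bool
node_has_buffer(const gist_node *node, int level_step)
{
    int level = node_level(node);

    return level > 0 && level % level_step == 0;
}

/* Rules 2 and 3: place an entry, emptying an overflowed buffer downwards. */
void
buffered_insert(gist_node *node, index_entry *entry, int level_step)
{
    if (node_is_leaf(node))
    {
        leaf_add(node, entry);              /* may cause a page split */
        return;
    }

    if (node_has_buffer(node, level_step))
    {
        buffer_add(node, entry);            /* rule 2 */
        if (buffer_is_overflowed(node))
        {
            /*
             * Rule 3: push buffered entries down towards the next buffered
             * level or the leaves, choosing subtrees by the penalty function.
             */
            index_entry *e;

            while ((e = buffer_pop(node)) != NULL)
                buffered_insert(choose_subtree_by_penalty(node, e), e,
                                level_step);
        }
        return;
    }

    /* No buffer on this node: descend as in an ordinary GiST insertion. */
    buffered_insert(choose_subtree_by_penalty(node, entry), entry, level_step);
}

/*
 * Rule 4 is not shown: when a node with a buffer is split, its buffer is
 * also split into two, distributing buffered entries by the penalty function.
 */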
----
With best regards,
Alexander Korotkov.
Just to clarify the situation a bit. I noticed the buffer tree technique while researching sp-gist and got the idea to use it to improve CREATE INDEX for GiST, which is something we have been looking for many times. Alexander is working on his thesis and this project suits him and the community ideally. Since Teodor and I are very busy at the moment, it's very important to have one more GiST developer available, especially keeping in mind Alexander's energy and motivation. He has already made several contributions, and I have no doubt his work will be useful for us. So, I suggest we support his work!

Oleg

On Wed, 6 Apr 2011, Alexander Korotkov wrote:

> On Mon, Apr 4, 2011 at 8:50 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>
>> OK. Could you briefly describe the algorithm you propose to
>> implement, bearing in mind that I haven't read the paper?
>
> The technique can be very briefly described by the following rules.
> M = number of index keys fitting in RAM;
> B = number of index keys in one page;
> 1) Additional buffers of M/(2*B) pages each are attached to all nodes at
> certain levels. The levels are selected with step floor(log_B(M/(4*B)));
> leaf nodes don't contain buffers. I.e. nodes at levels i*floor(log_B(M/(4*B))),
> i = 1, 2, 3, ... contain buffers (numbering goes from bottom to top, level 0
> contains the leaf nodes).
> 2) When an entry reaches a node with a buffer, it is placed into that buffer.
> 3) When a buffer overflows, its contents run down into lower buffers or leaf
> pages.
> 4) When a split occurs in a node with a buffer, that buffer is split into two
> buffers using the penalty function.
>
> ----
> With best regards,
> Alexander Korotkov.

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83