Re: Fast insertion indexes: why no developments - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: Fast insertion indexes: why no developments |
Date | |
Msg-id | CA+TgmoaSxhFjGVYJgWBQGVwH=EzEk+VJp5QSK9h-uZg+H+KkRA@mail.gmail.com Whole thread Raw |
In response to | Re: Fast insertion indexes: why no developments (Simon Riggs <simon@2ndQuadrant.com>) |
Responses |
Re: Fast insertion indexes: why no developments
Re: Fast insertion indexes: why no developments Re: Fast insertion indexes: why no developments |
List | pgsql-hackers |
On Sat, Nov 2, 2013 at 6:07 AM, Simon Riggs <simon@2ndquadrant.com> wrote: > On 29 October 2013 16:10, Peter Geoghegan <pg@heroku.com> wrote: >> On Tue, Oct 29, 2013 at 7:53 AM, Leonardo Francalanci <m_lists@yahoo.it> wrote: >>> I don't see much interest in insert-efficient indexes. >> >> Presumably someone will get around to implementing a btree index >> insertion buffer one day. I think that would be a particularly >> compelling optimization for us, because we could avoid ever inserting >> index tuples that are already dead when the deferred insertion >> actually occurs. > > That's pretty much what the LSM-tree is. What is pretty cool about this sort of thing is that there's no intrinsic reason the insertion buffer needs to be block-structured or disk-backed. In theory, you can structure the in-memory portion of the tree any way you like, using pointers and arbitrary-size memory allocations and all that fun stuff. You need to log that there's a deferred insert (or commit to flushing the insertion buffer before every commit, which would seem to miss the point) so that recovery can reconstruct the in-memory data structure and flush it, but that's it: the WAL format need not know any other details of the in-memory portion of the tree. I think that, plus the ability to use pointers and so forth, might lead to significant performance gains. In practice, the topology of our shared memory segment makes this a bit tricky. The problem isn't so much that it's fixed size as that it lacks a real allocator, and that all the space used for shared_buffers is nailed down and can't be borrowed for other purposes. I'm very interested in solving the problem of getting a real allocator for shared memory because I think I need it for parallel sort, and even if there's a way to avoid needing it there I have a strong feeling that we'll want it for other applications of dynamic shared memory. But it should be possible to write the allocator in such a way that you just give it a chunk of memory to manage, and it does, remaining agnostic about whether the memory is from the main shared memory segment or a dynamic one. Of course, it's possible that even we do get a shared memory allocator, a hypothetical person working on this project might prefer to make the data block-structured anyway and steal storage from shared_buffers. So my aspirations in this area may not even be relevant. But I wanted to mention them, just in case anyone else is thinking about similar things, so that we can potentially coordinate. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
pgsql-hackers by date: