Re: Fractal tree indexing - Mailing list pgsql-hackers

From Andrew Borodin
Subject Re: Fractal tree indexing
Date
Msg-id CAJEAwVE_8VDTV1Utfe1CmbVRCvVPtvHZnSKFdMf0rB5mELSLeA@mail.gmail.com
Whole thread Raw
In response to Re: Fractal tree indexing  (Simon Riggs <simon@2ndQuadrant.com>)
List pgsql-hackers
Hi hackers!

Here is prototype of procrastinating GiST. Or lazy GiST. Or buffered GiST. Indeed, there is nothing fractal about it.

The concept is leaf tuples stall on internal pages. From time to time they are pushed down in batches. This code is far from perfect, I've coded this only for the research purposes.

I have tested code with insertion of random 3D cubes. On 1 million of insertions classic GiST is 8% faster, on 3M performance is equal, on 30M lazy GiST if 12% faster.
I've tested it with SSD disk, may be with HDD difference will be more significant.

In theory, this is cache friendly implementation (but not cache oblivious) and it must be faster even for small datasets without huge disk work. But it has there extra cost of coping batch of tuples to palloc`d array, couln't avoid that.

This is just a proof-of-concept for performance measrures:
1. make check fails for two tests
2. WAL is broken
3. code is a mess in some places

I'm not sure 12% of performance worth it. I'll appreciate any ideas on what to do next.

Best regards, Andrey Borodin.

2013-02-13 22:54 GMT+05:00 Simon Riggs <simon@2ndquadrant.com>:
On 13 February 2013 16:48, Heikki Linnakangas <hlinnakangas@vmware.com> wrote:
> On 13.02.2013 18:20, Tom Lane wrote:
>>
>> Heikki Linnakangas<hlinnakangas@vmware.com>  writes:
>>>
>>> The basic idea of a fractal tree index is to attach a buffer to every
>>> non-leaf page. On insertion, instead of descending all the way down to
>>> the correct leaf page, the new tuple is put on the buffer at the root
>>> page. When that buffer fills up, all the tuples in the buffer are
>>> cascaded down to the buffers on the next level pages. And recursively,
>>> whenever a buffer fills up at any level, it's flushed to the next level.
>>
>>
>> [ scratches head... ]  What's "fractal" about that?  Or is that just a
>> content-free marketing name for this technique?
>
>
> I'd call it out as a marketing name. I guess it's fractal in the sense that
> all levels of the tree can hold "leaf tuples" in the buffers; the structure
> looks the same no matter how deep you zoom, like a fractal.. But "Buffered"
> would be more appropriate IMO.

I hope for their sake there is more to it than that. It's hard to see
how buffering can be patented.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Attachment

pgsql-hackers by date:

Previous
From: Rushabh Lathia
Date:
Subject: Re: Gather Merge
Next
From: Ashutosh Bapat
Date:
Subject: Re: pg_hba_file_settings view patch