Re: WIP: Fast GiST index build - Mailing list pgsql-hackers

From: Heikki Linnakangas
Subject: Re: WIP: Fast GiST index build
Msg-id: 4E2EF83A.8000304@enterprisedb.com
In response to: Re: WIP: Fast GiST index build (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>)
List: pgsql-hackers
On 25.07.2011 21:52, Heikki Linnakangas wrote:
> On 22.07.2011 12:38, Alexander Korotkov wrote:
>> A patch with my attempt to detect ordered datasets is attached. The
>> implemented idea is described below.
>> Index tuples are divided into chunks of 128. For each chunk we measure
>> how many of the leaf pages where index tuples were inserted don't match
>> those of the previous chunk. Based on statistics over several chunks we
>> estimate the distribution of accesses between leaf pages (an exponential
>> distribution law is assumed, and that seems to be an error). After that
>> we can estimate the portion of index tuples which can be processed
>> without actual I/O. If this estimate exceeds a threshold then we should
>> switch to buffering build.
>> My implementation now successfully detects randomly mixed datasets and
>> well-ordered datasets, but it seems to be too optimistic about
>> intermediate cases. I believe that's due to the wrong assumption about
>> the distribution law.
>> Do you think this approach is acceptable? Are there perhaps existing
>> studies of the distribution law for such cases (I didn't find anything
>> relevant in Google Scholar)?
>
> Great! It would be nice to find a more scientific approach to this, but
> that's probably fine for now. It's time to start cleaning up the patch
> for eventual commit.
>
> You got rid of the extra page pins, which is good, but I wonder why you
> still pre-create all the GISTLoadedPartItem structs for the whole
> subtree in loadTreePart()? Can't you create those structs on the fly,
> when you descend the tree? I understand that it's difficult to update
> all the parent pointers as trees are split, but it feels like there's
> way too much bookkeeping going on. Surely it's possible to simplify it
> somehow...

That was a quite off-the-cuff remark, so I took the patch and culled out
the loaded-tree bookkeeping. A lot of other changes fell out of that, so
it took me quite some time to get it working again, but here it is. This
is a *lot* smaller patch, although that's partly explained by the fact
that I left out some features: the prefetching and neighbor-relocation
code is gone.

I'm pretty exhausted by this, so I just wanted to send this out without
further analysis. Let me know if you have questions about the approach
taken. I'm also not sure how this performs compared to your latest
patch; I haven't done any performance testing. Feel free to use this as
is, or as a source of inspiration :-).

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com
