Re: WIP: Fast GiST index build - Mailing list pgsql-hackers

From Alexander Korotkov
Subject Re: WIP: Fast GiST index build
Date
Msg-id CAPpHfdvSC3p7zR_BwbNhm+3fvg8b-rwe+p=HJ9Gii8_rethbuw@mail.gmail.com
In response to Re: WIP: Fast GiST index build  (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>)
Responses Re: WIP: Fast GiST index build
List pgsql-hackers
On Wed, Jul 13, 2011 at 5:59 PM, Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote:
One thing that caught my eye is that when you empty a buffer, you load the entire subtree below that buffer, down to the next buffered or leaf level, into memory. Every page in that subtree is kept pinned. That is a problem; in the general case, the buffer manager can only hold a modest number of pages pinned at a time. Consider that the minimum value for shared_buffers is just 16. That's unrealistically low for any real system, but the default is only 32MB, which equals just 4096 buffers. A subtree could easily be larger than that.
With level step = 1 we need only 2 levels in the subtree. With the minimum index tuple size (12 bytes), each page can hold at most 675 tuples. Thus I think the default shared_buffers is enough for level step = 1. I believe it's enough to add a check that we have sufficient shared_buffers, isn't it?
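The arithmetic behind this estimate can be sketched as follows. This is only an illustration under stated assumptions: the fanout of 675 and the 32MB default are the figures quoted in this thread, the 8192-byte block size is the usual PostgreSQL default, and the function names are hypothetical, not part of the patch.

```python
BLCKSZ = 8192  # assumed default PostgreSQL block size, in bytes


def buffers_for(shared_buffers_bytes, block_size=BLCKSZ):
    """Number of buffer pages available for a given shared_buffers size."""
    return shared_buffers_bytes // block_size


def max_subtree_pages(fanout, levels):
    """Worst-case page count of a full subtree: 1 + f + f^2 + ... over `levels` levels."""
    return sum(fanout ** i for i in range(levels))


def subtree_fits(level_step, fanout=675, shared_buffers_bytes=32 * 1024 * 1024):
    """True if a worst-case subtree of level_step + 1 levels can be pinned at once."""
    pages = max_subtree_pages(fanout, level_step + 1)
    return pages <= buffers_for(shared_buffers_bytes)
```

Under these assumptions, level step = 1 needs at most 1 + 675 = 676 pinned pages against 4096 available buffers, while level step = 2 could need 1 + 675 + 675^2 pages in the worst case, which is why a check against shared_buffers would still be needed.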
 
I don't think you're benefiting at all from the buffering that BufFile does for you, since you're reading/writing a full block at a time anyway. You might as well use the file API in fd.c directly, i.e. OpenTemporaryFile/FileRead/FileWrite.
BufFile distributes temporary data across several files. AFAICS PostgreSQL avoids working with files larger than 1GB, and the total size of the tree buffers can easily exceed that. Without BufFile I would need to maintain the set of files manually.
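To illustrate the bookkeeping that would otherwise have to be done by hand, here is a minimal sketch of mapping a logical block number onto a (segment file, byte offset) pair. It assumes 1GB segments and 8192-byte blocks; the names are hypothetical and this is not BufFile's actual code.

```python
BLCKSZ = 8192                                  # assumed block size, in bytes
SEGMENT_BYTES = 1 << 30                        # assumed 1GB limit per physical file
BLOCKS_PER_SEGMENT = SEGMENT_BYTES // BLCKSZ   # 131072 blocks per segment


def locate_block(blockno):
    """Map a logical block number to (segment index, byte offset in that segment)."""
    segment, block_in_segment = divmod(blockno, BLOCKS_PER_SEGMENT)
    return segment, block_in_segment * BLCKSZ
```

Every read or write of a buffer page would have to go through a mapping like this, plus opening, tracking, and cleaning up each segment file, which is the part BufFile already handles.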

------
With best regards,
Alexander Korotkov. 
