Re: GiST buffering build, bug in levelStep calculation - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: GiST buffering build, bug in levelStep calculation
Msg-id 4FC73B84.7020502@enterprisedb.com
In response to Re: GiST buffering build, bug in levelStep calculation  (Alexander Korotkov <aekorotkov@gmail.com>)
Responses Re: GiST buffering build, bug in levelStep calculation  (Alexander Korotkov <aekorotkov@gmail.com>)
List pgsql-hackers
On 29.05.2012 23:46, Alexander Korotkov wrote:
> On Wed, May 30, 2012 at 12:25 AM, Tom Lane<tgl@sss.pgh.pa.us>  wrote:
>
>> Alexander Korotkov<aekorotkov@gmail.com>  writes:
>>> On Tue, May 29, 2012 at 11:42 PM, Tom Lane<tgl@sss.pgh.pa.us>  wrote:
>>>> While I'm looking at this, is the first test involving
>>>> effective_cache_size bulletproof either?  In particular, is
>>>> avgIndexTuplesPerPage clamped to be strictly greater than 1?
>>
>>> It's based on collected statistics on already inserted tuple sizes. Since
>>> tuple sizes are measured after possible toasting, I don't see the way
>>> for avgIndexTuplesPerPage to be less than 1.
>>
>> Yeah, but if it could be *equal* to one, you've got a zero-divide there.
>>
>
> avgIndexTuplesPerPage is calculated as:
>
> avgIndexTuplesPerPage = pageFreeSpace / itupAvgSize;
>
> I think size of each index tuple must be at least few times lower
> than pageFreeSpace to let us create any index.

Hmm, in theory, it seems possible that every leaf level index tuple 
would completely fill an index page. Not sure how useful such an index 
would be, though. On internal pages, at least, you have to fit at least 
two tuples on a page or you can't build a tree.
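
To make the zero-divide concern concrete, here is a minimal sketch in C. It assumes that the subtree size is computed with the closed-form geometric series, which divides by (1 - fan-out). That is an assumption made for illustration, not a quote of the actual buffering build code. The helper names avg_tuples_per_page and subtree_size are hypothetical. The guard rejects any fan-out that is not strictly greater than 1.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper: average number of index tuples per page,
 * mirroring avgIndexTuplesPerPage = pageFreeSpace / itupAvgSize.
 */
static double
avg_tuples_per_page(double page_free_space, double itup_avg_size)
{
    assert(itup_avg_size > 0.0);
    return page_free_space / itup_avg_size;
}

/*
 * Hypothetical helper: number of pages in a subtree that spans
 * level_step + 1 levels, i.e. 1 + f + f^2 + ... + f^level_step,
 * written in closed form as (1 - f^(level_step + 1)) / (1 - f).
 *
 * Returns false without touching *result when fanout <= 1, because
 * the closed form would divide by zero at exactly 1 and a tree
 * cannot be built with a smaller fan-out anyway.
 */
static bool
subtree_size(double fanout, int level_step, double *result)
{
    double power = 1.0;
    int    i;

    if (fanout <= 1.0)
        return false;

    /* power = fanout ^ (level_step + 1) */
    for (i = 0; i <= level_step; i++)
        power *= fanout;

    *result = (1.0 - power) / (1.0 - fanout);
    return true;
}
```

A caller would feed the result of avg_tuples_per_page() into subtree_size() and fall back to a non-buffered build, or clamp the fan-out, whenever the function returns false.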

I note that the calculations assume that leaf tuples and internal tuples 
have similar sizes. We calculate the average leaf tuple size, and use 
that to calculate the fan-out of internal pages. On some GiST opclasses, 
the values stored on internal pages might be quite different from the 
leaf tuples. I don't think we need to worry about that in practice, 
since these calculations are not very accurate anyway, but perhaps a 
comment would be in order.

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com

