Re: Documentation of bt_page_items()'s ctid field - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: Documentation of bt_page_items()'s ctid field
Date
Msg-id 54A30945.60306@vmware.com
In response to Re: Documentation of bt_page_items()'s ctid field  (Peter Geoghegan <pg@heroku.com>)
Responses Re: Documentation of bt_page_items()'s ctid field  (Peter Geoghegan <pg@heroku.com>)
List pgsql-hackers
On 12/30/2014 10:07 PM, Peter Geoghegan wrote:
> On Tue, Dec 30, 2014 at 8:59 AM, Heikki Linnakangas
> <hlinnakangas@vmware.com> wrote:
>> How much detail on the b-tree internals do we want to put in the pageinspect
>> documentation? I can see that being useful, but should we also explain e.g.
>> that the first item on each (non-rightmost) page is the high key?
>
> Maybe we should. I see no reason not to, and I think that it makes
> sense to explain things at that level without going into flags and so
> on. But don't forget that that isn't quite the full story if we're
> going to talk about high keys at all; we must also explain "minus
> infinity" keys, alongside any explanation of the high key:

Yeah, good point.

>   * CRUCIAL NOTE: on a non-leaf page, the first data key is assumed to be
>   * "minus infinity": this routine will always claim it is less than the
>   * scankey.  The actual key value stored (if any, which there probably isn't)
>   * does not matter.  This convention allows us to implement the Lehman and
>   * Yao convention that the first down-link pointer is before the first key.
>   * See backend/access/nbtree/README for details.
>
> In particular, this means that the key data is garbage, which is
> something I've also seen causing confusion [1].

In practice, we never store an actual key value for the "minus
infinity" key. I guess the code would ignore one if it were there, but it
would make more sense to explain that the first data key on an internal
page does not have a key value. If there is a value there, it's a sign
that something's wrong.
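
To make the convention concrete, here is a minimal sketch of the comparison
rule described in the quoted comment. It is not the actual nbtree code:
SketchPage, first_data_offset() and sketch_compare() are hypothetical names,
keys are reduced to plain ints, and I'm assuming the usual layout where the
high key, when present, sits at offset 1 and the first data key follows it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for a B-tree page (assumption). */
typedef struct
{
    bool is_leaf;       /* is this a leaf page? */
    bool is_rightmost;  /* rightmost page on its level, so no high key */
} SketchPage;

/* 1-based offset of the first data key; a high key, if any, is at offset 1. */
int
first_data_offset(const SketchPage *page)
{
    return page->is_rightmost ? 1 : 2;
}

/*
 * Returns <0, 0 or >0 as scankey is less than, equal to or greater than
 * the item at offnum.
 */
int
sketch_compare(const SketchPage *page, int offnum, int scankey, int stored_key)
{
    assert(offnum >= 1);

    /*
     * "Minus infinity": on a non-leaf page the first data key always
     * compares as less than the scankey; stored_key is never looked at.
     */
    if (!page->is_leaf && offnum == first_data_offset(page))
        return 1;

    return (scankey > stored_key) - (scankey < stored_key);
}
```

With this rule, sketch_compare() on the first data key of an internal page
returns 1 no matter what stored_key holds, which is why whatever appears in
that item's data should not be read as a real key.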

> I would like to make it easier for competent non-experts on the B-Tree
> code to eyeball a B-Tree with pageinspect, and be reasonably confident
> that things add up. In order for such people to know that something is
> wrong, we should explain what "right" looks like in moderate detail.

Makes sense.

>> I had a hard time understanding the remark about the root page. But in any
>> case, if you look at the flags set e.g. with bt_page_stats(), the root page
>> is flagged as also being a leaf page, when it is the only page in the index.
>> So the root page is considered also a leaf page in that case.
>
> I think that a better way of handling that originally would have been
> to make root-ness a separate property from leaf-ness/internal-ness.

Hmm, yeah, bt_page_stats() currently returns 'l' in the type column when
the page's flags are (BTP_ROOT | BTP_LEAF), i.e. when both are set.
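
A rough sketch of why that happens, assuming the type column is picked by
testing the flags in a fixed order with the leaf check ahead of the root
check. sketch_page_type() is a hypothetical name, and the flag values and
the ordering below are my recollection of nbtree.h and pageinspect, not
copied from the source, so treat them as assumptions.

```c
#include <stdint.h>

/* Flag bits as assumed here; check nbtree.h for the authoritative values. */
#define BTP_LEAF      (1 << 0)   /* leaf page */
#define BTP_ROOT      (1 << 1)   /* root page */
#define BTP_DELETED   (1 << 2)   /* page has been deleted from the tree */
#define BTP_HALF_DEAD (1 << 4)   /* empty, but still linked into the tree */

/* Map a page's flag bits to a one-character type, first match wins. */
char
sketch_page_type(uint16_t flags)
{
    if (flags & BTP_DELETED)
        return 'd';
    if (flags & BTP_HALF_DEAD)
        return 'e';
    if (flags & BTP_LEAF)
        return 'l';     /* tested before root, so a root+leaf page is 'l' */
    if (flags & BTP_ROOT)
        return 'r';
    return 'i';         /* internal page */
}
```

Under that ordering a single-page index, whose only page has both
BTP_ROOT and BTP_LEAF set, is reported as 'l', and 'r' only ever shows up
for a root that is not a leaf.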

- Heikki



