Thread: RE: btree split logic is fragile in the presence of lar ge index items

RE: btree split logic is fragile in the presence of lar ge index items

From
"Mikheev, Vadim"
Date:
> > While implementing multi-key btree-s for 6.1 I found problems with
> > duplicates handling and this is why extra logic was added. 
> > But I never was happy with this logic -:)
> 
> Do you not like the proposal I was suggesting?  I thought it 
> was pretty much what you said yourself a few months ago...

Do not add TID to key but store key anywhere in duplicate chain and
just read lefter child page while positioning index scan, as we do
right now for partial keys?

This will result in additional reads but I like it much more than
current "logic"... especially keeping in mind that we'll have to
implement redo/undo for btree very soon and this would be nice
to simplify things -:) But if you're going to change btree then
please do it asap - I hope to begin btree redo/undo implementation
in 2-3 days, just after heap...

Vadim


"Mikheev, Vadim" <vmikheev@SECTORBASE.COM> writes:
>> Do you not like the proposal I was suggesting?  I thought it 
>> was pretty much what you said yourself a few months ago...

> Do not add TID to key but store key anywhere in duplicate chain and
> just read lefter child page while positioning index scan, as we do
> right now for partial keys?

> This will result in additional reads but I like it much more than
> current "logic"...

Offhand it seems good to me too.  For the normal case where there are
many keys per page and not so many duplicates, an unneeded read should
be rare anyway.

I will need to study Lehman & Yao a little more to ensure they don't
have a problem with it, but if not I'll do it that way.  (I was
surprised to realize that Lehman is the same Phil Lehman I knew in
grad school ... in fact he was probably working on this paper when
I knew him.  Small world ain't it.)

> But if you're going to change btree then
> please do it asap - I hope to begin btree redo/undo implementation
> in 2-3 days, just after heap...

Slavedriver ;-) ... I'll see what I can do ...
        regards, tom lane


RE: btree split logic is fragile in the presence of lar ge index items

From
"Hiroshi Inoue"
Date:
> -----Original Message-----
> From: pgsql-hackers-owner@hub.org [mailto:pgsql-hackers-owner@hub.org]On
> Behalf Of Tom Lane
> 
> "Mikheev, Vadim" <vmikheev@SECTORBASE.COM> writes:
> >> Do you not like the proposal I was suggesting?  I thought it 
> >> was pretty much what you said yourself a few months ago...
> 
> > Do not add TID to key but store key anywhere in duplicate chain and
> > just read lefter child page while positioning index scan, as we do
> > right now for partial keys?
> 
> > This will result in additional reads but I like it much more than
> > current "logic"...
>

What about unique key insertions ?

Regards.

Hiroshi Inoue


"Hiroshi Inoue" <Inoue@tpf.co.jp> writes:
>>>> Do not add TID to key but store key anywhere in duplicate chain and
>>>> just read lefter child page while positioning index scan, as we do
>>>> right now for partial keys?
>> 
>>>> This will result in additional reads but I like it much more than
>>>> current "logic"...

> What about unique key insertions ?

What about 'em?  Doesn't seem to make any difference as far as I can
see.  You still need to be prepared to scan right to see all of the
potential matches.


I have been digging in the index code and am thinking of reinstating
another aspect of the older implementation.  Once upon a time, the code
was set up to treat the leftmost key on any non-leaf tree level as
minus-infinity.  That seems to me to agree with the data structure
Lehman and Yao have in mind: in their drawings, each down-link pointer
on a non-leaf level is "between" two keys, except that the leftmost
downlink has no key to its left.  Their drawings all show n+1 downlinks
in an n-key non-leaf node.  We didn't match that representation, so we
need to fake it with a dummy key associated with the first downlink.
The original code did that, but it got taken out at some point and
replaced with pretty ugly code to propagate minimum-key changes back up
the tree when the leftmost child node has to be split.  But I think the
only thing wrong with it was that not all the comparison routines knew
they needed to do that.  btree seems to be suffering from an abundance
of comparison routines ... I'm going to try to eliminate some of the
redundancy.

Actually, we could shave some storage by not storing a key in the
first data item of any non-leaf page, whether leftmost on its level
or not.  That would correspond exactly to L&Y's drawings.
        regards, tom lane