RE: btree split logic is fragile in the presence of lar ge index items - Mailing list pgsql-hackers

From Mikheev, Vadim
Subject RE: btree split logic is fragile in the presence of lar ge index items
Date
Msg-id 8F4C99C66D04D4118F580090272A7A23018C63@SECTORBASE1
Whole thread Raw
Responses Re: btree split logic is fragile in the presence of lar ge index items
Re: btree split logic is fragile in the presence of lar ge index items
List pgsql-hackers
> 1. The PG 4.2 code used an OID appended to the key to ensure that
> btree items are unique.  This fixes the Lehman and Yao algorithm's
> assumption of unique keys *for insertions*, since the item being
> inserted must have an OID different from existing items even if the
> key values are equal.  However there is still an issue for *lookups*,
> since we are necessarily doing a lookup with just a key value, and
> we want to find all index items with that same key value regardless
> of OID.

This is not right. OIDs were used *only* to find parent tuple in
_bt_getstackbuf, where only *per level* uniqueness is required.
I removed OIDs because of on any level there are no two (or more)
tuples pointing to the same place - i.e. TID may be used.
BTW, there were only single-key indices in Postgres-95 (and 4.2 too?) -
i.e. OID could not be used in key.

> 3. Awhile back Vadim removed the added-OID code and added a bunch of
> logic for explicit management of chains of duplicate keys.  In
> retrospect this change was probably a mistake.  For btree items that
> point to heap tuples we can use the tuple TID as tie-breaker (since
> an index should never contain two items with the same key and TID).
> Btree internal pages need an added field since they have to 
> consider the TID as part of the key (and the field that is the TID in a 
> leaf page is used as the down-link pointer in non-leaf pages).

While implementing multi-key btree-s for 6.1 I found problems with
duplicates handling and this is why extra logic was added. But I never
was happy with this logic -:)

Note that using TID as part of key would give us additional feature:
fast heap tuple --> index tuple look up. With this feature vacuum wouldn't
have to read entire index to delete a few items... and this will be required
for space re-using without vacuum...

Vadim


pgsql-hackers by date:

Previous
From: Vince Vielhaber
Date:
Subject: Re: TUPLE SIZE HELP
Next
From: Hannu Krosing
Date:
Subject: Re: TUPLE SIZE HELP