Re: Making all nbtree entries unique by having heap TIDs participatein comparisons - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: Making all nbtree entries unique by having heap TIDs participatein comparisons
Date
Msg-id a7e95343-ec6b-d204-63ec-27966a853a3f@iki.fi
Whole thread Raw
In response to Re: Making all nbtree entries unique by having heap TIDs participatein comparisons  (Peter Geoghegan <pg@bowt.ie>)
Responses Re: Making all nbtree entries unique by having heap TIDs participatein comparisons
List pgsql-hackers
On 16/03/2019 18:55, Peter Geoghegan wrote:
> On Sat, Mar 16, 2019 at 9:17 AM Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>> Then, a cosmic ray changes the 20 on the root page to 18. That causes
>> the the leaf tuple 19 to become non-re-findable; if you descend the for
>> 19, you'll incorrectly land on page 6. But it also causes the high key
>> on page 2 to be different from the downlink on the root page. Wouldn't
>> the existing checks catch this?
> 
> No, the existing checks will not check that. I suppose something
> closer to the existing approach *could* detect this issue, by making
> sure that the "seam of identical high keys" from the root to the leaf
> are a match, but we don't use the high key outside of its own page.
> Besides, there is something useful about having the code actually rely
> on _bt_search().
> 
> There are other similar cases, where it's not obvious how you can do
> verification without either 1) crossing multiple levels, or 2)
> retaining a "low key" as well as a high key (this is what Goetz Graefe
> calls "retaining fence keys to solve the cousin verification problem"
> in Modern B-Tree Techniques). What if the corruption was in the leaf
> page 6 from your example, which had a 20 at the start? We wouldn't
> have compared the downlink from the parent to the child, because leaf
> page 6 is the leftmost child, and so we only have "-inf". The lower
> bound actually comes from the root page, because we truncate "-inf"
> attributes during page splits, even though we don't have to. Most of
> the time they're not "absolute minus infinity" -- they're "minus
> infinity in this subtree".

AFAICS, there is a copy of every page's high key in its immediate 
parent. Either in the downlink of the right sibling, or as the high key 
of the parent page itself. Cross-checking those would catch any 
corruption in high keys.

Note that this would potentially catch some corruption that the 
descend-from-root would not. If you have a mismatch between the high key 
of a leaf page and its parent or grandparent, all the current tuples 
might be pass the rootdescend check. But a tuple might get inserted to 
wrong location later.

> Maybe you could actually do something with the high key from leaf page
> 5 to detect the stray value "20" in leaf page 6, but again, we don't
> do anything like that right now.

Hmm, yeah, to check for stray values, you could follow the left-link, 
get the high key of the left sibling, and compare against that.

- Heikki


pgsql-hackers by date:

Previous
From: Chapman Flack
Date:
Subject: Re: Fix XML handling with DOCTYPE
Next
From: Tom Lane
Date:
Subject: Re: Fix XML handling with DOCTYPE