Re: Improve search for missing parent downlinks in amcheck - Mailing list pgsql-hackers
From | Alexander Korotkov |
---|---|
Subject | Re: Improve search for missing parent downlinks in amcheck |
Date | |
Msg-id | CAPpHfduTWj00vPXTBfGd=apEiqyO3Hf+PEXgHC=6wDuJ8RBZHg@mail.gmail.com Whole thread Raw |
In response to | Re: Improve search for missing parent downlinks in amcheck (Peter Geoghegan <pg@bowt.ie>) |
Responses |
Re: Improve search for missing parent downlinks in amcheck
|
List | pgsql-hackers |
On Tue, Aug 13, 2019 at 11:44 PM Peter Geoghegan <pg@bowt.ie> wrote: > > In this revision check for missing downlinks is combined with > > bt_downlink_check(). So, pages visited by bt_downlink_check() patch > > doesn't cause extra accessed. It only causes following additional > > page accesses: > > 1) Downlinks corresponding to "negative infinity" keys, > > 2) Pages of child level, which are not referenced by downlinks. > > > > But I think these two kinds are very minority, and those accesses > > could be trade off with more precise missing downlink check without > > bloom filter. What do you think? > > I am generally in favor of making the downlink check absolutely > reliable, and am not too worried about the modest additional overhead. > After all, bt_index_parent_check() is supposed to be thorough though > expensive. The only reason that I used a Bloom filter for > fingerprinting downlink blocks was that it seemed important to quickly > get amcheck coverage for subtle multi-level page deletion bugs just > after v11 feature freeze. We can now come up with a better design for > that. Great! > I was confused about how this patch worked at first. But then I > remembered that Lehman and Yao consider downlinks to be distinct > things to separator keys in internal pages. The high key of an > internal page in the final separator key, so you have n downlinks and > n + 1 separator keys per internal page -- two distinct things that > appear in alternating order (the negative infinity item is not > considered to have any separator key here). So, while internal page > items are explicitly "(downlink, separator)" pairs that are grouped > into a single tuple in nbtree, with a separate tuple just for the high > key, Lehman and Yao would find it just as natural to treat them as > "(separator, downlink)" pairs. You have to skip the first downlink on > each internal level to make that work, but this makes our > bt_downlink_check() check have something from the target page (child's > parent page) that is like the high key in the child. > > It's already really confusing that we don't quite mean the same thing > as Lehman and Yao when we say downlink (See also my long "why a > highkey is never truly a copy of another item" comment block within > bt_target_page_check()), and that is not your patch's fault. But maybe > we need to fix that to make your patch easier to understand. (i.e. > maybe we need to go over every use of the word "downlink" in nbtree, > and make it say something more precise, to make everything less > confusing.) I agree that current terms nbtree use to describe downlinks and separator keys may be confusing. I'll try to fix this and come up with patch if succeed. > Other feedback: > > * Did you miss the opportunity to verify that every high key matches > its right sibling page's downlink tuple in parent page? We talked > about this already, but you don't seem to match the key (you only > match the downlink block). > > * You are decoupling the new check from the bt_index_parent_check() > "heapallindexed" option. That's probably a good thing, but you need to > update the sgml docs. > > * Didn't you forget to remove the BtreeCheckState.rightsplit flag? > > * You've significantly changed the behavior of bt_downlink_check() -- > I would note this in its header comments. This is where ~99% of the > new work happens. > > * I don't like that you use the loaded term "target" here -- anything > called "target" should be BtreeCheckState.target: > > > static void > > -bt_downlink_missing_check(BtreeCheckState *state) > > +bt_downlink_missing_check(BtreeCheckState *state, bool rightsplit, > > + BlockNumber targetblock, Page target) > > { > > If it's unclear what I mean, this old comment should make it clearer: > > /* > * State associated with verifying a B-Tree index > * > * target is the point of reference for a verification operation. > * > * Other B-Tree pages may be allocated, but those are always auxiliary (e.g., > * they are current target's child pages). Conceptually, problems are only > * ever found in the current target page (or for a particular heap tuple during > * heapallindexed verification). Each page found by verification's left/right, > * top/bottom scan becomes the target exactly once. > */ The revised patch seems to fix all of above. ------ Alexander Korotkov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Attachment
pgsql-hackers by date: