Re: Improve search for missing parent downlinks in amcheck - Mailing list pgsql-hackers
From | Peter Geoghegan |
---|---|
Subject | Re: Improve search for missing parent downlinks in amcheck |
Date | |
Msg-id | CAH2-WzmRtYu-HXEb0o_zMzaHWbft7Lv=F25B=Cba200hQcsm3w@mail.gmail.com Whole thread Raw |
In response to | Re: Improve search for missing parent downlinks in amcheck (Alexander Korotkov <a.korotkov@postgrespro.ru>) |
Responses |
Re: Improve search for missing parent downlinks in amcheck
|
List | pgsql-hackers |
On Mon, Aug 12, 2019 at 12:01 PM Alexander Korotkov <a.korotkov@postgrespro.ru> wrote: > BTW, there is next revision of patch I'm proposing for v13. Cool. > In this revision check for missing downlinks is combined with > bt_downlink_check(). So, pages visited by bt_downlink_check() patch > doesn't cause extra accessed. It only causes following additional > page accesses: > 1) Downlinks corresponding to "negative infinity" keys, > 2) Pages of child level, which are not referenced by downlinks. > > But I think these two kinds are very minority, and those accesses > could be trade off with more precise missing downlink check without > bloom filter. What do you think? I am generally in favor of making the downlink check absolutely reliable, and am not too worried about the modest additional overhead. After all, bt_index_parent_check() is supposed to be thorough though expensive. The only reason that I used a Bloom filter for fingerprinting downlink blocks was that it seemed important to quickly get amcheck coverage for subtle multi-level page deletion bugs just after v11 feature freeze. We can now come up with a better design for that. I was confused about how this patch worked at first. But then I remembered that Lehman and Yao consider downlinks to be distinct things to separator keys in internal pages. The high key of an internal page in the final separator key, so you have n downlinks and n + 1 separator keys per internal page -- two distinct things that appear in alternating order (the negative infinity item is not considered to have any separator key here). So, while internal page items are explicitly "(downlink, separator)" pairs that are grouped into a single tuple in nbtree, with a separate tuple just for the high key, Lehman and Yao would find it just as natural to treat them as "(separator, downlink)" pairs. You have to skip the first downlink on each internal level to make that work, but this makes our bt_downlink_check() check have something from the target page (child's parent page) that is like the high key in the child. It's already really confusing that we don't quite mean the same thing as Lehman and Yao when we say downlink (See also my long "why a highkey is never truly a copy of another item" comment block within bt_target_page_check()), and that is not your patch's fault. But maybe we need to fix that to make your patch easier to understand. (i.e. maybe we need to go over every use of the word "downlink" in nbtree, and make it say something more precise, to make everything less confusing.) Other feedback: * Did you miss the opportunity to verify that every high key matches its right sibling page's downlink tuple in parent page? We talked about this already, but you don't seem to match the key (you only match the downlink block). * You are decoupling the new check from the bt_index_parent_check() "heapallindexed" option. That's probably a good thing, but you need to update the sgml docs. * Didn't you forget to remove the BtreeCheckState.rightsplit flag? * You've significantly changed the behavior of bt_downlink_check() -- I would note this in its header comments. This is where ~99% of the new work happens. * I don't like that you use the loaded term "target" here -- anything called "target" should be BtreeCheckState.target: > static void > -bt_downlink_missing_check(BtreeCheckState *state) > +bt_downlink_missing_check(BtreeCheckState *state, bool rightsplit, > + BlockNumber targetblock, Page target) > { If it's unclear what I mean, this old comment should make it clearer: /* * State associated with verifying a B-Tree index * * target is the point of reference for a verification operation. * * Other B-Tree pages may be allocated, but those are always auxiliary (e.g., * they are current target's child pages). Conceptually, problems are only * ever found in the current target page (or for a particular heap tuple during * heapallindexed verification). Each page found by verification's left/right, * top/bottom scan becomes the target exactly once. */ -- Peter Geoghegan
pgsql-hackers by date: