Re: Deleting older versions in unique indexes to avoid page splits - Mailing list pgsql-hackers
From | Peter Geoghegan |
---|---|
Subject | Re: Deleting older versions in unique indexes to avoid page splits |
Date | |
Msg-id | CAH2-WzkHF1VzZOOw2N-pc1q4kdeH7+62SwdqvbCVHeiB_radwg@mail.gmail.com |
In response to | Re: Deleting older versions in unique indexes to avoid page splits (Andy Fan <zhihuifan1213@163.com>) |
List | pgsql-hackers |
Hi Andy,

On Thu, Nov 7, 2024 at 3:05 AM Andy Fan <zhihuifan1213@163.com> wrote:
> So my questions are: (a) How does the "logically unchanged index" hint
> can be helpful for this purpose?

It's the main trigger for bottom-up index deletion. It is taken as a signal that the leaf page might have garbage duplicates from non-HOT updates.

What the hint *actually* indicates is that the incoming index tuple is a new, unchanged version duplicate. But it's reasonable to guess that this will also be true of existing tuples located on the same leaf page as the one that the new, incoming tuple will go on. It's especially worth making that guess when the only alternative is to split the page (most of the time the hint isn't used at all, since the incoming item already fits on its leaf page).

> (b). What does the latestRemovedXid
> means, and which variable in code is used for this purpose. I searched
> "latestRemovedXid" but nothing is found.

That's because the symbol name changed a year or two after this commit went in. It is now snapshotConflictHorizon, but it has the same purpose as before (to generate conflicts on hot standbys when needed).

The commit message points out that simple deletion inherently requires that we visit the heap to generate this snapshotConflictHorizon value for the WAL record. And so the cost of considering whether we can delete additional index tuples (not just those index tuples already marked LP_DEAD in the index leaf page) is very low. We were already visiting the heap pages that are used to check the "extra" index tuples (if we weren't then those index tuples wouldn't be checked at all).

In short, since the *added* cost of checking extra related index tuples is very low, it doesn't really matter if there are few or no extra tuples that actually turn out to be deletable. We can be aggressive because the cost of being wrong is hardly noticeable at all.

> (c) What is the relationship
> between a and b. I mean if we *have to visit" the same table blocks (in
> the case of index-split?), then the IO-cost is paid anyway, do we still
> need the "logically unchanged index hint"?

Well, the commit message was written at a time when the only form of deletion was deletion of index tuples marked LP_DEAD (now called simple deletion). Prior to this work, we didn't attempt to delete any extra tuples in passing during simple deletion. It is easy to justify checking those extra index tuples/batching up work in that context.

It is a little harder to justify bottom-up index deletion (or was at the time), because it will do work that is totally speculative -- the entire process might not even delete a single index tuple. It is important that we give up quickly when it isn't possible to delete a single index tuple. We're better off falling back on nbtree index deduplication. That way another bottom-up index deletion pass over the same leaf page may succeed in the future (or may never be required).

> At last, appreciated for your effort on making this part much better!

Thanks
--
Peter Geoghegan
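
The ordering of fallbacks described in the message can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the nbtree implementation: every name here (IndexTuple, LeafPage, make_room, heap_dead, dedup_savings) is hypothetical, the heap visit is reduced to a boolean flag, and deduplication is reduced to a fixed number of reclaimable bytes. It shows the hint being ignored when the tuple already fits, simple deletion picking up "extra" tuples on heap blocks it had to visit anyway, and bottom-up deletion giving up in favor of deduplication when it finds nothing.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class IndexTuple:
    size: int
    heap_block: int
    lp_dead: bool = False    # already marked LP_DEAD on the leaf page
    heap_dead: bool = False  # toy stand-in: a heap visit would find this version dead


@dataclass
class LeafPage:
    capacity: int
    tuples: List[IndexTuple] = field(default_factory=list)
    dedup_savings: int = 0   # toy stand-in: space deduplication could reclaim

    def free_space(self) -> int:
        return self.capacity - sum(t.size for t in self.tuples)


def simple_deletion(page: LeafPage) -> None:
    # Heap blocks that must be visited anyway for the LP_DEAD tuples.
    visited = {t.heap_block for t in page.tuples if t.lp_dead}
    # Delete LP_DEAD tuples, plus "extra" dead tuples on those same heap blocks.
    page.tuples = [
        t for t in page.tuples
        if not (t.lp_dead or (t.heap_block in visited and t.heap_dead))
    ]


def bottom_up_deletion(page: LeafPage) -> bool:
    # Speculative work: it may turn out that nothing can be deleted.
    before = len(page.tuples)
    page.tuples = [t for t in page.tuples if not t.heap_dead]
    return len(page.tuples) < before


def make_room(page: LeafPage, needed: int, index_unchanged: bool) -> str:
    """Return the name of the step that made room for the incoming tuple."""
    if page.free_space() >= needed:
        return "fits"  # common case: the hint is never consulted
    if any(t.lp_dead for t in page.tuples):
        simple_deletion(page)
        if page.free_space() >= needed:
            return "simple deletion"
    if index_unchanged and bottom_up_deletion(page):
        if page.free_space() >= needed:
            return "bottom-up deletion"
    # Fall back on deduplication before resorting to a page split.
    if page.free_space() + page.dedup_savings >= needed:
        return "deduplication"
    return "page split"
```

In this toy, calling make_room() with index_unchanged=False on a full page that has no LP_DEAD tuples skips the speculative pass entirely, which mirrors the point that the hint only matters when the alternative is a page split.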