Thread: Removing BTScanPosUnpinIfPinned idiom from nbtree, simplifying mark/restore support
Removing BTScanPosUnpinIfPinned idiom from nbtree, simplifying mark/restore support
From
Peter Geoghegan
Date:
As already discussed on another nearby thread [1], I think that we should get rid of nbtree's BTScanPosUnpinIfPinned idiom (which originated in 2015 commit 2ed5b87f), since it is rather brittle. Recent experience from today's bugfix commit 7c319f54 made that pretty clear. Attached patch gets rid of nbtree's use of BTScanPosUnpinIfPinned. Rather than deciding whether or not to unpin a saved position's buffer on the basis of it having a buffer set at all (at any point where we invalidate the buffer), the patch has us condition everything on the scan-level so->dropPin field. This so->dropPin field was added by my recent commit e6eed40e; you can think of this patch as finishing off the work started in that commit. I'll add this patch to the Postgres 19 cycle's first commitfest. More on the underlying motivation: Removing BTScanPosUnpinIfPinned allows me to significantly simplify the management of buffer pins with mark/restore. The patch also gets rid of all of the calls to IncrBufferRefCount made from nbtree, since it's no longer necessary to support a so->markPos representation of a mark that needs its own pin, independent of the pin held by/for so->currPos (if so->markPos needs its own pin it'll be because it has its own page). In other words, with the patch, there is exactly one way of representing that there's a mark held on the current page. We still need multiple pins (one on currPos, another on markPos) during !so->dropPin scans, but now that only happens when they are on different pages. In other other words, the only valid representation of a mark held on the page currently in so->currPos is the "so->markItemIndex" field (we're now consistent about that). I also believe that this simplified approach to holding buffer pins in mark/restore will facilitate better nbtree designs. Having "exactly one way of representing that there's a mark held on the current page" helps with designs that do more heap prefetching by reading multiple leaf pages, before earlier pages have performed all of their required heap page accesses. I'm thinking of new designs that maintain multiple leaf page details/returned items within one big currPos-like position (the current so->currPos just maintains a single leaf page). The main thing that makes that difficult right now is the complex rules around buffer pins, particularly with mark/restore, which is generally a very hard code path to test. [1] https://postgr.es/m/CAH2-Wzm3a8i31aBXi6oQt9uG7m1-xpX-MXjMMYoDJH=sBj1jnA@mail.gmail.com -- Peter Geoghegan