Re: Optimize single tuple fetch from nbtree index - Mailing list pgsql-hackers
From | Peter Geoghegan |
---|---|
Subject | Re: Optimize single tuple fetch from nbtree index |
Date | |
Msg-id | CAH2-Wz=m674-RKQdCG+jCD9QGzN1Kcg-FOdYw4-j+5_PfcHbpQ@mail.gmail.com Whole thread Raw |
In response to | Re: Optimize single tuple fetch from nbtree index (Anastasia Lubennikova <a.lubennikova@postgrespro.ru>) |
List | pgsql-hackers |
On Fri, Aug 23, 2019 at 10:14 AM Anastasia Lubennikova <a.lubennikova@postgrespro.ru> wrote: > Though, I got interested in the comment inconsistency you have found. > I added debug message into this code branch of the patch and was able to > see it in regression.diffs after 'make check': > Speaking of your patch, it seems that the buffer was unpinned and pinned > again between two reads, > and the condition of holding it continuously has not been met. See commit 2ed5b87f. The code is supposed to do that, but it might do it more often than is truly necessary. We don't want to block VACUUM by holding a buffer pin for a very long time, which is theoretically possible here. Do you think that it is actually unnecessary here? In other words, do you think that we can fix this without breaking cases that commit 2ed5b87f cares about? I have been suspicious of this commit all along. For example, I noticed that it can cause the kill_prior_tuple mechanism to be ineffective in a way that didn't happen prior to Postgres 9.5: https://postgr.es/m/CAH2-Wz=SfAKVMv1x9Jh19EJ8am8TZn9f-yECipS9HrrRqSswnA@mail.gmail.com That particular complaint was never addressed. I meant to do more on commit 2ed5b87f. > I didn't dig into the code, but this line looks suspicious (see my > findings about BTScanPosIsPinned below): > > /* bump pin on current buffer for assignment to mark buffer */ > if (BTScanPosIsPinned(so->currPos)) > IncrBufferRefCount(so->currPos.buf); > > > While reading the code to answer your question, I noticed that > BTScanPosIsPinned macro name is misleading. > It calls BufferIsValid(), not BufferIsPinned() as one could expect. > And BufferIsValid in bufmgr.h comment explicitly states that it > shouldn't be confused with BufferIsPinned. > The same goes for BTScanPosUnpinIfPinned(). I have always hated this macro. I think that code like the specific code you quoted might be correct, kind of, but it looks like the author was trying to change as little as possible about the code as it existed in 2015, rather than changing things so that everything made sense. It looks like a messy accretion. Let me see if I can get it straight: We're incrementing the ref count on the buffer if and only if it is pinned (by which we mean valid), though only when the scan is valid (which is not the same as pinned). Whether or not we increment the count of a valid scan doesn't affect anything else we do (i.e. we still restore a marked position either way). This is just awful. > I propose that we update BTScanPosIsPinned macro. Or, at least write a > comment, why its current behavior is fine. > There are a few existing callers, that are definitely expecting that > this macro checks a pin, which it doesn't do. > I don't quite understand if that already causes any subtle bug, or the > current algorithm is fine. I think that you're right -- at a minimum, this requires more documentation. This code is a few years old, but I still wouldn't be surprised if it turned out to be slightly wrong in a way that was important. We still have no way of detecting if a buffer is accessed without a pin. There have been numerous bugs like that before. (We have talked about teaching Valgrind to detect the case, but that never actually happened.) -- Peter Geoghegan
pgsql-hackers by date: