Re: BufferAlloc: don't take two simultaneous locks - Mailing list pgsql-hackers
From | Kyotaro Horiguchi |
---|---|
Subject | Re: BufferAlloc: don't take two simultaneous locks |
Date | |
Msg-id | 20220217.141647.512059035403445205.horikyota.ntt@gmail.com Whole thread Raw |
In response to | Re: BufferAlloc: don't take two simultaneous locks (Yura Sokolov <y.sokolov@postgrespro.ru>) |
List | pgsql-hackers |
At Wed, 16 Feb 2022 10:40:56 +0300, Yura Sokolov <y.sokolov@postgrespro.ru> wrote in > Hello, all. > > I thought about patch simplification, and tested version > without BufTable and dynahash api change at all. > > It performs suprisingly well. It is just a bit worse > than v1 since there is more contention around dynahash's > freelist, but most of improvement remains. > > I'll finish benchmarking and will attach graphs with > next message. Patch is attached here. Thanks for the new patch. The patch as a whole looks fine to me. But some comments needs to be revised. (existing comments) > * To change the association of a valid buffer, we'll need to have > * exclusive lock on both the old and new mapping partitions. ... > * Somebody could have pinned or re-dirtied the buffer while we were > * doing the I/O and making the new hashtable entry. If so, we can't > * recycle this buffer; we must undo everything we've done and start > * over with a new victim buffer. We no longer take a lock on the new partition and have no new hash entry (if others have not yet done) at this point. + * Clear out the buffer's tag and flags. We must do this to ensure that + * linear scans of the buffer array don't think the buffer is valid. We The reason we can clear out the tag is it's safe to use the victim buffer at this point. This comment needs to mention that reason. + * + * Since we are single pinner, there should no be PIN_COUNT_WAITER or + * IO_IN_PROGRESS (flags that were not cleared in previous code). + */ + Assert((oldFlags & (BM_PIN_COUNT_WAITER | BM_IO_IN_PROGRESS)) == 0); It seems like to be a test for potential bugs in other functions. As the comment is saying, we are sure that no other processes are pinning the buffer and the existing code doesn't seem to be care about that condition. Is it really needed? + /* + * Try to make a hashtable entry for the buffer under its new tag. This + * could fail because while we were writing someone else allocated another The most significant point of this patch is the reason that the victim buffer is protected from stealing until it is set up for new tag. I think we need an explanation about the protection here. + * buffer for the same block we want to read in. Note that we have not yet + * removed the hashtable entry for the old tag. Since we have removed the hash table entry for the old tag at this point, the comment got wrong. + * the first place. First, give up the buffer we were planning to use + * and put it to free lists. .. + StrategyFreeBuffer(buf); This is one downside of this patch. But it seems to me that the odds are low that many buffers are freed in a short time by this logic. By the way it would be better if the sentence starts with "First" has a separate comment section. (existing comment) | * Okay, it's finally safe to rename the buffer. We don't "rename" the buffer here. And the safety is already establishsed at the end of the oldPartitionLock section. So it would be just something like "Now allocate the victim buffer for the new tag"? regards. -- Kyotaro Horiguchi NTT Open Source Software Center
pgsql-hackers by date: