Re: BufferAlloc: don't take two simultaneous locks - Mailing list pgsql-hackers
From | Kyotaro Horiguchi |
---|---|
Subject | Re: BufferAlloc: don't take two simultaneous locks |
Date | |
Msg-id | 20220419.104515.841299122302516369.horikyota.ntt@gmail.com Whole thread Raw |
In response to | Re: BufferAlloc: don't take two simultaneous locks (Robert Haas <robertmhaas@gmail.com>) |
List | pgsql-hackers |
At Mon, 18 Apr 2022 09:53:42 -0400, Robert Haas <robertmhaas@gmail.com> wrote in > On Fri, Apr 15, 2022 at 4:29 AM Kyotaro Horiguchi > <horikyota.ntt@gmail.com> wrote: > > The patch removes buftable entry frist then either inserted again or > > returned to freelist. I don't understand how it can be in both > > buftable and freelist.. What kind of trouble do you have in mind for > > example? > > I'm not sure. I'm just thinking about potential dangers. I was more > worried about it ending up in neither place. I think that is more likely to happen. But I think that that can happen also by the current code if it had exits on the way. And the patch does not add a new exit. > > So, does this get progressed if someone (maybe Yura?) runs a > > benchmarking with this method? > > I think we're talking about theoretical concerns about safety here, > and you can't resolve that by benchmarking. Tom or others may have a Yeah.. I didn't mean that benchmarking resolves the concerns. I meant that if benchmarking shows that the safer (or cleaner) way give sufficient gain, we can take that direction. > different view, but IMHO the issue with this patch isn't that there > are no performance benefits, but that the patch needs to be fully > safe. He and I may disagree on how likely it is that it can be made > safe, but it can be a million times faster and if it's not safe it's > still dead. Right. > Something clearly needs to be done to plug the specific problem that I > mentioned earlier, somehow making it so we never need to grow the hash > table at runtime. If anyone can think of other such hazards those also > need to be fixed. - Running out of buffer mapping entries? It seems to me related to "runtime growth of the table mapping hash table". Does the runtime growth of the hash mean that get_hash_entry may call element_alloc even if the hash is created with a sufficient number of elements? If so, it's not the fault of this patch. We can search all freelists before asking element_alloc() (maybe) in exchange of some extent of potential temporary degradation. That being said, I don't think it's good that we call element_alloc for shared hashes after creation. - Is the collision handling correct that just returning the victimized buffer to freelist? Potentially the patch can over-vicitimzie buffers up to max_connections-1. Is this what you are concerned about? A way to preveint over-victimization was raised upthread, that is, we insert a special table mapping entry that signals "this page is going to be available soon." before releasing newPartitionLock. This prevents over-vicitimaztion. - Doesn't buffer-leak or duplicate mapping happen? This patch does not change the order of the required steps, and there's no exit on the way (if the current code doesn't have.). No two processes victimize the same buffer since the victimizing steps are protected by oldPartitionLock (and header lock) same as the current code, and no two processes insert different buffers for the same page since the inserting steps are protected by newPartitionLock. No vicitmized buffer gets orphaned *if* that doesn't happen by the current code. So *I* am at loss how *I* can make it clear that they don't happenX( (Of course Yura might think differently.) regards. -- Kyotaro Horiguchi NTT Open Source Software Center
pgsql-hackers by date: