Re: WIP: dynahash replacement for buffer table - Mailing list pgsql-hackers
From: Robert Haas
Subject: Re: WIP: dynahash replacement for buffer table
Msg-id: CA+TgmobKOF=gV2ue5p14iAJj0QuQdL+eBLXsM8+rogMjABmG8A@mail.gmail.com
In response to: Re: WIP: dynahash replacement for buffer table (Andres Freund <andres@2ndquadrant.com>)
Responses: Re: WIP: dynahash replacement for buffer table
List: pgsql-hackers
On Tue, Oct 14, 2014 at 10:47 AM, Andres Freund <andres@2ndquadrant.com> wrote:
> On 2014-10-14 09:30:58 -0400, Robert Haas wrote:
>> I took the basic infrastructure from before and used it to replace the
>> buffer table. Patch is attached.
>
> Hm. Unless I missed something you pretty much completely eradicated
> locks from the buffer mapping code? That'd be pretty cool, but it's also
> scary.
>
> How confident are you that your conversion is correct? Not in the sense
> that there aren't any buglets, but fundamentally.

It doesn't look particularly dangerous to me. Famous last words.

Basically, I think what we're doing right now is holding the buffer mapping lock so that the buffer can't be renamed under us while we're pinning it. If we don't do that, I think the only consequence is that, by the time we get the buffer pin, the buffer might no longer be the one we looked up. So we need to recheck. But assuming we do that, I don't see an issue.

In fact, it occurred to me while I was cobbling this together that we might want to experiment with that change independently of chash. We already know (from the StrategyGetBuffer locking changes) that holding system-wide locks to prevent people from twaddling the state of individual buffers can be a loser. This case isn't nearly as bad, because we're only pinning one buffer, rather than potentially all of them multiple times, and the lock we're holding only affects 1/128th of the buffers, not all of them.

The other application I had in mind for chash was SLRU stuff. I haven't tried to write the code yet, but fundamentally, the same considerations apply there. You do the lookup locklessly to find out which buffer is thought to contain the SLRU page you want, but by the time you lock the page the answer might be out of date. Assuming that this doesn't happen many times in a row and that lookups are relatively cheap, you could get rid of having any centralized lock for SLRU, and just have per-page locks.
I'm not quite sure what fundamental dangers you're thinking about here, but from what I understand from reading the literature, doing lookups in a lock-free hash table needs to be thought about in a sort of relativistic way. The different CPUs have slightly different views of the world, so any answer you get may be obsolete by the time you get it, and may, according to some other CPU's view of the world, have been obsolete even at the time it was copied. That's OK, because it's just equivalent to some other CPU doing its hash table update slightly sooner or later than it actually did, within the bounds imposed by memory barriers, which it could have done anyway. There is no one source of truth.

The result of all this is that some of the synchronization responsibility is transferred to the caller. That's obviously a bit more complex to reason about, but it hasn't stopped lock-free hash tables from being wildly popular data structures.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company