Re: WIP: dynahash replacement for buffer table - Mailing list pgsql-hackers
From | Ryan Johnson |
---|---|
Subject | Re: WIP: dynahash replacement for buffer table |
Date | |
Msg-id | 543FE52F.9050300@cs.utoronto.ca |
In response to | Re: WIP: dynahash replacement for buffer table (Robert Haas <robertmhaas@gmail.com>) |
List | pgsql-hackers |
On 16/10/2014 7:19 AM, Robert Haas wrote:
> On Thu, Oct 16, 2014 at 8:03 AM, Ryan Johnson
> <ryan.johnson@cs.utoronto.ca> wrote:
>> Why not use an RCU mechanism [1] and ditch the hazard pointers? Seems like
>> an ideal fit...
>>
>> In brief, RCU has the following requirements:
>>
>> - Read-heavy access pattern
>> - Writers must be able to make dead objects unreachable to new readers
>>   (easily done for most data structures)
>> - Writers must be able to mark dead objects in such a way that existing
>>   readers know to ignore their contents but can still traverse the data
>>   structure properly (usually straightforward)
>> - Readers must occasionally inform the system that they are not currently
>>   using any RCU-protected pointers (to allow resource reclamation)
> Have a look at http://lwn.net/Articles/573424/ and specifically the
> "URCU overview" section. Basically, that last requirement - that
> readers inform the system that they are not currently using any
> RCU-protected pointers - turns out to require either memory barriers
> or signals.
>
> All of the many techniques that have been developed in this area are
> merely minor variations on a very old theme: set some kind of flag
> variable in shared memory to let people know that you are reading a
> shared data structure, and clear it when you are done. Then, other
> people can figure out when it's safe to recycle memory that was
> previously part of that data structure.

Sure, but RCU has the key benefit of decoupling its machinery (esp. that
flag update) from the actual critical section(s) it protects. In a DBMS
setting, for example, notifying once per transaction or SQL statement would
do just fine. The notification can be much better than a simple flag: you
want to know whether the thread has ever quiesced since the last reclaim
cycle began, not whether it is currently quiesced (which it usually isn't).
In the implementation I use, a busy thread (i.e. one not about to go idle)
can "chain" its RCU "transactions."
In the common case, a chained quiesce call comes when the RCU epoch is not
trying to change, and the "flag update" degenerates to a simple load.
Further, the only time it's critical to have that memory barrier is if the
quiescing thread is about to go idle. Otherwise, missing a flag just imposes
a small delay on resource reclamation (and that's assuming the flag in
question even belonged to a straggler process).

How you implement epoch management, especially the handling of stragglers,
is the deciding factor in whether RCU works well. The early URCU techniques
were pretty terrible, and maybe general-purpose URCU is doomed to stay that
way, but in a DBMS core it can be done very cleanly and efficiently because
we can easily add the quiescent points at appropriate locations in the code.

> In Linux's RCU, the flag
> variable is "whether the process is currently scheduled on a CPU",
> which is obviously not workable from user-space. Lacking that, you
> need an explicit flag variable, which means you need memory barriers,
> since the protected operation is a load and the flag variable is
> updated via a store. You can try to avoid some of the overhead by
> updating the flag variable less often (say, when a signal arrives) or
> you can make it more fine-grained (in my case, we only prevent reclaim
> of a fraction of the data structure at a time, rather than all of it)
> or various other variants, but none of this is unfortunately so simple
> as "apply technique X and your problem just goes away".

Magic wand, no (does nothing for update contention, for example, and
requires some care to apply). But from a practical perspective RCU, properly
implemented, does make an awful lot of problems an awful lot simpler to
tackle. Especially for the readers.

Ryan
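
To make the chained-quiesce idea above concrete, here is a minimal sketch of
the bookkeeping in C11 atomics. It is not the implementation referred to in
the message, and every name in it (global_epoch, seen_epoch,
rcu_quiesce_chained, and so on) is hypothetical. It assumes a fixed number
of thread slots, ignores epoch wraparound and online/offline races, and its
memory-ordering choices are illustrative rather than verified.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define MAX_THREADS 4            /* assumption: fixed, small slot count */

/* Global epoch; a reclaimer bumps it to start a new reclaim cycle. */
static atomic_uint global_epoch = 1;

/* Per-thread slot: last epoch this thread observed at a quiescent
 * point. 0 means the thread is idle (zero-initialized at startup). */
static atomic_uint seen_epoch[MAX_THREADS];

/* Thread becomes active: record the current epoch. */
static void rcu_thread_online(int tid)
{
    assert(tid >= 0 && tid < MAX_THREADS);
    atomic_store(&seen_epoch[tid], atomic_load(&global_epoch));
}

/* Chained quiescent point, e.g. once per transaction or statement.
 * If the epoch has not moved, this is only loads and no store.
 * Returns true when a new epoch had to be published. */
static bool rcu_quiesce_chained(int tid)
{
    assert(tid >= 0 && tid < MAX_THREADS);
    unsigned g = atomic_load_explicit(&global_epoch, memory_order_acquire);
    unsigned mine = atomic_load_explicit(&seen_epoch[tid],
                                         memory_order_relaxed);
    if (g == mine)
        return false;            /* common case: nothing to publish */
    atomic_store_explicit(&seen_epoch[tid], g, memory_order_release);
    return true;
}

/* Thread is about to go idle: this is the store that must not be
 * missed, so use the default sequentially consistent ordering. */
static void rcu_thread_offline(int tid)
{
    assert(tid >= 0 && tid < MAX_THREADS);
    atomic_store(&seen_epoch[tid], 0);
}

/* Reclaimer: begin a cycle and return the epoch readers must reach. */
static unsigned rcu_begin_cycle(void)
{
    return atomic_fetch_add(&global_epoch, 1) + 1;
}

/* True once every non-idle thread has quiesced since the cycle began.
 * A straggler only delays reclamation; it never makes it unsafe. */
static bool rcu_cycle_complete(unsigned target)
{
    for (int i = 0; i < MAX_THREADS; i++) {
        unsigned s = atomic_load(&seen_epoch[i]);
        if (s != 0 && s < target)
            return false;
    }
    return true;
}
```

A backend would call rcu_quiesce_chained() at statement or transaction
boundaries and rcu_thread_offline() before blocking. The reclaimer calls
rcu_begin_cycle() after unlinking objects and frees them only once
rcu_cycle_complete() returns true for that epoch.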