Re: WIP: dynahash replacement for buffer table - Mailing list pgsql-hackers

From	Ryan Johnson
Subject	Re: WIP: dynahash replacement for buffer table
Date	October 16, 2014 15:33:24
Msg-id	543FE52F.9050300@cs.utoronto.ca
In response to	Re: WIP: dynahash replacement for buffer table (Robert Haas <robertmhaas@gmail.com>)
List	pgsql-hackers

On 16/10/2014 7:19 AM, Robert Haas wrote:
> On Thu, Oct 16, 2014 at 8:03 AM, Ryan Johnson
> <ryan.johnson@cs.utoronto.ca> wrote:
>> Why not use an RCU mechanism [1] and ditch the hazard pointers? Seems like
>> an ideal fit...
>>
>> In brief, RCU has the following requirements:
>>
>> Read-heavy access pattern
>> Writers must be able to make dead objects unreachable to new readers (easily
>> done for most data structures)
>> Writers must be able to mark dead objects in such a way that existing
>> readers know to ignore their contents but can still traverse the data
>> structure properly (usually straightforward)
>> Readers must occasionally inform the system that they are not currently
>> using any RCU-protected pointers (to allow resource reclamation)
> Have a look at http://lwn.net/Articles/573424/ and specifically the
> "URCU overview" section.  Basically, that last requirement - that
> readers inform the system that they are not currently using any
> RCU-protected pointers - turns out to require either memory barriers
> or signals.
> All of the many techniques that have been developed in this area are
> merely minor variations on a very old theme: set some kind of flag
> variable in shared memory to let people know that you are reading a
> shared data structure, and clear it when you are done.  Then, other
> people can figure out when it's safe to recycle memory that was
> previously part of that data structure.
Sure, but RCU has the key benefit of decoupling its machinery (esp. that 
flag update) from the actual critical section(s) it protects. In a DBMS 
setting, for example, once per transaction or SQL statement would do 
just fine. The notification can be much better than a simple flag---you 
want to know whether the thread has ever quiesced since the last reclaim 
cycle began, not whether it is currently quiesced (which it usually 
isn't). In the implementation I use, a busy thread (i.e. one not about to go 
idle) can "chain" its RCU "transactions." In the common case, a chained 
quiesce call comes when the RCU epoch is not trying to change, and the 
"flag update" degenerates to a simple load. Further, the only time it's 
critical to have that memory barrier is if the quiescing thread is about 
to go idle. Otherwise, missing a flag just imposes a small delay on 
resource reclamation (and that's assuming the flag in question even 
belonged to a straggler process). How you implement epoch management, 
especially the handling of stragglers, is the deciding factor in whether 
RCU works well. The early URCU techniques were pretty terrible, and 
maybe general-purpose URCU is doomed to stay that way, but in a DBMS 
core it can be done very cleanly and efficiently because we can easily 
add the quiescent points at appropriate locations in the code.
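
To make the epoch idea above concrete, here is a rough sketch in C11 atomics. It is not the implementation referred to above and not PostgreSQL code; every name in it (rcu_init, rcu_wake, rcu_quiesce, rcu_idle, rcu_begin_cycle, rcu_cycle_done, MAX_READERS, RCU_IDLE) is hypothetical, and a fixed-size reader array is assumed purely for brevity. It only illustrates three points: the reclaimer asks "has this reader quiesced since the cycle began", a chained quiesce in an unchanged epoch is just loads, and the full barrier is reserved for going idle and waking up.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_READERS 4             /* hypothetical fixed number of backends */
#define RCU_IDLE    UINT64_MAX    /* sentinel: reader holds no RCU pointers */

/* Global epoch; the reclaimer bumps it to begin a reclaim cycle. */
static _Atomic uint64_t global_epoch = 1;

/* Epoch each reader last announced at a quiescent point, or RCU_IDLE. */
static _Atomic uint64_t reader_epoch[MAX_READERS];

static void rcu_init(void)
{
    for (int i = 0; i < MAX_READERS; i++)
        atomic_store(&reader_epoch[i], RCU_IDLE);
}

/* Reader leaves the idle state. The store must be globally visible before
 * any protected pointer is loaded, hence the full fence. */
static void rcu_wake(int id)
{
    assert(id >= 0 && id < MAX_READERS);
    atomic_store(&reader_epoch[id], atomic_load(&global_epoch));
    atomic_thread_fence(memory_order_seq_cst);
}

/* Chained quiescent point for a busy reader, e.g. once per transaction or
 * statement. The caller must hold no RCU-protected pointers here. */
static void rcu_quiesce(int id)
{
    assert(id >= 0 && id < MAX_READERS);
    uint64_t e = atomic_load_explicit(&global_epoch, memory_order_acquire);
    if (atomic_load_explicit(&reader_epoch[id], memory_order_relaxed) == e)
        return;   /* common case: epoch unchanged, nothing but loads */
    /* A late-arriving store only delays reclamation; it is never unsafe. */
    atomic_store_explicit(&reader_epoch[id], e, memory_order_release);
}

/* Reader is about to go idle: publish that promptly so a sleeping process
 * never looks like a straggler. */
static void rcu_idle(int id)
{
    assert(id >= 0 && id < MAX_READERS);
    atomic_store(&reader_epoch[id], RCU_IDLE);
}

/* Reclaimer: call after unlinking dead objects. Returns the target epoch. */
static uint64_t rcu_begin_cycle(void)
{
    return atomic_fetch_add(&global_epoch, 1) + 1;
}

/* Reclaimer: true once every reader is idle or has quiesced since the cycle
 * began, i.e. objects unlinked before rcu_begin_cycle() can be freed. */
static bool rcu_cycle_done(uint64_t target)
{
    for (int i = 0; i < MAX_READERS; i++) {
        uint64_t seen = atomic_load(&reader_epoch[i]);
        if (seen != RCU_IDLE && seen < target)
            return false;   /* straggler: has not quiesced in this cycle */
    }
    return true;
}
```

A backend would call rcu_wake() when it starts work, rcu_quiesce() between transactions or statements, and rcu_idle() before blocking; the reclaimer polls rcu_cycle_done() and simply retries later if a straggler remains. Handling of stragglers that never reach a quiescent point is deliberately left out of this sketch.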

>   In Linux's RCU, the flag
> variable is "whether the process is currently scheduled on a CPU",
> which is obviously not workable from user-space.  Lacking that, you
> need an explicit flag variable, which means you need memory barriers,
> since the protected operation is a load and the flag variable is
> updated via a store.  You can try to avoid some of the overhead by
> updating the flag variable less often (say, when a signal arrives) or
> you can make it more fine-grained (in my case, we only prevent reclaim
> of a fraction of the data structure at a time, rather than all of it)
> or various other variants, but none of this is unfortunately so simple
> as "apply technique X and your problem just goes away".
Magic wand, no (does nothing for update contention, for example, and 
requires some care to apply). But from a practical perspective RCU, 
properly implemented, does make an awful lot of problems an awful lot 
simpler to tackle. Especially for the readers.

Ryan
