Re: Shared locking in slru.c - Mailing list pgsql-hackers

From Kenneth Marshall
Subject Re: Shared locking in slru.c
Date
Msg-id 20051130201839.GD3765@it.is.rice.edu
Whole thread Raw
In response to Shared locking in slru.c  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Wed, Nov 30, 2005 at 01:53:13PM -0500, Tom Lane wrote:
> I've been looking at various ways to resolve this, but one thing that
> seems promising is to hack slru.c to take the control lock in shared
> mode, not exclusive mode, for read-only accesses to pages that are
> already in memory.  The vast majority of accesses to pg_subtrans in the
> problem scenario are read-only, and so this reduces the need to block
> and the consequent CS storm.
> 
> A quick-hack prototype patch for this is attached.  (It changes only
> SubTransGetParent, but if applied of course TransactionIdGetStatus and
> so on would get the same treatment.)  The idea is that we take the lock
> in shared mode, look to see if the required page is in memory, and if so
> just return it; if not, release the lock, reacquire in exclusive mode,
> proceed with the existing code.
> 
> The reason that the existing code always takes the lock in exclusive
> mode, even if there's no intention to change data, is that it also needs
> to adjust the LRU information to reflect the page access, and the way
> that we're doing that requires exclusive access.  So to make the idea
> work at all, we need some alternative way of managing recency-of-use
> info.
> 
> The way the attached patch attacks this is for the shared-lock access
> case to simply set the page's LRU counter to zero, without bumping up
> the LRU counters of the other pages as the normal adjustment would do.
> This is safe to execute in a shared environment since even if someone
> else is concurrently touching the same page, they'll also be trying
> to set its counter to zero, and so there's no possibility of ending
> up in a bad state.  However, this leaves us with the likelihood that
> multiple pages will have equal LRU counters when it comes time to
> throw out a page to make room for another.  The patch deals with that
> by selecting the furthest-back page of those with the highest LRU
> setting.  I'm not totally happy with this heuristic, though, and was
> wondering if anyone had a better idea.  Anyone seen a lock-free
> data structure for LRU or approximately-LRU state tracking?
> 
>             regards, tom lane

Tom,

In this situation, if this code does not need the ability to queue
access to the control lock, it may be reasonable to use a latch process
to update the control information. In pseudo-code, the operations to
read the control information are:

WriteControl:
1. Set latch.
2. Update control information
3. Increment latch version number.
4. Unset latch.

ReadControl:
1. Read latch version number.
2. Read control information.
3. Check latch. If locked go to step 1.
4. Read latch version number. If the value is different from the  value read in step 1, go to step 1.

In this way, a read operation will only need two reads of the version
number and one of the shared data. A write operation will be very
lightweight. Setting the latch can be a single TAS/CAS operation and
unsetting the latch and incrementing the version number can be done
in a single write. Of course, if you need to be able to queue up lock
requests this will not work.

Ken


pgsql-hackers by date:

Previous
From: "Kevin Grittner"
Date:
Subject: Re: [ADMIN] ERROR: could not read block
Next
From: "Kevin Grittner"
Date:
Subject: Re: [ADMIN] ERROR: could not read block