Re: Clock sweep not caching enough B-Tree leaf pages? - Mailing list pgsql-hackers

From Jim Nasby
Subject Re: Clock sweep not caching enough B-Tree leaf pages?
Date
Msg-id 552D97D4.7010600@BlueTreble.com
In response to Re: Clock sweep not caching enough B-Tree leaf pages?  (Peter Geoghegan <pg@heroku.com>)
Responses Re: Clock sweep not caching enough B-Tree leaf pages?
List pgsql-hackers
On 4/14/15 5:22 PM, Peter Geoghegan wrote:
> As long as we're doing random brainstorming, I'd suggest looking at
> making clocksweep actually approximate LRU-K/LRU-2 (which, again, to
> be clear, my prototype did not do). The clocksweep could maintain
> statistics about the recency of the second-to-last access across all
> buffers, and discriminate against buffers according to what bucket of
> the population they fit in to. Not sure how aggressively we'd penalize
> those buffers that had very old penultimate references (or credit
> those that had very recent penultimate references), or what the bucket
> partitioning scheme is, but that's probably where I'd take it next.
> For example, buffers with a penultimate reference that is more than a
> standard deviation below the mean would be double penalized (and maybe
> the opposite, for those buffers with penultimate accesses a stddev
> above the mean). If that didn't work so well, then I'd look into an
> ARC style recency and frequency list (while remembering things about
> already evicted blocks, which LRU-K does not do....although that paper
> is from the early 1990s).
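
For concreteness, I imagine the bucketing Peter describes looking very 
roughly like the sketch below. Everything here is invented for 
illustration (the penultimate/last access fields and the population 
mean/stddev do not exist anywhere in bufmgr today):

#include <stdint.h>

/*
 * Hypothetical per-buffer bookkeeping for an LRU-2-ish clock sweep:
 * track the second-to-last ("penultimate") access as well as the last.
 */
typedef struct SketchBufferDesc
{
    uint32_t    usage_count;            /* existing clock-sweep counter */
    uint64_t    last_access;            /* most recent access, coarse ticks */
    uint64_t    penultimate_access;     /* second-to-last access: the LRU-2 signal */
} SketchBufferDesc;

/*
 * How much should the sweep decrement this buffer's usage_count, given
 * running statistics (mean/stddev of penultimate_access across all
 * buffers)?  Buffers whose penultimate reference is more than a stddev
 * older than the mean are doubly penalized; those more than a stddev
 * newer than the mean are skipped (credited) this pass.
 */
uint32_t
sweep_decrement(const SketchBufferDesc *buf, double mean, double stddev)
{
    double      age = (double) buf->penultimate_access;

    if (age < mean - stddev)
        return 2;               /* unusually old second-to-last access */
    if (age > mean + stddev)
        return 0;               /* unusually recent: credit it */
    return 1;                   /* ordinary decrement */
}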

Along the lines of brainstorming... why do we even allow usage_count > 
1? Clock sweep was used pretty successfully by at least FreeBSD, but 
they simply used a bit to indicate recently used. Anything that wasn't 
recently used moved from the active pool to the inactive pool (which 
tended to be far larger than the active pool with decent amounts of 
memory), and a small number of buffers were kept on the 'free' list by 
pulling them out of the inactive pool and writing them out if they were 
dirty. All of this was done on an LRU basis.
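
A stripped-down sketch of that scheme, with all names invented and a 
flat array standing in for what FreeBSD actually keeps as LRU-ordered 
queues:

#include <stdbool.h>
#include <stddef.h>

#define NPAGES 1024

typedef enum { POOL_ACTIVE, POOL_INACTIVE, POOL_FREE } Pool;

typedef struct SketchPage
{
    bool        referenced;     /* set on access, cleared by the scan */
    bool        dirty;
    Pool        pool;
} SketchPage;

static SketchPage pages[NPAGES];

/*
 * One pass over the active pool: pages referenced since the last pass
 * get their bit cleared and stay; everything else is demoted to the
 * (typically much larger) inactive pool.
 */
void
scan_active(void)
{
    for (size_t i = 0; i < NPAGES; i++)
    {
        SketchPage *p = &pages[i];

        if (p->pool != POOL_ACTIVE)
            continue;
        if (p->referenced)
            p->referenced = false;      /* second chance */
        else
            p->pool = POOL_INACTIVE;
    }
}

/*
 * Keep a small free pool topped up by reclaiming from the inactive
 * pool, writing back dirty pages on the way out.
 */
void
refill_free(size_t target)
{
    size_t      nfree = 0;

    for (size_t i = 0; i < NPAGES; i++)
        if (pages[i].pool == POOL_FREE)
            nfree++;

    for (size_t i = 0; i < NPAGES && nfree < target; i++)
    {
        SketchPage *p = &pages[i];

        if (p->pool != POOL_INACTIVE)
            continue;
        if (p->dirty)
            p->dirty = false;           /* write_back(p) would go here */
        p->pool = POOL_FREE;
        nfree++;
    }
}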

Given how common it is for the vast bulk of shared_buffers in an 
install to be stuck at a usage_count of 5, I'd think the first thing we 
should try is a combination of greatly reducing the maximum usage_count 
(maybe to 2 instead of 1, to simulate two pools) and running the clock 
sweep a lot more aggressively in a background process.
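
Very roughly, something like this (illustrative names only; today's cap 
is 5):

#include <stdbool.h>
#include <stdint.h>

#define USAGE_COUNT_CAP 2       /* instead of the current maximum of 5 */

typedef struct SketchBufState
{
    uint32_t    usage_count;
    bool        pinned;
} SketchBufState;

/* On access/pin: bump usage_count, but never past the (now small) cap. */
void
sketch_bump_usage(SketchBufState *buf)
{
    if (buf->usage_count < USAGE_COUNT_CAP)
        buf->usage_count++;
}

/*
 * One clock-sweep step, which a background process could run well ahead
 * of demand.  Returns true when the buffer can be evicted/reused.
 */
bool
sketch_sweep_step(SketchBufState *buf)
{
    if (buf->pinned)
        return false;
    if (buf->usage_count > 0)
    {
        buf->usage_count--;     /* one more trip around the clock */
        return false;
    }
    return true;
}
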
-- 
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


