Re: Design notes for BufMgrLock rewrite - Mailing list pgsql-hackers
From | Jim C. Nasby |
---|---|
Subject | Re: Design notes for BufMgrLock rewrite |
Date | |
Msg-id | 20050216172009.GR52357@decibel.org |
In response to | Re: Design notes for BufMgrLock rewrite (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses | Re: Design notes for BufMgrLock rewrite |
List | pgsql-hackers |
On Sun, Feb 13, 2005 at 06:56:47PM -0500, Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > Tom Lane wrote:
> >> One thing I realized quickly is that there is no natural way in a clock
> >> algorithm to discourage VACUUM from blowing out the cache. I came up
> >> with a slightly ugly idea that's described below. Can anyone do better?
> >
> > Uh, is the clock algorithm also sequential-scan proof? Is that
> > something that needs to be done too?
>
> If you can think of a way. I don't see any way to make the algorithm
> itself scan-proof, but if we modified the bufmgr API to tell ReadBuffer
> (or better ReleaseBuffer) that a request came from a seqscan, we could
> do the same thing as for VACUUM. Whether that's good enough isn't
> clear --- for one thing it would kick up the contention for the
> BufFreelistLock, and for another it might mean *too* short a lifetime
> for blocks fetched by seqscan.

Is there anything (in the buffer headers?) that keeps track of buffer
access frequency? *BSD uses a mechanism to track roughly how often a page
in memory has been accessed, and uses that to determine which pages to
free. In 4.3BSD, a simple two-handed clock sweep is used: the first hand
sets a not-used bit in each page; the second hand (which sweeps a fixed
distance behind the first hand) checks this bit and, if it's still set,
moves the page either to the inactive list if it's dirty, or to the cache
list if it's clean. There is also a free list, which is generally fed by
the cache and inactive lists.

PostgreSQL has a big advantage over an OS though, in that it can tolerate
much more overhead in buffer access code than an OS can in its VM system.
If I understand correctly, any use of a buffer at all means a lock needs
to be acquired on its buffer header. As part of this access, a counter
could be incremented with very little additional cost. A background
process would then sweep through 'active' buffers, decrementing this
counter by some amount. Any buffer whose counter was decremented below 0
would be considered inactive, and a candidate for being freed.

The advantage of using a counter instead of a simple active bit is that
buffers that are (or have been) used heavily will be able to go through
several sweeps of the clock before being freed. Infrequently used buffers
(such as those from a vacuum or seq. scan) would get marked as inactive
the first time they were hit by the clock hand.
-- 
Jim C. Nasby, Database Consultant               decibel@decibel.org
Give your computer some brain candy! www.distributed.net Team #1828

Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"
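
To make the counter idea above concrete, here is a rough sketch in C. It is not taken from the bufmgr code: the struct, the function names, the cap of 5 and the decay of 2 per sweep are all assumptions picked for illustration, and locking is left out entirely (the increment is assumed to happen while the buffer header lock is already held).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a buffer header; only the fields the sketch needs. */
typedef struct SketchBufferDesc
{
    int  usage_count;   /* bumped on each access, decayed by the sweep */
    bool inactive;      /* true once a sweep drives the count below 0 */
} SketchBufferDesc;

#define SKETCH_MAX_USAGE 5  /* assumed cap so hot buffers cannot live forever */
#define SKETCH_DECAY     2  /* assumed "some amount" subtracted per sweep */

/*
 * Record one access to a buffer. In a real buffer manager this would run
 * while the buffer header lock is held, so it adds no extra locking.
 */
void
sketch_touch(SketchBufferDesc *buf)
{
    if (buf->usage_count < 0)
        buf->usage_count = 0;       /* revive a buffer that had gone inactive */
    if (buf->usage_count < SKETCH_MAX_USAGE)
        buf->usage_count++;
    buf->inactive = false;
}

/*
 * One pass of the background sweep over all buffers. Active buffers lose
 * SKETCH_DECAY from their counter; any that drop below 0 become inactive,
 * i.e. candidates for being freed. Returns how many were newly marked.
 */
size_t
sketch_sweep(SketchBufferDesc *bufs, size_t nbuffers)
{
    size_t newly_inactive = 0;

    assert(bufs != NULL || nbuffers == 0);

    for (size_t i = 0; i < nbuffers; i++)
    {
        if (bufs[i].inactive)
            continue;               /* only 'active' buffers are decayed */

        bufs[i].usage_count -= SKETCH_DECAY;
        if (bufs[i].usage_count < 0)
        {
            bufs[i].inactive = true;
            newly_inactive++;
        }
    }
    return newly_inactive;
}
```

With these assumed constants, a buffer touched once (say by a seq. scan) goes inactive on the first sweep, while one touched five or more times survives two sweeps and goes inactive on the third. The decay has to be larger than a single increment for the "first time they were hit" behaviour to hold; the right values would need measuring.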