Re: PG 8.3 and large shared buffer settings - Mailing list pgsql-performance

From Greg Smith
Subject Re: PG 8.3 and large shared buffer settings
Date
Msg-id alpine.GSO.2.01.0909261006390.11378@westnet.com
Whole thread Raw
In response to Re: PG 8.3 and large shared buffer settings  (Jeff Janes <jeff.janes@gmail.com>)
Responses Re: PG 8.3 and large shared buffer settings  (Gerhard Wiesinger <lists@wiesinger.com>)
List pgsql-performance
On Fri, 25 Sep 2009, Jeff Janes wrote:

> Does it do this even if the block was already in shared_buffers?

Usually not.  The buffer ring algorithm is used to manage pages that are
read in specifically to satisfy a sequential scan (there's a slightly
different ring method used for VACUUM too).  If the buffer you need is
already available and not "pinned" (locked by someone else), it's not read
from disk again.  Instead, its usage count is incremently only if it's at
zero (this doesn't count as a use unless it's about to be evicted as
unused), and it's returned without being added to the ring.

There's a section about this ("Buffer Ring Replacement Strategy") in the
source code:
http://git.postgresql.org/gitweb?p=postgresql.git;a=blob_plain;f=src/backend/storage/buffer/README;hb=HEAD

The commit that added the feature is at
http://git.postgresql.org/gitweb?p=postgresql.git;a=commit;h=ebf3d5b66360823edbdf5ac4f9a119506fccd4c0

The basic flow of this code is that backends ask for buffers using
BufferAlloc, which then calls StrategyGetBuffer (where the ring list is
managed) only if it doesn't first find the page in the buffer cache.  You
get what you'd hope for here:  a sequential scan will use blocks when
they're already available in the cache, while reading in less popular
blocks that weren't cached into the temporary ring area.  There's always
the OS cache backing the PostrgreSQL one to handle cases where the working
set you're using is just a bit larger than shared_buffers.  The ring read
requests may very well be satisfied by that too if there was a recent
sequential scan the OS is still caching.

You can read a high-level summary of the algorithm used for ring
management (with an intro to buffer management in general) in my "Inside
the PostgreSQL Buffer Cache" presentation at
http://www.westnet.com/~gsmith/content/postgresql/ on P10 "Optimizations
for problem areas".  That doesn't specifically cover the "what if it's in
the cache already?" case though.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

pgsql-performance by date:

Previous
From: Claus Guttesen
Date:
Subject: Re: Bad performance of SELECT ... where id IN (...)
Next
From: Greg Smith
Date:
Subject: Re: PG 8.3 and large shared buffer settings