Re: Used Memory - Mailing list pgsql-performance

From Scott Marlowe
Subject Re: Used Memory
Date
Msg-id 1130185499.15546.120.camel@state.g2switchworks.com
Whole thread Raw
In response to Re: Used Memory  ("Craig A. James" <cjames@modgraph-usa.com>)
Responses Re: Used Memory  ("Craig A. James" <cjames@modgraph-usa.com>)
List pgsql-performance
On Mon, 2005-10-24 at 12:00, Craig A. James wrote:
> Kevin Grittner wrote:
> > In addition to what Mark pointed out, there is the possibility that a
> > query
> > is running which is scanning a large table or otherwise bringing in a
> > large number of pages from disk.  That would first use up all available
> > unused cache space, and then may start replacing some of your
> > frequently used data.
>
> An LRU cache is often a bad strategy for database applications.  There are two illustrations that show why.
>
> 1. You have an index that's used for EVERY search, but each search returns a large and unique set of rows.  If it
happensthat the rows returned exceed the systems cache size, the part or all of your index will be flushed with EVERY
query.
>
> 2. You do a sequential scan of a table that's one block bigger than the file system cache, then you do it again.  At
thebeginning of the second scan, the first block of the table will have just been swapped out because it was the
oldest,so the file system brings it back in, replacing the second block, which is now the oldest.  As you scan the
table,each block of the table is swapped out JUST BEFORE you get to it.  At the start of your query, the file system
mighthave had 99.9% of the relevant data in memory, but it swaps out 100% of it as your query progresses. 
>
> Scenario 2 above is interesting because a system that is performing very well can suddenly experience a catastrophic
performancedecline when the size of the data exceeds a critical limit - the file system's avaliable cache. 
>
> LRU works well if your frequently-used data is used often enough to keep it in memory.  But many applications don't
havethat luxury.  It's often the case that a single query will exceed the file system's cache size.  The file system
cacheis "dumb" -- it's strategy is too simple for a relational database. 
>
> What's needed is a way for the application developer to explicitely say, "This object is frequenly used, and I want
itkept in memory." 

There's an interesting conversation happening on the linux kernel
hackers mailing list right about now that applies:

http://www.gossamer-threads.com/lists/linux/kernel/580789

pgsql-performance by date:

Previous
From: Rodrigo Madera
Date:
Subject: Re: Inefficient escape codes.
Next
From: "Craig A. James"
Date:
Subject: Re: Used Memory