Re: DB cache size strategies - Mailing list pgsql-general

From Jenny Zhang
Subject Re: DB cache size strategies
Date
Msg-id 1076460887.17273.22.camel@ibm-a.pdx.osdl.net
Whole thread Raw
In response to Re: DB cache size strategies  ("scott.marlowe" <scott.marlowe@ihs.com>)
List pgsql-general
On Tue, 2004-02-10 at 16:17, scott.marlowe wrote:
> On Tue, 10 Feb 2004, Ed L. wrote:
>
> > On Tuesday February 10 2004 3:48, scott.marlowe wrote:
> > > On Tue, 10 Feb 2004, Ed L. wrote:
> > > > Interesting.  Why leave very large tables to the kernel instead of the
> > > > db cache?  Assuming a dedicated DB server and a DB smaller than
> > > > available RAM, why not give the DB enough RAM to get the entire DB into
> > > > the DB cache? (Assuming you have the RAM).
> > >
> > > Because the kernel is more efficient (right now) at caching large data
> > > sets.
> > >
> > > With the ARC cache manager that will likely wend it's way into 7.5, it's
> > > quite a likely possibility that postgresql will be able to efficiently
> > > handle a larger cache, but it will still be a shared memory cache, and
> > > those are still usually much slower than the kernel's cache.
> >
> > Hmmm.  Others have asserted/assumed they'd be roughly equivalent.  It'd be
> > interesting to see some real data measuring the performance of the shared
> > mem cache vs. kernel cache.  Anyone know of existing benchmarks?
>
> Testing from various folks has shown that there is a maximum size of
> somewhere around 256 meg or so where the returns diminish pretty quickly,
> unless your working data sets are larger than that.  I.e. setting a
> database up that needs 20 megs to throw data around in with 200 megs of
> buffers is usually slower than handing it 32 megs of buffer and letting
> the kernel do it.
>
> But ARC may change all of that.  I'll have to grab a CVS snapshot and see
> how it behaves with large datasets and shared buffers...
>
The latest osdl-dbt3 run on our stp system used about 250M for shared
buffer.  The database was about 3.8G.  The system has 8G of RAM, most of
which are used by the database.  I did try to increase the shared_buffer
but the performance decreased.  For this workload, the bottleneck is the
I/O for the pgsql_tmp directory, since those are complicated queries
with group by and sorting.

The result is at:
http://khack.osdl.org/stp/287555/

It would be interesting to see how 7.5 handles this workload.  Also you
are welcome to use STP to try out different database parameters with
this workload.  Let me know if you are interested, I will follow up with
more instructions.

Jenny


pgsql-general by date:

Previous
From: "scott.marlowe"
Date:
Subject: Re: DB cache size strategies
Next
From: Joe Conway
Date:
Subject: Re: PQunescapeBytea