Re: DB cache size strategies - Mailing list pgsql-general

From Jan Wieck
Subject Re: DB cache size strategies
Date
Msg-id 401E6B91.1060104@Yahoo.com
Whole thread Raw
In response to Re: DB cache size strategies  (Martijn van Oosterhout <kleptog@svana.org>)
List pgsql-general
But keep in mind that all these strategies need to be reevaluated with
7.5 and its ARC.


Jan

Martijn van Oosterhout wrote:
> On Fri, Jan 30, 2004 at 03:19:56PM -0700, Ed L. wrote:
>> On Friday January 30 2004 2:33, Ed L. wrote:
>> >
>> > One key aspect of pgsql performance tuning is to adjust the memory
>> > consumption settings (shared_buffers, sort_mem, etc) large enough to hold
>> > as much of the database in shared memory as possible while not causing
>> > page swap-ins.  I understand that both page swap-outs and swap space
>> > usage is normal and OK, but lots of page swap-ins are bad).  In other
>> > words, for absolute fastest performance, we want a database buffer cache
>> > hit rate of as close to 100% as possible.
>>
>> I'm also curious about the relationship of DB shared buffer cache to the
>> linux/hpux kernel caches.  In particular, if the block being retrieved in
>> pgsql was in the kernel's cache but not in the DB cache, thereby forcing a
>> read() system call, what kind of quantitative difference in performance
>> would one expect when comparing with block retrievals coming from the
>> cache?  I would think they'd differ only by something on the order of
>> microseconds.  Is the linux kernel disk cache normally a duplicate of much
>> of what is in the DB cache?  For linux, does the kernel cache use only
>> "available" memory until a program needs it, while the pgsql DB cache
>> memory is guaranteed at startup?
>
> Whenever the database needs a block not in memory it get loaded and ends up
> in both the OS cache and the DB cache. The difference between getting a
> block out of DB cache and OS cache is very, very small compared to loading
> off disk.
>
> Hence, the larger your DB cache, the less overall data you can cache due to
> the duplication. OTOH, postgres can only directly use the DB cache and can
> only access the OS cache indirectly.
>
> What I try to aim for is the keep the DB cache somewhere near the "working
> set", ie the amount of space used by all the system tables and commonly used
> tables in the system. It's shared by all postgres servers in a cluster so
> keep that in mind. Also the OS cache may be shared by other processes, so
> your results will depend on what other things are happening.
>
> Finally, at least on Linux, the shared memory postgres uses for cache can
> also be swapped out making it very difficult to determine the correct value.
>
> Hope this helps,
>


--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #


pgsql-general by date:

Previous
From: Richard Huxton
Date:
Subject: Re: Return Value of a Function
Next
From: Tom Lane
Date:
Subject: Re: Return Value of a Function