Thread: Howto to clear PostgreSQL cache and how to preload all tuples of a table into the cache?

I'd like to clear the PostgreSQL cache (e.g. for benchmarking purposes).
And I'd like to preload all tuples of a table (say mytable_one) into the cache.

AFAIK there is no way to force all caches to be cleared in PostgreSQL
with an SQL command.
The only way to achieve this, seems to restart PG (server), which is
neither an option for benchmarking purposes nor for production.

But:
1. Isn't it possible to achieve a kind-of cache clearing (in a
reliable way) by simply doing a "select * from mytable_two" given
mytable_two is at least as large as mytable_one (which is the one we
want  to benchmark)?

2. I assume that "select * from mytable_one" loads as much of the
tuples as it can into the cache. Are there better ways for preloading
the contents of a table?

Yours, Stefan

On 12/12/2011 08:42 AM, Stefan Keller wrote:
> I'd like to clear the PostgreSQL cache (e.g. for benchmarking purposes).
> And I'd like to preload all tuples of a table (say mytable_one) into the cache.
>
> AFAIK there is no way to force all caches to be cleared in PostgreSQL
> with an SQL command.
> The only way to achieve this, seems to restart PG (server), which is
> neither an option for benchmarking purposes nor for production.
You haven't specified your OS, Pg version, etc. This is important.

Pg relies on the OS's disk cache, which it has no way to clear or control.

On Linux, check out the "drop_caches"

> But:
> 1. Isn't it possible to achieve a kind-of cache clearing (in a
> reliable way) by simply doing a "select * from mytable_two" given
> mytable_two is at least as large as mytable_one (which is the one we
> want  to benchmark)?
No.

If mytable_two is significantly larger than the system memory then
maybe, and only maybe, you'll clear the cache. Nothing stops Pg from
setting posix_fadvise(..., ..., ...,
POSIX_FADV_SEQUENTIAL|POSIX_FADV_NOREUSE) to help the OS more efficently
do the seqscan, though. Even if Pg doesn't do that, nothing stops the OS
from figuring out Pg's intent and limiting how much it caches.

>
> 2. I assume that "select * from mytable_one" loads as much of the
> tuples as it can into the cache. Are there better ways for preloading
> the contents of a table?
>

Nope, again because Pg largely relies on the OS cache.

The OS will *probably* cache the contents of mytable_one when you do a
seqscan on it, but it might not, and it might be correct in not doing so.

--
Craig Ringer



On Mon, Dec 12, 2011 at 8:33 AM, Craig Ringer <ringerc@ringerc.id.au> wrote:
On 12/12/2011 08:42 AM, Stefan Keller wrote:
I'd like to clear the PostgreSQL cache (e.g. for benchmarking purposes).
And I'd like to preload all tuples of a table (say mytable_one) into the cache.

AFAIK there is no way to force all caches to be cleared in PostgreSQL
with an SQL command.
The only way to achieve this, seems to restart PG (server), which is
neither an option for benchmarking purposes nor for production.
You haven't specified your OS, Pg version, etc. This is important.

Pg relies on the OS's disk cache, which it has no way to clear or control.

On Linux, check out the "drop_caches"


But:
1. Isn't it possible to achieve a kind-of cache clearing (in a
reliable way) by simply doing a "select * from mytable_two" given
mytable_two is at least as large as mytable_one (which is the one we
want  to benchmark)?
No.

If mytable_two is significantly larger than the system memory then maybe, and only maybe, you'll clear the cache. Nothing stops Pg from setting posix_fadvise(..., ..., ..., POSIX_FADV_SEQUENTIAL|POSIX_FADV_NOREUSE) to help the OS more efficently do the seqscan, though. Even if Pg doesn't do that, nothing stops the OS from figuring out Pg's intent and limiting how much it caches.



2. I assume that "select * from mytable_one" loads as much of the
tuples as it can into the cache. Are there better ways for preloading
the contents of a table?


Nope, again because Pg largely relies on the OS cache.

The OS will *probably* cache the contents of mytable_one when you do a seqscan on it, but it might not, and it might be correct in not doing so.

--
Craig Ringer


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


For caching you can use pgmemcache a contrib module, however you need to change few thing in your application bcoz its not  application transparent.  For application transparency you can use infinite-cache. 

I believe pgmemcache will do cache clearing, not sure whether its own cache or OS cache.

Very informative discussion on Caching in PostgreSQL archives. 
Le 12 décembre 2011 01:42, Stefan Keller <sfkeller@gmail.com> a écrit :
> I'd like to clear the PostgreSQL cache (e.g. for benchmarking purposes).
> And I'd like to preload all tuples of a table (say mytable_one) into the cache.
>
> AFAIK there is no way to force all caches to be cleared in PostgreSQL
> with an SQL command.
> The only way to achieve this, seems to restart PG (server), which is
> neither an option for benchmarking purposes nor for production.
>
> But:
> 1. Isn't it possible to achieve a kind-of cache clearing (in a
> reliable way) by simply doing a "select * from mytable_two" given
> mytable_two is at least as large as mytable_one (which is the one we
> want  to benchmark)?

in postgresql cache, no, because such a query will use a sequential
scan and postgreSQL will protect its cache with a ring: read tuples
are stored in in this short(er than shared_memory) cache; recycled
while reading the table.

>
> 2. I assume that "select * from mytable_one" loads as much of the
> tuples as it can into the cache. Are there better ways for preloading
> the contents of a table?

no, see 1/
So far, there were some ideas on postgresql cache clearing/management
but nothing did it because none have evidences that it is useful (for
performances)

You can use pgfincore: http://pgfoundry.org/projects/pgfincore
to monitor your OS cache, and if you have a system with POSIX_FADVISE
support you can make snpashot, restore, preload ..Etc

For your benchmark, just make a good scenario, else your benchmark
does not bench anything but what you supposed that can happen (wihtout
happening). I don't see why someone would want to clear the postgresql
cache *in production* ! The engine will use its internal mecanism to
decide what to keep and what to remove with (we expect) more
intelligence than us.

--
Cédric Villemain +33 (0)6 20 30 22 52
http://2ndQuadrant.fr/
PostgreSQL: Support 24x7 - Développement, Expertise et Formation