Thread: Postgresql cache (memory) performance + how to warm up the cache

Postgresql cache (memory) performance + how to warm up the cache

From
Shaul Dar
Date:
I have a DB table with 25M rows, ~3K each (i.e. ~75GB), that together with multiple indexes I use (an additional 15-20GB) will not fit entirely in memory (64GB on machine). A typical query locates 300 rows thru an index,  optionally filters them down to ~50-300 rows using other indexes, finally fetching the matching rows. Response times vary between 20ms on a warm DB to 20 secs on a cold DB. I have two related questions:

1. At any given time how can I check what portion (%) of specific tables and indexes is cached in memory?

2. What is the best way to warm up the cache before opening the DB to queries? E.g. "select *" forces a sequential scan (~15 minutes on cold DB) but response times following it are still poor. Is there a built-in way to do this instead of via queries?a

Thanks, feel free to also reply by email (info@shauldar.com])

-- Shaul

Re: Postgresql cache (memory) performance + how to warm up the cache

From
Greg Smith
Date:
On Mon, 1 Jun 2009, Shaul Dar wrote:

> 1. At any given time how can I check what portion (%) of specific tables and indexes is cached in memory?

This is a bit tricky.  PostgreSQL caches information in its shared_buffers
cache, and you can get visibility into that if you install the
contrib/pg_buffercache library into your database.  I go over the theory
here and give some sample queries, including the one you're asking for, in
my "Inside the PostgreSQL Buffer Cache" presentation at
http://www.westnet.com/~gsmith/content/postgresql/

However, in a normal installation, the operating system cache will have a
significant amount of data stored in it as well.  Figuring out that is
more complicated.  The best integrated script I've seen for that so far as
at http://www.kennygorman.com/wordpress/?p=250 but that's not really
integrated into an easy to use tool yet.  Improving that is on a couple of
people's agendas for the next PostgreSQL release, that's as good as it
gets for what I'm aware that's already public.

> 2. What is the best way to warm up the cache before opening the DB to queries? E.g. "select *" forces a sequential
scan(~15 minutes 
> on cold DB) but response times following it are still poor. Is there a built-in way to do this instead of via
queries?a

There is an optimization in PostgreSQL 8.3 and later that keeps sequential
scans from using large amounts of the PostgreSQL buffer cache, and full
table scans don't pull the index pages at all--and those are likely what
you really want cached.

What you probably want to do is run a query that uses an index
aggressively (confirm this via EXPLAIN) instead.  You might be able to get
that to happen by selecting everything using an ORDER BY that is expensive
(from a planner cost perspective), therefore making the indexed scan seem
more attractive, but the exact behavior here depends on how you've got
your planner parameters setup.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD