Thread: Postgresql cache (memory) performance + how to warm up the cache
I have a DB table with 25M rows, ~3K each (i.e. ~75GB), that together with multiple indexes I use (an additional 15-20GB) will not fit entirely in memory (64GB on machine). A typical query locates 300 rows thru an index, optionally filters them down to ~50-300 rows using other indexes, finally fetching the matching rows. Response times vary between 20ms on a warm DB to 20 secs on a cold DB. I have two related questions:
1. At any given time how can I check what portion (%) of specific tables and indexes is cached in memory?
2. What is the best way to warm up the cache before opening the DB to queries? E.g. "select *" forces a sequential scan (~15 minutes on cold DB) but response times following it are still poor. Is there a built-in way to do this instead of via queries?a
Thanks, feel free to also reply by email (info@shauldar.com])
-- Shaul
1. At any given time how can I check what portion (%) of specific tables and indexes is cached in memory?
2. What is the best way to warm up the cache before opening the DB to queries? E.g. "select *" forces a sequential scan (~15 minutes on cold DB) but response times following it are still poor. Is there a built-in way to do this instead of via queries?a
Thanks, feel free to also reply by email (info@shauldar.com])
-- Shaul
On Mon, 1 Jun 2009, Shaul Dar wrote: > 1. At any given time how can I check what portion (%) of specific tables and indexes is cached in memory? This is a bit tricky. PostgreSQL caches information in its shared_buffers cache, and you can get visibility into that if you install the contrib/pg_buffercache library into your database. I go over the theory here and give some sample queries, including the one you're asking for, in my "Inside the PostgreSQL Buffer Cache" presentation at http://www.westnet.com/~gsmith/content/postgresql/ However, in a normal installation, the operating system cache will have a significant amount of data stored in it as well. Figuring out that is more complicated. The best integrated script I've seen for that so far as at http://www.kennygorman.com/wordpress/?p=250 but that's not really integrated into an easy to use tool yet. Improving that is on a couple of people's agendas for the next PostgreSQL release, that's as good as it gets for what I'm aware that's already public. > 2. What is the best way to warm up the cache before opening the DB to queries? E.g. "select *" forces a sequential scan(~15 minutes > on cold DB) but response times following it are still poor. Is there a built-in way to do this instead of via queries?a There is an optimization in PostgreSQL 8.3 and later that keeps sequential scans from using large amounts of the PostgreSQL buffer cache, and full table scans don't pull the index pages at all--and those are likely what you really want cached. What you probably want to do is run a query that uses an index aggressively (confirm this via EXPLAIN) instead. You might be able to get that to happen by selecting everything using an ORDER BY that is expensive (from a planner cost perspective), therefore making the indexed scan seem more attractive, but the exact behavior here depends on how you've got your planner parameters setup. -- * Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD