Re: Estimating hot data size - Mailing list pgsql-performance

From	Greg Smith
Subject	Re: Estimating hot data size
Date	February 16, 2011 18:02:50
Msg-id	4D5C4980.9070807@2ndquadrant.com
In response to	Estimating hot data size (Chris Hoover <revoohc@gmail.com>)
List	pgsql-performance

Chris Hoover wrote:
> Basically, I'm using the sum(heap_blks_read + idx_blks_read) from
> pg_statio_all_tables, and diffing the numbers over a period of time (1
> hour at least).  Is this a fair estimate?  The reason for doing this
> is we are looking at new server hardware, and I want to try and get
> enough ram on the machine to keep the hot data in memory plus provide
> room for growth.

Those two counters measure reads requested from the operating system,
which isn't really a good measure of the working data set.  If you
switch to the internal counters that measure what's already cached
(heap_blks_hit and idx_blks_hit), that won't be quite right either.
Those count the same block again on every access, so the truly hot
blocks get counted many times over, which inflates how big you'll think
the working set is relative to its true size.
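
To make the double-counting point concrete, here is a small Python sketch.  It is only an illustration: the access trace is made up, and the 8 kB block size is an assumption (it is the PostgreSQL default).  It compares the size implied by a raw access counter with the size of the set of distinct blocks actually touched.

```python
BLOCK_SIZE = 8192  # assumed: PostgreSQL's default block size, in bytes


def counter_estimate(trace):
    """Size implied by counting every access, as a cumulative counter does."""
    return len(trace) * BLOCK_SIZE


def distinct_estimate(trace):
    """Size of the set of distinct blocks touched at least once."""
    return len(set(trace)) * BLOCK_SIZE


# Hypothetical trace of block numbers: blocks 1-3 are hot and accessed
# repeatedly, while blocks 4 and 5 are each touched only once.
trace = [1, 2, 3, 1, 2, 3, 1, 2, 3, 4, 1, 2, 5, 1]

print("counter-based estimate:", counter_estimate(trace), "bytes")
print("distinct-block estimate:", distinct_estimate(trace), "bytes")
```

The hotter the data, the wider the gap between the two numbers, which is why diffing access counters overstates the working set.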

If you visit http://projects.2ndquadrant.com/talks you'll find a talk
called "Inside the PostgreSQL Buffer Cache" that goes over how the cache
is actually managed within the database.  There are also some sample
queries you can run once you install the pg_buffercache module into a
database.  Check out "Buffer contents summary, with percentages".
That's the only way to really measure what you're trying to see.  I will
sometimes set shared_buffers to a larger value than would normally be
optimal for a bit, just to get a better reading on what the hot data is.
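
As a rough sketch of the kind of summary pg_buffercache makes possible, the Python below turns per-usagecount buffer totals into percentages.  This is not the query from the talk.  It assumes the pg_buffercache view exposes a usagecount column (NULL for unused buffers), the query text is my own guess at a minimal starting point, and the sample rows are invented.

```python
# Assumed query against the pg_buffercache view.  Run it with any client
# and pass the resulting (usagecount, buffers) rows to summarize().
QUERY = """
SELECT usagecount, count(*) AS buffers
FROM pg_buffercache
GROUP BY usagecount
ORDER BY usagecount
"""


def summarize(rows):
    """Return (usagecount, buffers, percent_of_cache) for each input row."""
    total = sum(buffers for _, buffers in rows)
    if total == 0:
        return []
    return [(uc, buffers, round(100.0 * buffers / total, 1))
            for uc, buffers in rows]


# Invented sample result; None stands for buffers that are not in use.
sample = [(None, 100), (1, 400), (3, 200), (5, 300)]

for usagecount, buffers, pct in summarize(sample):
    label = "unused" if usagecount is None else str(usagecount)
    print(f"{label:>6} {buffers:>8} {pct:>6.1f}%")
```

Buffers sitting at a high usagecount are the ones being touched over and over, so their share of the cache is a reasonable first approximation of the hot data.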

If you also want to get an idea of what's in the operating system cache,
the pgfincore module from http://pgfoundry.org/projects/pgfincore/ will
allow that on a Linux system.

--
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us
"PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books
