Thread: Add usage counts to pg_buffercache

Add usage counts to pg_buffercache

From
Greg Smith
Date:
This patch adds the usage count statistic to the information available in
contrib/pgbuffercache.  Earlier this month a discussion about my first
attempt to instrument the background writer had Tom asking for details
about the usage histogram I was seeing, and this patch proved to be the
easiest way I found to take a look at that.

In situations where one is trying to optimize the background writer, it's
very hard to adjust how much to rely on the LRU writer versus the one that
writes everything unless you know whether your dirty buffers are typically
used heavily (like index blocks) or not (like new INSERT data).  Some
statistics about the usage counts in your buffer cache are extremely
helpful in making that decision.

I'll even pass along an ugly but fun query that utilizes this.  The
following will give you a summary of your buffer cache broken into 32
sections.  Each line shows the average usage count of that section, as a
positive number if most buffers dirty and a negative one if most are
clean.  If you refresh this frequently enough, you can actually watch
things like how checkpoints move through the buffer cache:

SELECT current_timestamp,
-- Split into 32 bins of data
round(bufferid / (cast((select setting from pg_settings where
name='shared_buffers') as int) / (32 - 1.0)))
as section, round(
-- Average usage count, capped at 5
case when avg(usagecount)>5 then 5 else avg(usagecount) end *
-- -1 when the majority are clean records, 1 when most are dirty
(case when sum(case when isdirty then 1 else -1 end)>0 then 1 else -1
end)) as color_intensity
FROM pg_buffercache GROUP BY
round(bufferid / (cast((select setting from pg_settings where
name='shared_buffers') as int) / (32 - 1.0)));

The 32 can be changed to anything, that's just what fits on my screen.
The main idea of the above is that if you dump all this to a file
regularly, it's possible to produce a graph of it showing how the cache
has changed over time by assigning a different color intensity based on
the usage count--at a massive cost in overhead, of course.  I'll be
passing along all that code once I get it ready for other people to use.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

Attachment

Re: Add usage counts to pg_buffercache

From
Russell Smith
Date:
Possibly minor detail;

from buf_internals.h

uint16        usage_count;    /* usage counter for clock sweep code */

and you have a int16 to store that.  Currently the max buffer count is
5.  But is that a complete safe assumption?  Maybe a compile time check
that  BM_MAX_USAGE_COUNT is < 16k would ensure that things don't go wrong?

Regards

Russell Smith

Greg Smith wrote:
> This patch adds the usage count statistic to the information available
> in contrib/pgbuffercache.  Earlier this month a discussion about my
> first attempt to instrument the background writer had Tom asking for
> details about the usage histogram I was seeing, and this patch proved
> to be the easiest way I found to take a look at that.
>
> In situations where one is trying to optimize the background writer,
> it's very hard to adjust how much to rely on the LRU writer versus the
> one that writes everything unless you know whether your dirty buffers
> are typically used heavily (like index blocks) or not (like new INSERT
> data).  Some statistics about the usage counts in your buffer cache
> are extremely helpful in making that decision.
>
> I'll even pass along an ugly but fun query that utilizes this.  The
> following will give you a summary of your buffer cache broken into 32
> sections.  Each line shows the average usage count of that section, as
> a positive number if most buffers dirty and a negative one if most are
> clean.  If you refresh this frequently enough, you can actually watch
> things like how checkpoints move through the buffer cache:
>
> SELECT current_timestamp,
> -- Split into 32 bins of data
> round(bufferid / (cast((select setting from pg_settings where
> name='shared_buffers') as int) / (32 - 1.0)))
> as section, round(
> -- Average usage count, capped at 5
> case when avg(usagecount)>5 then 5 else avg(usagecount) end *
> -- -1 when the majority are clean records, 1 when most are dirty
> (case when sum(case when isdirty then 1 else -1 end)>0 then 1 else -1
> end)) as color_intensity
> FROM pg_buffercache GROUP BY
> round(bufferid / (cast((select setting from pg_settings where
> name='shared_buffers') as int) / (32 - 1.0)));
>
> The 32 can be changed to anything, that's just what fits on my screen.
> The main idea of the above is that if you dump all this to a file
> regularly, it's possible to produce a graph of it showing how the
> cache has changed over time by assigning a different color intensity
> based on the usage count--at a massive cost in overhead, of course.
> I'll be passing along all that code once I get it ready for other
> people to use.
>
> --
> * Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
> ------------------------------------------------------------------------
>
> Index: README.pg_buffercache
> ===================================================================
> RCS file: /projects/cvsroot/pgsql/contrib/pg_buffercache/README.pg_buffercache,v
> retrieving revision 1.3
> diff -c -r1.3 README.pg_buffercache
> *** README.pg_buffercache    26 Apr 2006 22:50:17 -0000    1.3
> --- README.pg_buffercache    1 Apr 2007 02:17:15 -0000
> ***************
> *** 40,46 ****
>      reldatabase    | pg_database.oid      | Database for the relation.
>      relblocknumber |                      | Offset of the page in the relation.
>      isdirty        |                      | Is the page dirty?
> !
>
>     There is one row for each buffer in the shared cache. Unused buffers are
>     shown with all fields null except bufferid.
> --- 40,46 ----
>      reldatabase    | pg_database.oid      | Database for the relation.
>      relblocknumber |                      | Offset of the page in the relation.
>      isdirty        |                      | Is the page dirty?
> !    usagecount     |                      | Page LRU count
>
>     There is one row for each buffer in the shared cache. Unused buffers are
>     shown with all fields null except bufferid.
> ***************
> *** 60,79 ****
>
>     regression=# \d pg_buffercache;
>          View "public.pg_buffercache"
> !        Column     |  Type   | Modifiers
> !   ----------------+---------+-----------
> !    bufferid       | integer |
> !    relfilenode    | oid     |
> !    reltablespace  | oid     |
> !    reldatabase    | oid     |
> !    relblocknumber | bigint  |
> !    isdirty        | boolean |
>     View definition:
>      SELECT p.bufferid, p.relfilenode, p.reltablespace, p.reldatabase,
> !           p.relblocknumber, p.isdirty
>        FROM pg_buffercache_pages() p(bufferid integer, relfilenode oid,
>        reltablespace oid, reldatabase oid, relblocknumber bigint,
> !      isdirty boolean);
>
>     regression=# SELECT c.relname, count(*) AS buffers
>                  FROM pg_class c INNER JOIN pg_buffercache b
> --- 60,81 ----
>
>     regression=# \d pg_buffercache;
>          View "public.pg_buffercache"
> !        Column     |  Type    | Modifiers
> !   ----------------+----------+-----------
> !    bufferid       | integer  |
> !    relfilenode    | oid      |
> !    reltablespace  | oid      |
> !    reldatabase    | oid      |
> !    relblocknumber | bigint   |
> !    isdirty        | boolean  |
> !    usagecount     | smallint |
> !
>     View definition:
>      SELECT p.bufferid, p.relfilenode, p.reltablespace, p.reldatabase,
> !           p.relblocknumber, p.isdirty, p.usagecount
>        FROM pg_buffercache_pages() p(bufferid integer, relfilenode oid,
>        reltablespace oid, reldatabase oid, relblocknumber bigint,
> !      isdirty boolean, usagecount smallint);
>
>     regression=# SELECT c.relname, count(*) AS buffers
>                  FROM pg_class c INNER JOIN pg_buffercache b
> Index: pg_buffercache.sql.in
> ===================================================================
> RCS file: /projects/cvsroot/pgsql/contrib/pg_buffercache/pg_buffercache.sql.in,v
> retrieving revision 1.4
> diff -c -r1.4 pg_buffercache.sql.in
> *** pg_buffercache.sql.in    26 Apr 2006 22:50:17 -0000    1.4
> --- pg_buffercache.sql.in    1 Apr 2007 02:17:15 -0000
> ***************
> *** 12,18 ****
>   CREATE VIEW pg_buffercache AS
>       SELECT P.* FROM pg_buffercache_pages() AS P
>        (bufferid integer, relfilenode oid, reltablespace oid, reldatabase oid,
> !       relblocknumber int8, isdirty bool);
>
>   -- Don't want these to be available at public.
>   REVOKE ALL ON FUNCTION pg_buffercache_pages() FROM PUBLIC;
> --- 12,18 ----
>   CREATE VIEW pg_buffercache AS
>       SELECT P.* FROM pg_buffercache_pages() AS P
>        (bufferid integer, relfilenode oid, reltablespace oid, reldatabase oid,
> !       relblocknumber int8, isdirty bool, usagecount int2);
>
>   -- Don't want these to be available at public.
>   REVOKE ALL ON FUNCTION pg_buffercache_pages() FROM PUBLIC;
> Index: pg_buffercache_pages.c
> ===================================================================
> RCS file: /projects/cvsroot/pgsql/contrib/pg_buffercache/pg_buffercache_pages.c,v
> retrieving revision 1.11
> diff -c -r1.11 pg_buffercache_pages.c
> *** pg_buffercache_pages.c    22 Oct 2006 17:49:21 -0000    1.11
> --- pg_buffercache_pages.c    1 Apr 2007 02:17:15 -0000
> ***************
> *** 16,22 ****
>   #include "utils/relcache.h"
>
>
> ! #define NUM_BUFFERCACHE_PAGES_ELEM    6
>
>   PG_MODULE_MAGIC;
>
> --- 16,22 ----
>   #include "utils/relcache.h"
>
>
> ! #define NUM_BUFFERCACHE_PAGES_ELEM    7
>
>   PG_MODULE_MAGIC;
>
> ***************
> *** 35,40 ****
> --- 35,41 ----
>       BlockNumber blocknum;
>       bool        isvalid;
>       bool        isdirty;
> +     uint16        usagecount;
>   }    BufferCachePagesRec;
>
>
> ***************
> *** 91,96 ****
> --- 92,99 ----
>                              INT8OID, -1, 0);
>           TupleDescInitEntry(tupledesc, (AttrNumber) 6, "isdirty",
>                              BOOLOID, -1, 0);
> +         TupleDescInitEntry(tupledesc, (AttrNumber) 7, "usage_count",
> +                            INT2OID, -1, 0);
>
>           fctx->tupdesc = BlessTupleDesc(tupledesc);
>
> ***************
> *** 126,131 ****
> --- 129,135 ----
>               fctx->record[i].reltablespace = bufHdr->tag.rnode.spcNode;
>               fctx->record[i].reldatabase = bufHdr->tag.rnode.dbNode;
>               fctx->record[i].blocknum = bufHdr->tag.blockNum;
> +             fctx->record[i].usagecount = bufHdr->usage_count;
>
>               if (bufHdr->flags & BM_DIRTY)
>                   fctx->record[i].isdirty = true;
> ***************
> *** 172,177 ****
> --- 176,182 ----
>               nulls[3] = true;
>               nulls[4] = true;
>               nulls[5] = true;
> +             nulls[6] = true;
>           }
>           else
>           {
> ***************
> *** 185,190 ****
> --- 190,197 ----
>               nulls[4] = false;
>               values[5] = BoolGetDatum(fctx->record[i].isdirty);
>               nulls[5] = false;
> +             values[6] = Int16GetDatum(fctx->record[i].usagecount);
> +             nulls[6] = false;
>           }
>
>           /* Build and return the tuple. */
>
> ------------------------------------------------------------------------
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 4: Have you searched our list archives?
>
>                http://archives.postgresql.org
>


Re: Add usage counts to pg_buffercache

From
Greg Smith
Date:
On Sun, 1 Apr 2007, Russell Smith wrote:

> Currently the max buffer count is 5. But is that a complete safe
> assumption?  Maybe a compile time check that BM_MAX_USAGE_COUNT is < 16k
> would ensure that things don't go wrong?

I actually wasn't even aware that was a hard limit; I just assumed that
all my systems just never got over 5 before the LRU hit them.  How 'bout
that.

I'm all for paranoia, but the idea of letting LRU usage counts go over 16
bits seems pretty unlikely with the current approach.  That would mean
those pages would need 64K passes over the buffer cache before they could
get evicted, which would take quite a while.  My bet is that bufferid gets
expanded from 32 bits before that happens, which would break
pg_buffercache similarly and isn't being checked for either.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

Re: Add usage counts to pg_buffercache

From
Bruce Momjian
Date:
Your patch has been added to the PostgreSQL unapplied patches list at:

    http://momjian.postgresql.org/cgi-bin/pgpatches

It will be applied as soon as one of the PostgreSQL committers reviews
and approves it.

---------------------------------------------------------------------------


Greg Smith wrote:
> This patch adds the usage count statistic to the information available in
> contrib/pgbuffercache.  Earlier this month a discussion about my first
> attempt to instrument the background writer had Tom asking for details
> about the usage histogram I was seeing, and this patch proved to be the
> easiest way I found to take a look at that.
>
> In situations where one is trying to optimize the background writer, it's
> very hard to adjust how much to rely on the LRU writer versus the one that
> writes everything unless you know whether your dirty buffers are typically
> used heavily (like index blocks) or not (like new INSERT data).  Some
> statistics about the usage counts in your buffer cache are extremely
> helpful in making that decision.
>
> I'll even pass along an ugly but fun query that utilizes this.  The
> following will give you a summary of your buffer cache broken into 32
> sections.  Each line shows the average usage count of that section, as a
> positive number if most buffers dirty and a negative one if most are
> clean.  If you refresh this frequently enough, you can actually watch
> things like how checkpoints move through the buffer cache:
>
> SELECT current_timestamp,
> -- Split into 32 bins of data
> round(bufferid / (cast((select setting from pg_settings where
> name='shared_buffers') as int) / (32 - 1.0)))
> as section, round(
> -- Average usage count, capped at 5
> case when avg(usagecount)>5 then 5 else avg(usagecount) end *
> -- -1 when the majority are clean records, 1 when most are dirty
> (case when sum(case when isdirty then 1 else -1 end)>0 then 1 else -1
> end)) as color_intensity
> FROM pg_buffercache GROUP BY
> round(bufferid / (cast((select setting from pg_settings where
> name='shared_buffers') as int) / (32 - 1.0)));
>
> The 32 can be changed to anything, that's just what fits on my screen.
> The main idea of the above is that if you dump all this to a file
> regularly, it's possible to produce a graph of it showing how the cache
> has changed over time by assigning a different color intensity based on
> the usage count--at a massive cost in overhead, of course.  I'll be
> passing along all that code once I get it ready for other people to use.
>
> --
> * Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
Content-Description:

[ Attachment, skipping... ]

>
> ---------------------------(end of broadcast)---------------------------
> TIP 4: Have you searched our list archives?
>
>                http://archives.postgresql.org

--
  Bruce Momjian  <bruce@momjian.us>          http://momjian.us
  EnterpriseDB                               http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

Re: Add usage counts to pg_buffercache

From
Bruce Momjian
Date:
Patch applied.  Thanks.

---------------------------------------------------------------------------


Greg Smith wrote:
> This patch adds the usage count statistic to the information available in
> contrib/pgbuffercache.  Earlier this month a discussion about my first
> attempt to instrument the background writer had Tom asking for details
> about the usage histogram I was seeing, and this patch proved to be the
> easiest way I found to take a look at that.
>
> In situations where one is trying to optimize the background writer, it's
> very hard to adjust how much to rely on the LRU writer versus the one that
> writes everything unless you know whether your dirty buffers are typically
> used heavily (like index blocks) or not (like new INSERT data).  Some
> statistics about the usage counts in your buffer cache are extremely
> helpful in making that decision.
>
> I'll even pass along an ugly but fun query that utilizes this.  The
> following will give you a summary of your buffer cache broken into 32
> sections.  Each line shows the average usage count of that section, as a
> positive number if most buffers dirty and a negative one if most are
> clean.  If you refresh this frequently enough, you can actually watch
> things like how checkpoints move through the buffer cache:
>
> SELECT current_timestamp,
> -- Split into 32 bins of data
> round(bufferid / (cast((select setting from pg_settings where
> name='shared_buffers') as int) / (32 - 1.0)))
> as section, round(
> -- Average usage count, capped at 5
> case when avg(usagecount)>5 then 5 else avg(usagecount) end *
> -- -1 when the majority are clean records, 1 when most are dirty
> (case when sum(case when isdirty then 1 else -1 end)>0 then 1 else -1
> end)) as color_intensity
> FROM pg_buffercache GROUP BY
> round(bufferid / (cast((select setting from pg_settings where
> name='shared_buffers') as int) / (32 - 1.0)));
>
> The 32 can be changed to anything, that's just what fits on my screen.
> The main idea of the above is that if you dump all this to a file
> regularly, it's possible to produce a graph of it showing how the cache
> has changed over time by assigning a different color intensity based on
> the usage count--at a massive cost in overhead, of course.  I'll be
> passing along all that code once I get it ready for other people to use.
>
> --
> * Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
Content-Description:

[ Attachment, skipping... ]

>
> ---------------------------(end of broadcast)---------------------------
> TIP 4: Have you searched our list archives?
>
>                http://archives.postgresql.org

--
  Bruce Momjian  <bruce@momjian.us>          http://momjian.us
  EnterpriseDB                               http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +