Tracking buffer allocation stats (new LRU method base code) - Mailing list pgsql-patches

From Greg Smith
Subject Tracking buffer allocation stats (new LRU method base code)
Msg-id Pine.GSO.4.64.0707020202220.5578@westnet.com
List pgsql-patches
Now that all the other checkpoint-related patches seem to have settled
down, I've been busy working my way through refreshing and fixing merge
issues in what started as Itagaki Takahiro's auto bgwriter_lru_maxpages
patch.  I split that into two patches before, and have now finished what I
can do to revise and test the first of those.

The attached patch adds counters for the number of buffers allocated and
the number written out by backends, and exposes all of that via
pg_stat_bgwriter.  Here's a sample from after a period of benchmarking;
the buffers_backend and buffers_alloc are the two new things here:

 checkpoints_timed | checkpoints_req | buffers_checkpoint | buffers_clean
-------------------+-----------------+--------------------+---------------
                15 |             274 |            7616495 |       6737430

 buffers_backend | buffers_alloc | maxwritten_clean
-----------------+---------------+------------------
        35398659 |      24389383 |            91631

To show how helpful this is, what I've been doing with all this is taking
snapshots of the structure at the beginning and end of an individual
benchmark test, then computing the delta to see the balance of who wrote
what during the test.  Here's an example from a test that ran 200,000 of the
UPDATE statements from the pgbench test.  The first run had no background
writer, while the second had a moderately active one, and you can see that
the counting all works as expected--and that you can learn quite a bit
about how effective the background writer's cleaner was from these numbers
(obviously I had checkpoint_segments set high so there weren't any
checkpoints during the test):

 clients | tps  | chkpts | buf_check | buf_clean | buf_backend | buf_alloc
---------+------+--------+-----------+-----------+-------------+-----------
       1 | 1487 |      0 |         0 |         0 |       70934 |     85859
       1 | 1414 |      0 |         0 |     39005 |       38542 |    100963
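
The snapshot-and-delta bookkeeping described above can be sketched roughly
as follows.  This is only an illustration, not part of the patch: the
function names are made up, the "before" values are hypothetical, and the
"after" values are chosen so the delta reproduces the second row of the
table (the run with the background writer active).

```python
# Counter columns from the pg_stat_bgwriter sample shown earlier.
COUNTERS = (
    "checkpoints_timed", "checkpoints_req", "buffers_checkpoint",
    "buffers_clean", "buffers_backend", "buffers_alloc", "maxwritten_clean",
)


def snapshot_delta(before, after):
    """Return after - before for every counter column (hypothetical helper)."""
    return {name: after[name] - before[name] for name in COUNTERS}


def write_shares(delta):
    """Fraction of buffer writes done by checkpoints, the cleaner, and backends."""
    writes = {
        "checkpoint": delta["buffers_checkpoint"],
        "clean": delta["buffers_clean"],
        "backend": delta["buffers_backend"],
    }
    total = sum(writes.values())
    if total == 0:
        return {who: 0.0 for who in writes}
    return {who: count / total for who, count in writes.items()}


# Hypothetical starting snapshot; only the differences matter.
before = {name: 1000 for name in COUNTERS}

# Ending snapshot built so the delta matches the second benchmark row.
after = dict(before)
after["buffers_clean"] += 39005
after["buffers_backend"] += 38542
after["buffers_alloc"] += 100963

delta = snapshot_delta(before, after)
shares = write_shares(delta)

for who, share in shares.items():
    print(f"{who:>10}: {share:.1%}")
```

With these inputs the cleaner and the backends each account for roughly
half of the writes, which is the kind of balance the delta is meant to
expose.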

This patch was last submitted here:
http://archives.postgresql.org/pgsql-patches/2007-05/msg00142.php

At that time, Itagaki and Heikki seemed to approve the basic direction I'd
gone and how I'd split the original code into monitoring and functional
pieces.  The differences between that version of the patch and the
attached one are that I fixed the race-condition bug and terminology issue
Heikki noticed, and merged against current HEAD.

Rather than wait until I'd finished testing the next layer on top of this
(retuning the automatic was-LRU-now-cleaner code with Tom's latest insight
on that topic) I figured I might as well send this part now.  So far this
has been independent of the code that builds on it, and I'm done with this
section.  I think it will take a serious look by someone who might commit
it to make any more progress and I want to keep those queues moving.

As for the parts of this code I'd particularly like to have reviewed, most
of my concerns come from my not having completely internalized the engine's
forking model yet:

1) I'm not sure the way am_bg_writer was changed here is kosher.

2) The way the buffers are counted in the freelist code and sent back to
the background writer feels like a bit of a hack to me.

3) The recent strategy changes in freelist.c left me unsure how to count
some of what it does; I marked the section I'm concerned about with an XXX
comment.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
