Re: monitoring usage count distribution - Mailing list pgsql-hackers

From	Andres Freund
Subject	Re: monitoring usage count distribution
Date	April 4, 2023 23:29:19
Msg-id	20230404232919.uibzbhjdylk3mlvp@awork3.anarazel.de
In response to	Re: monitoring usage count distribution (Robert Haas <robertmhaas@gmail.com>)
Responses	Re: monitoring usage count distribution
List	pgsql-hackers

Hi,

On 2023-04-04 14:31:36 -0400, Robert Haas wrote:
> On Mon, Jan 30, 2023 at 6:30 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
> > My colleague Jeremy Schneider (CC'd) was recently looking into usage count
> > distributions for various workloads, and he mentioned that it would be nice
> > to have an easy way to do $SUBJECT.  I've attached a patch that adds a
> > pg_buffercache_usage_counts() function.  This function returns a row per
> > possible usage count with some basic information about the corresponding
> > buffers.
> >
> >     postgres=# SELECT * FROM pg_buffercache_usage_counts();
> >      usage_count | buffers | dirty | pinned
> >     -------------+---------+-------+--------
> >                0 |       0 |     0 |      0
> >                1 |    1436 |   671 |      0
> >                2 |     102 |    88 |      0
> >                3 |      23 |    21 |      0
> >                4 |       9 |     7 |      0
> >                5 |     164 |   106 |      0
> >     (6 rows)
> >
> > This new function provides essentially the same information as
> > pg_buffercache_summary(), but pg_buffercache_summary() only shows the
> > average usage count for the buffers in use.  If there is interest in this
> > idea, another approach to consider could be to alter
> > pg_buffercache_summary() instead.
> 
> I'm skeptical that pg_buffercache_summary() is a good idea at all

Why? It's about two orders of magnitude faster than querying the equivalent
data by aggregating in SQL. And knowing how many free and dirty buffers there
are over time is quite useful to monitor and to correlate with performance
issues.


> but having it display the average usage count seems like a particularly poor
> idea. That information is almost meaningless.

I agree there are more meaningful ways to represent the data, but I don't
agree that it's almost meaningless. It can give you a rough estimate of
whether data in s_b is referenced or not.


> Replacing that with a six-element integer array would be a clear improvement
> and, IMHO, better than adding yet another function to the extension.

I'd have no issue with that.
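
To make the trade-off concrete, here is a minimal sketch in Python. It uses
the per-usage-count buffer totals from the pg_buffercache_usage_counts()
output quoted above. The second distribution is hypothetical, invented only so
that it has the same number of buffers and the same average, and the helper
name average_usage_count is likewise an assumption rather than anything in the
extension.

```python
# Buffers per usage count (index 0..5), copied from the quoted
# pg_buffercache_usage_counts() output.
observed = [0, 1436, 102, 23, 9, 164]

# Hypothetical distribution (not from any real system): same number of
# buffers, same average, but nothing ever reaches a usage count above 2.
hypothetical = [0, 903, 831, 0, 0, 0]


def average_usage_count(histogram):
    """Weighted mean usage count over all buffers in the histogram."""
    total = sum(histogram)
    if total == 0:
        return 0.0
    weighted = sum(usage * buffers for usage, buffers in enumerate(histogram))
    return weighted / total


for name, histogram in (("observed", observed), ("hypothetical", hypothetical)):
    print(f"{name}: buffers={sum(histogram)} "
          f"avg={average_usage_count(histogram):.2f} "
          f"at_max={histogram[5]}")
```

Both rows report 1734 buffers and an average of about 1.48, so the average
does say that most buffers are only lightly referenced. Only the array shows
that the observed workload also keeps 164 buffers at the maximum usage count.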

Greetings,

Andres Freund
