Re: Summary function for pg_buffercache - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Summary function for pg_buffercache
Date
Msg-id 20220922161014.copbzwdl3ja4nt6z@awork3.anarazel.de
Whole thread Raw
In response to Re: Summary function for pg_buffercache  (Melih Mutlu <m.melihmutlu@gmail.com>)
Responses Re: Summary function for pg_buffercache
List pgsql-hackers
Hi,

On 2022-09-22 18:22:44 +0300, Melih Mutlu wrote:
> Since header locks are removed again, I put some doc changes and comments
> back.

Due to the merge of the meson build system, this needs to adjust meson.build
as well.


> --- a/contrib/pg_buffercache/expected/pg_buffercache.out
> +++ b/contrib/pg_buffercache/expected/pg_buffercache.out
> @@ -8,3 +8,12 @@ from pg_buffercache;
>   t
>  (1 row)
>
> +select buffers_used + buffers_unused > 0,
> +        buffers_dirty < buffers_used,
> +        buffers_pinned < buffers_used

Doesn't these have to be "<=" instead of "<"?


> +    for (int i = 0; i < NBuffers; i++)
> +    {
> +        BufferDesc *bufHdr;
> +        uint32        buf_state;
> +
> +        /*
> +         * No need to get locks on buffer headers as we don't rely on the
> +         * results in detail. Therefore, we don't get a consistent snapshot
> +         * across all buffers and it is not guaranteed that the information of
> +         * each buffer is self-consistent as opposed to pg_buffercache_pages.
> +         */

I think the "consistent snapshot" bit is misleading - even taking buffer
header locks wouldn't give you that.


> +    if (buffers_used != 0)
> +        usagecount_avg = usagecount_avg / buffers_used;

Perhaps the average should be NULL in the buffers_used == 0 case?


> + <para>
> +  <function>pg_buffercache_pages</function> function
> +  returns a set of records, plus a view <structname>pg_buffercache</structname> that wraps the function for
> +  convenient use is provided.
> + </para>
> +
> + <para>
> +  <function>pg_buffercache_summary</function> function returns a table with a single row
> +  that contains summarized and aggregated information about shared buffer caches.
>   </para>

I think these sentences are missing a "The " at the start?

"shared buffer caches" isn't right - I think I'd just drop the "caches".


> +  <para>
> +   There is a single row to show summarized information of all shared buffers.
> +   <function>pg_buffercache_summary</function> is not interested
> +   in the state of each shared buffer, only shows aggregated information.
> +  </para>
> +
> +  <para>
> +   <function>pg_buffercache_summary</function> doesn't take buffer manager
> +   locks. Unlike <function>pg_buffercache_pages</function> function
> +   <function>pg_buffercache_summary</function> doesn't take buffer headers locks
> +   either, thus the result is not consistent. This is intentional. The purpose
> +   of this function is to provide a general idea about the state of shared
> +   buffers as fast as possible. Additionally, <function>pg_buffercache_summary</function>
> +   allocates much less memory.
> +  </para>
> + </sect2>

I don't think this mentioning of buffer header locks is useful for users - nor
is it I think correct. Acquiring the buffer header locks wouldn't add *any*
additional consistency.

Greetings,

Andres Freund



pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: libpq error message refactoring
Next
From: Önder Kalacı
Date:
Subject: Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher