Re: Add pg_buffercache_evict_all() and pg_buffercache_mark_dirty[_all]() functions - Mailing list pgsql-hackers

From Joseph Koshakow
Subject Re: Add pg_buffercache_evict_all() and pg_buffercache_mark_dirty[_all]() functions
Date
Msg-id CAAvxfHeHBL6XxypexiSu0geH3hkHiFX=ApUaeaU+hgQEfjAwxQ@mail.gmail.com
Whole thread Raw
In response to Re: Add pg_buffercache_evict_all() and pg_buffercache_mark_dirty[_all]() functions  (Aidar Imamov <a.imamov@postgrespro.ru>)
Responses Re: Add pg_buffercache_evict_all() and pg_buffercache_mark_dirty[_all]() functions
List pgsql-hackers
Hi I am working with Aidar to give a review and I am also a beginner
reviewer.

> From 813e5ec0da4c65970b4b1ce2ec2918e4652da9ab Mon Sep 17 00:00:00 2001
> From: Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>
> Date: Fri, 20 Dec 2024 14:06:47 +0300
> Subject: [PATCH v1 1/2] Add pg_buffercache_evict_all() function for testing
>   */
>  bool
> -EvictUnpinnedBuffer(Buffer buf)
> +EvictUnpinnedBuffer(Buffer buf, bool *flushed)

I think you should update the comment above this function to include
details on the new `flushed` argument.

> diff --git a/doc/src/sgml/pgbuffercache.sgml b/doc/src/sgml/pgbuffercache.sgml
> index 802a5112d77..83950ca5cce 100644
> --- a/doc/src/sgml/pgbuffercache.sgml
> +++ b/doc/src/sgml/pgbuffercache.sgml
> @@ -31,8 +31,10 @@
>    This module provides the <function>pg_buffercache_pages()</function>
>    function (wrapped in the <structname>pg_buffercache</structname> view),
>    the <function>pg_buffercache_summary()</function> function, the
> -  <function>pg_buffercache_usage_counts()</function> function and
> -  the <function>pg_buffercache_evict()</function> function.
> +  <function>pg_buffercache_usage_counts()</function> function, the
> +  <function>pg_buffercache_evict()</function>, the
> +  <function>pg_buffercache_evict_relation()</function>, and the
> +  <function>pg_buffercache_evict_all()</function> function.

All the other functions have indexterm tags above this, should we add
indexterms for the new functions?

> + <sect2 id="pgbuffercache-pg-buffercache-evict-relation">
> +  <title>The <structname>pg_buffercache_evict_relation</structname> Function</title>
> +  <para>
> +   The <function>pg_buffercache_evict_relation()</function> function is very similar
> +   to <function>pg_buffercache_evict()</function> function.  The difference is that
> +   <function>pg_buffercache_evict_relation()</function> takes a relation
> +   identifier instead of buffer identifier.  Then, it tries to evict all
> +   buffers in that relation.  The function is intended for developer testing only.
> +  </para>
> + </sect2>
> +
> + <sect2 id="pgbuffercache-pg-buffercache-evict-all">
> +  <title>The <structname>pg_buffercache_evict_all</structname> Function</title>
> +  <para>
> +   The <function>pg_buffercache_evict_all()</function> function is very similar
> +   to <function>pg_buffercache_evict()</function> function.  The difference is,
> +   the <function>pg_buffercache_evict_all()</function> does not take argument;
> +   instead it tries to evict all buffers in the buffer pool.  The function is
> +   intended for developer testing only.
> +  </para>
> + </sect2>

The other difference is that these new functions have an additional
output argument to indicate if the buffer was flushed which we probably
want to mention here.

Also I think it's more gramatically correct to say "the <fn-name>
function" or just "<fn-name>". A couple of times you say "<fn-name>
function" or "the <fn-name>".

Also I think the third to last sentence should end with "... does not
take **an** argument" or "... does not take argument**s**".

> +CREATE FUNCTION pg_buffercache_evict_relation(
> +    IN regclass,
> +    IN fork text default 'main',
> +    OUT buffers_evicted int4,
> +    OUT buffers_flushed int4)
> +AS 'MODULE_PATHNAME', 'pg_buffercache_evict_relation'
> +LANGUAGE C PARALLEL SAFE VOLATILE;
> +
> +CREATE FUNCTION pg_buffercache_evict_all(
> +    OUT buffers_evicted int4,
> +    OUT buffers_flushed int4)
> +AS 'MODULE_PATHNAME', 'pg_buffercache_evict_all'
> +LANGUAGE C PARALLEL SAFE VOLATILE;

Does it make sense to also update pg_buffercache_evict to also return
a bool if the buffer was flushed? Or is that too much of a breaking
change?

> + /* Lock the header and check if it's valid. */
> + buf_state = LockBufHdr(desc);
> + if ((buf_state & BM_VALID) == 0)
> + {
> + UnlockBufHdr(desc, buf_state);
> + return false;
> + }

EvictUnpinnedBuffer first checks if the buffer is locked before
attempting to lock it. Do we need a similar check here?

   /* Lock the header if it is not already locked. */
   if (!buf_state)
    buf_state = LockBufHdr(desc);

>> I don't think *flushed is necessarily accurate this way, as it might detect
>> that it doesn't need to flush the data (via StartBufferIO() returning false).
>
> You are right. It seems there is no way to get that information
> without editing the FlushBuffer(), right? And I think editing
> FlushBuffer() is not worth it. So, I can either remove it or explain
> in the docs that #buffers_flushed may not be completely accurate.

There's already a lot of caveats on EvictUnpinnedBuffer that it might
have unexpected results when run concurrently, so I think it's fine to
add one more.

Thanks,
Joe Koshakow

pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: md.c vs elog.c vs smgrreleaseall() in barrier
Next
From: Andrew Dunstan
Date:
Subject: Re: RFC: Additional Directory for Extensions