Thread: Re: Add pg_buffercache_evict_all() and pg_buffercache_mark_dirty[_all]() functions

Hi,

On 2024-12-25 15:57:34 +0300, Nazir Bilal Yavuz wrote:
> 1- It is time consuming to evict all shared buffers one by one using
> the pg_buffercache_evict() function.

This is really useful for benchmarking. When testing IO heavy workloads it
turns out to be a lot lower noise to measure with shared buffers already
"touched" before taking actual measurements. The first time a buffer is used
the kernel needs to actually initialize the memory for the buffer, which can
take a substantial amount of time. Which obviously can hide many performance
differences. And it adds a lot of noise.


> 2- It would be good to have a function to mark buffers as dirty to
> test different scenarios.

This is important to be able to measure checkpointer performance. Otherwise
one has to load data from scratch. That's bad for three reasons:

1) After re-loading the data from scratch the data can be laid-out differently
   on-disk, which can have substantial performance impacts. By re-dirtying
   data one instead can measure the performance effects of a change with the
   same on-disk layaout.
2) It takes a lot of time to reload data from scratch.
3) The first write to data has substantially different performance
   characteristics than later writes, but often isn't the most relevant factor
   for real-world performance test.



> So, this patchset extends pg_buffercache with 3 functions:
> 
> 1- pg_buffercache_evict_all(): This is very similar to the already
> existing pg_buffercache_evict() function. The difference is
> pg_buffercache_evict_all() does not take an argument. Instead it just
> loops over the shared buffers and tries to evict all of them. It
> returns the number of buffers evicted and flushed.

I do think that'd be rather useful.  Perhaps it's also worth having a version
that just evicts a specific relation?


> 2- pg_buffercache_mark_dirty(): This function takes a buffer id as an
> argument and tries to mark this buffer as dirty. Returns true on
> success. This returns false if the buffer is already dirty. Do you
> think this makes sense or do you prefer it to return true if the
> buffer is already dirty?

I don't really have feelings about that ;)


> From 813e5ec0da4c65970b4b1ce2ec2918e4652da9ab Mon Sep 17 00:00:00 2001
> From: Nazir Bilal Yavuz <byavuz81@gmail.com>
> Date: Fri, 20 Dec 2024 14:06:47 +0300
> Subject: [PATCH v1 1/2] Add pg_buffercache_evict_all() function for testing
> 
> This new function provides a mechanism to evict all shared buffers at
> once. It is designed to address the inefficiency of repeatedly calling
> pg_buffercache_evict() for each individual buffer, which can be
> time-consuming when dealing with large shared buffer pools
> (e.g., ~790ms vs. ~270ms for 16GB of shared buffers).

>   */
>  bool
> -EvictUnpinnedBuffer(Buffer buf)
> +EvictUnpinnedBuffer(Buffer buf, bool *flushed)
>  {
>      BufferDesc *desc;
>      uint32        buf_state;
>      bool        result;
>  
> +    *flushed = false;
> +
>      /* Make sure we can pin the buffer. */
>      ResourceOwnerEnlarge(CurrentResourceOwner);
>      ReservePrivateRefCountEntry();
> @@ -6134,6 +6136,7 @@ EvictUnpinnedBuffer(Buffer buf)
>          LWLockAcquire(BufferDescriptorGetContentLock(desc), LW_SHARED);
>          FlushBuffer(desc, NULL, IOOBJECT_RELATION, IOCONTEXT_NORMAL);
>          LWLockRelease(BufferDescriptorGetContentLock(desc));
> +        *flushed = true;
>      }
>  
>      /* This will return false if it becomes dirty or someone else pins
>      it. */

I don't think *flushed is necessarily accurate this way, as it might detect
that it doesn't need to flush the data (via StartBufferIO() returning false).


> + */
> +Datum
> +pg_buffercache_evict_all(PG_FUNCTION_ARGS)
> +{
> +    Datum        result;
> +    TupleDesc    tupledesc;
> +    HeapTuple    tuple;
> +    Datum        values[NUM_BUFFERCACHE_EVICT_ALL_ELEM];
> +    bool        nulls[NUM_BUFFERCACHE_EVICT_ALL_ELEM] = {0};
> +
> +    int32        buffers_evicted = 0;
> +    int32        buffers_flushed = 0;
> +    bool        flushed;
> +
> +    if (get_call_result_type(fcinfo, NULL, &tupledesc) != TYPEFUNC_COMPOSITE)
> +        elog(ERROR, "return type must be a row type");
> +
> +    if (!superuser())
> +        ereport(ERROR,
> +                (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
> +                 errmsg("must be superuser to use pg_buffercache_evict_all function")));
> +
> +    for (int buf = 1; buf < NBuffers; buf++)
> +    {
> +        if (EvictUnpinnedBuffer(buf, &flushed))
> +            buffers_evicted++;
> +        if (flushed)
> +            buffers_flushed++;

I'd probably add a pre-check for the buffer not being in use. We don't need an
external function call, with an unconditional LockBufHdr() inside, for that
case.



> +/*
> + * MarkUnpinnedBufferDirty
> + *
> + * This function is intended for testing/development use only!
> + *
> + * To succeed, the buffer must not be pinned on entry, so if the caller had a
> + * particular block in mind, it might already have been replaced by some other
> + * block by the time this function runs.  It's also unpinned on return, so the
> + * buffer might be occupied and flushed by the time control is returned.  This
> + * inherent raciness without other interlocking makes the function unsuitable
> + * for non-testing usage.
> + *
> + * Returns true if the buffer was not dirty and it has now been marked as
> + * dirty.  Returns false if it wasn't valid, if it couldn't be marked as dirty
> + * due to a pin, or if the buffer was already dirty.
> + */

Hm. One potentially problematic thing with this is that it can pin and dirty a
buffer even while its relation is locked exclusively. Which i think some code
doesn't expect.  Not sure if there's a way around that :(


Greetings,

Andres Freund



Hi,

On Mon, 17 Feb 2025 at 19:59, Andres Freund <andres@anarazel.de> wrote:
>
> Hi,

Thanks for the review! And sorry for the late reply.

> On 2024-12-25 15:57:34 +0300, Nazir Bilal Yavuz wrote:
> > So, this patchset extends pg_buffercache with 3 functions:
> >
> > 1- pg_buffercache_evict_all(): This is very similar to the already
> > existing pg_buffercache_evict() function. The difference is
> > pg_buffercache_evict_all() does not take an argument. Instead it just
> > loops over the shared buffers and tries to evict all of them. It
> > returns the number of buffers evicted and flushed.
>
> I do think that'd be rather useful.  Perhaps it's also worth having a version
> that just evicts a specific relation?

That makes sense. I will work on this.

> > 2- pg_buffercache_mark_dirty(): This function takes a buffer id as an
> > argument and tries to mark this buffer as dirty. Returns true on
> > success. This returns false if the buffer is already dirty. Do you
> > think this makes sense or do you prefer it to return true if the
> > buffer is already dirty?
>
> I don't really have feelings about that ;)

I feel the same. I just wanted to have a symmetry between dirty and
evict functions. If you think this would be useless, I can remove it.

> > From 813e5ec0da4c65970b4b1ce2ec2918e4652da9ab Mon Sep 17 00:00:00 2001
> > From: Nazir Bilal Yavuz <byavuz81@gmail.com>
> > Date: Fri, 20 Dec 2024 14:06:47 +0300
> > Subject: [PATCH v1 1/2] Add pg_buffercache_evict_all() function for testing
> >
> > This new function provides a mechanism to evict all shared buffers at
> > once. It is designed to address the inefficiency of repeatedly calling
> > pg_buffercache_evict() for each individual buffer, which can be
> > time-consuming when dealing with large shared buffer pools
> > (e.g., ~790ms vs. ~270ms for 16GB of shared buffers).
>
> >   */
> >  bool
> > -EvictUnpinnedBuffer(Buffer buf)
> > +EvictUnpinnedBuffer(Buffer buf, bool *flushed)
> >  {
> >       BufferDesc *desc;
> >       uint32          buf_state;
> >       bool            result;
> >
> > +     *flushed = false;
> > +
> >       /* Make sure we can pin the buffer. */
> >       ResourceOwnerEnlarge(CurrentResourceOwner);
> >       ReservePrivateRefCountEntry();
> > @@ -6134,6 +6136,7 @@ EvictUnpinnedBuffer(Buffer buf)
> >               LWLockAcquire(BufferDescriptorGetContentLock(desc), LW_SHARED);
> >               FlushBuffer(desc, NULL, IOOBJECT_RELATION, IOCONTEXT_NORMAL);
> >               LWLockRelease(BufferDescriptorGetContentLock(desc));
> > +             *flushed = true;
> >       }
> >
> >       /* This will return false if it becomes dirty or someone else pins
> >       it. */
>
> I don't think *flushed is necessarily accurate this way, as it might detect
> that it doesn't need to flush the data (via StartBufferIO() returning false).

You are right. It seems there is no way to get that information
without editing the FlushBuffer(), right? And I think editing
FlushBuffer() is not worth it. So, I can either remove it or explain
in the docs that #buffers_flushed may not be completely accurate.

> > + */
> > +Datum
> > +pg_buffercache_evict_all(PG_FUNCTION_ARGS)
> > +{
> > +     Datum           result;
> > +     TupleDesc       tupledesc;
> > +     HeapTuple       tuple;
> > +     Datum           values[NUM_BUFFERCACHE_EVICT_ALL_ELEM];
> > +     bool            nulls[NUM_BUFFERCACHE_EVICT_ALL_ELEM] = {0};
> > +
> > +     int32           buffers_evicted = 0;
> > +     int32           buffers_flushed = 0;
> > +     bool            flushed;
> > +
> > +     if (get_call_result_type(fcinfo, NULL, &tupledesc) != TYPEFUNC_COMPOSITE)
> > +             elog(ERROR, "return type must be a row type");
> > +
> > +     if (!superuser())
> > +             ereport(ERROR,
> > +                             (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
> > +                              errmsg("must be superuser to use pg_buffercache_evict_all function")));
> > +
> > +     for (int buf = 1; buf < NBuffers; buf++)
> > +     {
> > +             if (EvictUnpinnedBuffer(buf, &flushed))
> > +                     buffers_evicted++;
> > +             if (flushed)
> > +                     buffers_flushed++;
>
> I'd probably add a pre-check for the buffer not being in use. We don't need an
> external function call, with an unconditional LockBufHdr() inside, for that
> case.

I did not understand why we would need this. Does not
EvictUnpinnedBuffer() already check that the buffer is not in use?

> > +/*
> > + * MarkUnpinnedBufferDirty
> > + *
> > + * This function is intended for testing/development use only!
> > + *
> > + * To succeed, the buffer must not be pinned on entry, so if the caller had a
> > + * particular block in mind, it might already have been replaced by some other
> > + * block by the time this function runs.  It's also unpinned on return, so the
> > + * buffer might be occupied and flushed by the time control is returned.  This
> > + * inherent raciness without other interlocking makes the function unsuitable
> > + * for non-testing usage.
> > + *
> > + * Returns true if the buffer was not dirty and it has now been marked as
> > + * dirty.  Returns false if it wasn't valid, if it couldn't be marked as dirty
> > + * due to a pin, or if the buffer was already dirty.
> > + */
>
> Hm. One potentially problematic thing with this is that it can pin and dirty a
> buffer even while its relation is locked exclusively. Which i think some code
> doesn't expect.  Not sure if there's a way around that :(

I see, I did not think about that. Since this function can be used
only by superusers, that problem might not be a big issue. What do you
think?

--
Regards,
Nazir Bilal Yavuz
Microsoft



Hi,

Here is the v2. Changes prior to v1 are:

- pg_buffercache_evict_relation() function is introduced. Which takes
a relation as an argument and evicts all shared buffers in the
relation.
- It is mentioned in the docs that the #buffers_flushed in the
pg_buffercache_evict_relation() and pg_buffercache_evict_all()
functions do not necessarily mean these buffers are flushed by these
functions.

Remaining questions from the v1:

On Wed, 12 Mar 2025 at 19:15, Nazir Bilal Yavuz <byavuz81@gmail.com> wrote:
> On Mon, 17 Feb 2025 at 19:59, Andres Freund <andres@anarazel.de> wrote:
> > On 2024-12-25 15:57:34 +0300, Nazir Bilal Yavuz wrote:
> > > + */
> > > +Datum
> > > +pg_buffercache_evict_all(PG_FUNCTION_ARGS)
> > > +{
> > > +     Datum           result;
> > > +     TupleDesc       tupledesc;
> > > +     HeapTuple       tuple;
> > > +     Datum           values[NUM_BUFFERCACHE_EVICT_ALL_ELEM];
> > > +     bool            nulls[NUM_BUFFERCACHE_EVICT_ALL_ELEM] = {0};
> > > +
> > > +     int32           buffers_evicted = 0;
> > > +     int32           buffers_flushed = 0;
> > > +     bool            flushed;
> > > +
> > > +     if (get_call_result_type(fcinfo, NULL, &tupledesc) != TYPEFUNC_COMPOSITE)
> > > +             elog(ERROR, "return type must be a row type");
> > > +
> > > +     if (!superuser())
> > > +             ereport(ERROR,
> > > +                             (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
> > > +                              errmsg("must be superuser to use pg_buffercache_evict_all function")));
> > > +
> > > +     for (int buf = 1; buf < NBuffers; buf++)
> > > +     {
> > > +             if (EvictUnpinnedBuffer(buf, &flushed))
> > > +                     buffers_evicted++;
> > > +             if (flushed)
> > > +                     buffers_flushed++;
> >
> > I'd probably add a pre-check for the buffer not being in use. We don't need an
> > external function call, with an unconditional LockBufHdr() inside, for that
> > case.
>
> I did not understand why we would need this. Does not
> EvictUnpinnedBuffer() already check that the buffer is not in use?

> > > +/*
> > > + * MarkUnpinnedBufferDirty
> > > + *
> > > + * This function is intended for testing/development use only!
> > > + *
> > > + * To succeed, the buffer must not be pinned on entry, so if the caller had a
> > > + * particular block in mind, it might already have been replaced by some other
> > > + * block by the time this function runs.  It's also unpinned on return, so the
> > > + * buffer might be occupied and flushed by the time control is returned.  This
> > > + * inherent raciness without other interlocking makes the function unsuitable
> > > + * for non-testing usage.
> > > + *
> > > + * Returns true if the buffer was not dirty and it has now been marked as
> > > + * dirty.  Returns false if it wasn't valid, if it couldn't be marked as dirty
> > > + * due to a pin, or if the buffer was already dirty.
> > > + */
> >
> > Hm. One potentially problematic thing with this is that it can pin and dirty a
> > buffer even while its relation is locked exclusively. Which i think some code
> > doesn't expect.  Not sure if there's a way around that :(
>
> I see, I did not think about that. Since this function can be used
> only by superusers, that problem might not be a big issue. What do you
> think?

-- 
Regards,
Nazir Bilal Yavuz
Microsoft

Attachment
Hi!

This is my first time trying out as a patch reviewer, and I don't claim 
to be
an expert. I am very interested in the topic of buffer cache and would 
like to
learn more about it.

I have reviewed the patches after they were
edited and I would like to share my thoughts on some points.

pg_buffercache_evict_all():
> for (int buf = 1; buf < NBuffers; buf++)
Mb it would be more correct to use <= NBuffers?

I also did not fully understand what A. Freund was referring to. 
However, we
cannot avoid blocking the buffer header, as we need to ensure that the 
buffer
is not being pinned by anyone else. Perhaps it would be beneficial to 
call
LockBufHdr() outside and check if BUT_STATE_GET_REFCOUNT == 0 before 
calling
EvictUnpinnedBuffer(). This would help to prevent unnecessary calls to
EvictUnpinnedBuffer() itself, ResourceOwnerEnlarge() and
ReservePrivateRefCountEntry().


pg_buffercache_mark_dirty_all():
> for (int buf = 1; buf < NBuffers; buf++)
Mb it would be more correct to use <= NBuffers?


pg_buffercache_evict_relation():
> errmsg("must be superuser to use pg_buffercache_evict function")
'_relation' postfix got lost here

>   /* Open relation. */
>  rel = relation_open(relOid, AccessShareLock);
If I understand correctly, this function is meant to replicate the 
behavior of
VACUUM FULL, but only for dirty relation pages. In this case, wouldn't 
it be
necessary to obtain an exclusive access lock?

> for (int buf = 1; buf < NBuffers; buf++)
Mb it would be more correct to use <= NBuffers?

>    /*
>     * No need to call UnlockBufHdr() if BufTagMatchesRelFileLocator()
>     * returns true, EvictUnpinnedBuffer() will take care of it.
>     */
>    buf_state = LockBufHdr(bufHdr);
>    if (BufTagMatchesRelFileLocator(&bufHdr->tag, &rel->rd_locator))
>    {
>      if (EvictUnpinnedBuffer(buf, buf_state, &flushed))
>        buffers_evicted++;
>      if (flushed)
>        buffers_flushed++;
>    }

If you have previously acquired the exclusive access lock to the 
relation,
you may want to consider replacing the order of the 
BufTagMatchesRelFileLocator()
and LockBufHdr() calls for improved efficiency (as mentioned in the 
comment on
the BufTagMatchesRelFileLocator() call in the DropRelationBuffers() 
function in
bufmgr.c).
And it maybe useful also to add BUT_STATE_GET_REFCOUNT == 0 precheck 
before calling
EvictUnpinnedBuffer()?


regards,
Aidar Imamov



Hi I am working with Aidar to give a review and I am also a beginner
reviewer.

> From 813e5ec0da4c65970b4b1ce2ec2918e4652da9ab Mon Sep 17 00:00:00 2001
> From: Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>
> Date: Fri, 20 Dec 2024 14:06:47 +0300
> Subject: [PATCH v1 1/2] Add pg_buffercache_evict_all() function for testing
>   */
>  bool
> -EvictUnpinnedBuffer(Buffer buf)
> +EvictUnpinnedBuffer(Buffer buf, bool *flushed)

I think you should update the comment above this function to include
details on the new `flushed` argument.

> diff --git a/doc/src/sgml/pgbuffercache.sgml b/doc/src/sgml/pgbuffercache.sgml
> index 802a5112d77..83950ca5cce 100644
> --- a/doc/src/sgml/pgbuffercache.sgml
> +++ b/doc/src/sgml/pgbuffercache.sgml
> @@ -31,8 +31,10 @@
>    This module provides the <function>pg_buffercache_pages()</function>
>    function (wrapped in the <structname>pg_buffercache</structname> view),
>    the <function>pg_buffercache_summary()</function> function, the
> -  <function>pg_buffercache_usage_counts()</function> function and
> -  the <function>pg_buffercache_evict()</function> function.
> +  <function>pg_buffercache_usage_counts()</function> function, the
> +  <function>pg_buffercache_evict()</function>, the
> +  <function>pg_buffercache_evict_relation()</function>, and the
> +  <function>pg_buffercache_evict_all()</function> function.

All the other functions have indexterm tags above this, should we add
indexterms for the new functions?

> + <sect2 id="pgbuffercache-pg-buffercache-evict-relation">
> +  <title>The <structname>pg_buffercache_evict_relation</structname> Function</title>
> +  <para>
> +   The <function>pg_buffercache_evict_relation()</function> function is very similar
> +   to <function>pg_buffercache_evict()</function> function.  The difference is that
> +   <function>pg_buffercache_evict_relation()</function> takes a relation
> +   identifier instead of buffer identifier.  Then, it tries to evict all
> +   buffers in that relation.  The function is intended for developer testing only.
> +  </para>
> + </sect2>
> +
> + <sect2 id="pgbuffercache-pg-buffercache-evict-all">
> +  <title>The <structname>pg_buffercache_evict_all</structname> Function</title>
> +  <para>
> +   The <function>pg_buffercache_evict_all()</function> function is very similar
> +   to <function>pg_buffercache_evict()</function> function.  The difference is,
> +   the <function>pg_buffercache_evict_all()</function> does not take argument;
> +   instead it tries to evict all buffers in the buffer pool.  The function is
> +   intended for developer testing only.
> +  </para>
> + </sect2>

The other difference is that these new functions have an additional
output argument to indicate if the buffer was flushed which we probably
want to mention here.

Also I think it's more gramatically correct to say "the <fn-name>
function" or just "<fn-name>". A couple of times you say "<fn-name>
function" or "the <fn-name>".

Also I think the third to last sentence should end with "... does not
take **an** argument" or "... does not take argument**s**".

> +CREATE FUNCTION pg_buffercache_evict_relation(
> +    IN regclass,
> +    IN fork text default 'main',
> +    OUT buffers_evicted int4,
> +    OUT buffers_flushed int4)
> +AS 'MODULE_PATHNAME', 'pg_buffercache_evict_relation'
> +LANGUAGE C PARALLEL SAFE VOLATILE;
> +
> +CREATE FUNCTION pg_buffercache_evict_all(
> +    OUT buffers_evicted int4,
> +    OUT buffers_flushed int4)
> +AS 'MODULE_PATHNAME', 'pg_buffercache_evict_all'
> +LANGUAGE C PARALLEL SAFE VOLATILE;

Does it make sense to also update pg_buffercache_evict to also return
a bool if the buffer was flushed? Or is that too much of a breaking
change?

> + /* Lock the header and check if it's valid. */
> + buf_state = LockBufHdr(desc);
> + if ((buf_state & BM_VALID) == 0)
> + {
> + UnlockBufHdr(desc, buf_state);
> + return false;
> + }

EvictUnpinnedBuffer first checks if the buffer is locked before
attempting to lock it. Do we need a similar check here?

   /* Lock the header if it is not already locked. */
   if (!buf_state)
    buf_state = LockBufHdr(desc);

>> I don't think *flushed is necessarily accurate this way, as it might detect
>> that it doesn't need to flush the data (via StartBufferIO() returning false).
>
> You are right. It seems there is no way to get that information
> without editing the FlushBuffer(), right? And I think editing
> FlushBuffer() is not worth it. So, I can either remove it or explain
> in the docs that #buffers_flushed may not be completely accurate.

There's already a lot of caveats on EvictUnpinnedBuffer that it might
have unexpected results when run concurrently, so I think it's fine to
add one more.

Thanks,
Joe Koshakow
Hi,

On Wed, 19 Mar 2025 at 01:02, Aidar Imamov <a.imamov@postgrespro.ru> wrote:
>
> Hi!
>
> This is my first time trying out as a patch reviewer, and I don't claim
> to be
> an expert. I am very interested in the topic of buffer cache and would
> like to
> learn more about it.

Thank you so much for the review!

> pg_buffercache_evict_all():
> > for (int buf = 1; buf < NBuffers; buf++)
> Mb it would be more correct to use <= NBuffers?

Yes, done.

> I also did not fully understand what A. Freund was referring to.
> However, we
> cannot avoid blocking the buffer header, as we need to ensure that the
> buffer
> is not being pinned by anyone else. Perhaps it would be beneficial to
> call
> LockBufHdr() outside and check if BUT_STATE_GET_REFCOUNT == 0 before
> calling
> EvictUnpinnedBuffer(). This would help to prevent unnecessary calls to
> EvictUnpinnedBuffer() itself, ResourceOwnerEnlarge() and
> ReservePrivateRefCountEntry().

I had an off-list talk with Andres and learned that
ResourceOwnerEnlarge() and ReservePrivateRefCountEntry() need to be
called before acquiring the buffer header lock. I introduced the
EvictUnpinnedBuffersFromSharedRelation() function for that [1].

> pg_buffercache_mark_dirty_all():
> > for (int buf = 1; buf < NBuffers; buf++)
> Mb it would be more correct to use <= NBuffers?

Done.

> pg_buffercache_evict_relation():
> > errmsg("must be superuser to use pg_buffercache_evict function")
> '_relation' postfix got lost here

Done.

> >   /* Open relation. */
> >  rel = relation_open(relOid, AccessShareLock);
> If I understand correctly, this function is meant to replicate the
> behavior of
> VACUUM FULL, but only for dirty relation pages. In this case, wouldn't
> it be
> necessary to obtain an exclusive access lock?

I think VACUUM FULL does more things but I agree with you, obtaining
an exclusive access lock is the correct way.

> > for (int buf = 1; buf < NBuffers; buf++)
> Mb it would be more correct to use <= NBuffers?

Done.

>
> >    /*
> >     * No need to call UnlockBufHdr() if BufTagMatchesRelFileLocator()
> >     * returns true, EvictUnpinnedBuffer() will take care of it.
> >     */
> >    buf_state = LockBufHdr(bufHdr);
> >    if (BufTagMatchesRelFileLocator(&bufHdr->tag, &rel->rd_locator))
> >    {
> >      if (EvictUnpinnedBuffer(buf, buf_state, &flushed))
> >        buffers_evicted++;
> >      if (flushed)
> >        buffers_flushed++;
> >    }
>
> If you have previously acquired the exclusive access lock to the
> relation,
> you may want to consider replacing the order of the
> BufTagMatchesRelFileLocator()
> and LockBufHdr() calls for improved efficiency (as mentioned in the
> comment on
> the BufTagMatchesRelFileLocator() call in the DropRelationBuffers()
> function in
> bufmgr.c).

I think you are right, we are basically copying FlushRelationBuffers()
to the contrib/pg_buffer_cache. Done.

> And it maybe useful also to add BUT_STATE_GET_REFCOUNT == 0 precheck
> before calling
> EvictUnpinnedBuffer()?

I think we do not need to do this. I see EvictUnpinnedBuffer() as a
central place that checks all conditions. If we add this pre-check
then we should not do the same check in the EvictUnpinnedBuffer(),
creating this logic would add unnecessary complexity to
EvictUnpinnedBuffer().

I addressed these reviews in v3. I will share the patches in my next reply.

--
Regards,
Nazir Bilal Yavuz
Microsoft



Hi,

On Fri, 21 Mar 2025 at 01:25, Joseph Koshakow <koshy44@gmail.com> wrote:
>
> Hi I am working with Aidar to give a review and I am also a beginner
> reviewer.

Thank you so much for the review!

> > From 813e5ec0da4c65970b4b1ce2ec2918e4652da9ab Mon Sep 17 00:00:00 2001
> > From: Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>
> > Date: Fri, 20 Dec 2024 14:06:47 +0300
> > Subject: [PATCH v1 1/2] Add pg_buffercache_evict_all() function for testing
> >   */
> >  bool
> > -EvictUnpinnedBuffer(Buffer buf)
> > +EvictUnpinnedBuffer(Buffer buf, bool *flushed)
>
> I think you should update the comment above this function to include
> details on the new `flushed` argument.

Yes, done.

> > diff --git a/doc/src/sgml/pgbuffercache.sgml b/doc/src/sgml/pgbuffercache.sgml
> > index 802a5112d77..83950ca5cce 100644
> > --- a/doc/src/sgml/pgbuffercache.sgml
> > +++ b/doc/src/sgml/pgbuffercache.sgml
> > @@ -31,8 +31,10 @@
> >    This module provides the <function>pg_buffercache_pages()</function>
> >    function (wrapped in the <structname>pg_buffercache</structname> view),
> >    the <function>pg_buffercache_summary()</function> function, the
> > -  <function>pg_buffercache_usage_counts()</function> function and
> > -  the <function>pg_buffercache_evict()</function> function.
> > +  <function>pg_buffercache_usage_counts()</function> function, the
> > +  <function>pg_buffercache_evict()</function>, the
> > +  <function>pg_buffercache_evict_relation()</function>, and the
> > +  <function>pg_buffercache_evict_all()</function> function.
>
> All the other functions have indexterm tags above this, should we add
> indexterms for the new functions?

I think so, done.

> > + <sect2 id="pgbuffercache-pg-buffercache-evict-relation">
> > +  <title>The <structname>pg_buffercache_evict_relation</structname> Function</title>
> > +  <para>
> > +   The <function>pg_buffercache_evict_relation()</function> function is very similar
> > +   to <function>pg_buffercache_evict()</function> function.  The difference is that
> > +   <function>pg_buffercache_evict_relation()</function> takes a relation
> > +   identifier instead of buffer identifier.  Then, it tries to evict all
> > +   buffers in that relation.  The function is intended for developer testing only.
> > +  </para>
> > + </sect2>
> > +
> > + <sect2 id="pgbuffercache-pg-buffercache-evict-all">
> > +  <title>The <structname>pg_buffercache_evict_all</structname> Function</title>
> > +  <para>
> > +   The <function>pg_buffercache_evict_all()</function> function is very similar
> > +   to <function>pg_buffercache_evict()</function> function.  The difference is,
> > +   the <function>pg_buffercache_evict_all()</function> does not take argument;
> > +   instead it tries to evict all buffers in the buffer pool.  The function is
> > +   intended for developer testing only.
> > +  </para>
> > + </sect2>
>
> The other difference is that these new functions have an additional
> output argument to indicate if the buffer was flushed which we probably
> want to mention here.

You are right, done.

> Also I think it's more gramatically correct to say "the <fn-name>
> function" or just "<fn-name>". A couple of times you say "<fn-name>
> function" or "the <fn-name>".

I choose "the <fn-name> function", done.

> Also I think the third to last sentence should end with "... does not
> take **an** argument" or "... does not take argument**s**".

I agree, done.

> > +CREATE FUNCTION pg_buffercache_evict_relation(
> > +    IN regclass,
> > +    IN fork text default 'main',
> > +    OUT buffers_evicted int4,
> > +    OUT buffers_flushed int4)
> > +AS 'MODULE_PATHNAME', 'pg_buffercache_evict_relation'
> > +LANGUAGE C PARALLEL SAFE VOLATILE;
> > +
> > +CREATE FUNCTION pg_buffercache_evict_all(
> > +    OUT buffers_evicted int4,
> > +    OUT buffers_flushed int4)
> > +AS 'MODULE_PATHNAME', 'pg_buffercache_evict_all'
> > +LANGUAGE C PARALLEL SAFE VOLATILE;
>
> Does it make sense to also update pg_buffercache_evict to also return
> a bool if the buffer was flushed? Or is that too much of a breaking
> change?

I think it makes sense but I did that change in another patch (0002)
as this may need more discussion.

> > + /* Lock the header and check if it's valid. */
> > + buf_state = LockBufHdr(desc);
> > + if ((buf_state & BM_VALID) == 0)
> > + {
> > + UnlockBufHdr(desc, buf_state);
> > + return false;
> > + }
>
> EvictUnpinnedBuffer first checks if the buffer is locked before
> attempting to lock it. Do we need a similar check here?

I do not think so, for now we do not call MarkUnpinnedBufferDirty()
while the buffer header is locked.

>    /* Lock the header if it is not already locked. */
>    if (!buf_state)
>     buf_state = LockBufHdr(desc);
>
> >> I don't think *flushed is necessarily accurate this way, as it might detect
> >> that it doesn't need to flush the data (via StartBufferIO() returning false).
> >
> > You are right. It seems there is no way to get that information
> > without editing the FlushBuffer(), right? And I think editing
> > FlushBuffer() is not worth it. So, I can either remove it or explain
> > in the docs that #buffers_flushed may not be completely accurate.
>
> There's already a lot of caveats on EvictUnpinnedBuffer that it might
> have unexpected results when run concurrently, so I think it's fine to
> add one more.

I agree.

v3 is attached, I addressed both you and Aidar's reviews in the v3.

-- 
Regards,
Nazir Bilal Yavuz
Microsoft

Attachment
Hi,

It seems that Aidar's email ended up as another thread [1]. I am
copy-pasting mail and answer here to keep the discussion in this
thread.

On Sun, 23 Mar 2025 at 22:16, Aidar Imamov <a.imamov@postgrespro.ru> wrote:
>
> I agree with most of what Joseph said. However, I would like to add some
> comments.
>
> At the moment, the "flushed" flag essentially indicates whether the
> buffer
> was dirty at the time of eviction and it does not guarantee that it has
> been
> written to disk. Therefore, it would be better to rename the
> buffers_flushed
> column in the output of pg_buffer_cache_evict_all() and
> pg_buffercache_evict_relation() functions to dirty_buffers mb? This
> would
> allow us to avoid the confusion that arises from the fact that not all
> dirty
> buffers could have actually been written to disk. In addition, this
> would
> remove the "flushed" parameter from the EvictUnpinnedBuffer() function.
> Because if we explicitly call LockBufHdr() outside of
> EvictUnpinnedBuffer(),
> we can already know in advance whether the buffer is dirty or not.
>
> The same applies to the suggestion to retrieve "flushed" count from the
> pg_buffercache_evict() call. We cannot say this for certain, but we can
> determine whether the buffer was dirty.

I think flushed means 'passing the buffer contents to the kernel' in
the Postgres context (as it is explained in the FlushBuffer()). We
know that flush has happened, we just do not know if the buffer is
flushed by us or someone else.

[1] https://postgr.es/m/flat/76a550315baef9d7424b70144f1c6a2d%40postgrespro.ru

-- 
Regards,
Nazir Bilal Yavuz
Microsoft



Hi!

I've been looking at the patches v3 and there are a few things I want to 
talk
about.

EvictUnpinnedBuffer():
I think we should change the paragraph with "flushed" of the comment to
something more like this: "If the buffer was dirty at the time of the 
call it
will be flushed by calling FlushBuffer() and if 'flushed' is not NULL 
'*flushed'
will be set to true."

I also think it'd be a good idea to add null-checks for "flushed" before
dereferencing it:
> *flushed = false;
> *flushed = true;

If the (!buf_state) clause is entered then we assume that the header is 
not locked.
Maybe edit the comment: "Lock the header if it is not already locked." 
-> "Lock the header"?
>  if (!buf_state)
>  {
>    /* Make sure we can pin the buffer. */
>    ResourceOwnerEnlarge(CurrentResourceOwner);
>    ReservePrivateRefCountEntry();
> 
>    /* Lock the header if it is not already locked. */
>    buf_state = LockBufHdr(desc);
>  }


EvictUnpinnedBuffersFromSharedRelation():
Maybe it will be more accurate to name the function as 
EvictRelUnpinnedBuffers()?

I think the comment will seem more correct in the following manner:
"Try to evict all the shared buffers containing provided relation's 
pages.

This function is intended for testing/development use only!

Before calling this function, it is important to acquire 
AccessExclusiveLock on
the specified relation to avoid replacing the current block of this 
relation with
another one during execution.

If not null, buffers_evicted and buffers_flushed are set to the total 
number of
buffers evicted and flushed respectively."

I also think it'd be a good idea to add null-checks for 
"buffers_evicted" and
"buffers_flushed" before dereferencing them:
> *buffers_evicted = *buffers_flushed = 0;

I think we don't need to check this clause again if AccessExclusiveLock 
was acquired
before function call. Don't we?
> if (BufTagMatchesRelFileLocator(&bufHdr->tag, &rel->rd_locator))
> {
>     if (EvictUnpinnedBuffer(buf, buf_state, &flushed))
>         (*buffers_evicted)++;
>     if (flushed)
>         (*buffers_flushed)++;
> }


MarkUnpinnedBufferDirty():
I think the comment will seem more correct in the following manner:
"Try to mark provided shared buffer as dirty.

This function is intended for testing/development use only!

Same as EvictUnpinnedBuffer() but with MarkBufferDirty() call inside.

Returns true if the buffer was already dirty or it has successfully been 
marked as
dirty."

And also I think the function should return true at the case when the 
buffer was
already dirty. What do you think?


pg_buffercache_evict_relation():
"pg_buffercache_evict_relation function is intended to" - 'be' is missed 
here.


pg_buffercache_mark_dirty():
Maybe edit the comment to: "Try to mark a shared buffer as dirty."?

Maybe edit the elog text to: "bad shared buffer ID" - just to clarify 
the case when
provided buffer number is negative (local buffer number).


pg_buffercache_mark_dirty_all():
Maybe also edit the comment to: "Try to mark all the shared buffers as 
dirty."?


bufmgr.h:
I think it might be a good idea to follow the Postgres formatting style 
and move the
function's arguments to the next line if they exceed 80 characters.


regards,
Aidar Imamov



Hi,

On Fri, 28 Mar 2025 at 01:53, Aidar Imamov <a.imamov@postgrespro.ru> wrote:
>
> Hi!
>
> I've been looking at the patches v3 and there are a few things I want to
> talk
> about.

Thank you for looking into this!

> EvictUnpinnedBuffer():
> I think we should change the paragraph with "flushed" of the comment to
> something more like this: "If the buffer was dirty at the time of the
> call it
> will be flushed by calling FlushBuffer() and if 'flushed' is not NULL
> '*flushed'
> will be set to true."

This is correct if the buffer header lock is acquired before calling
the EvictUnpinnedBuffer(). Otherwise, the buffer might be flushed
after calling the EvictUnpinnedBuffer() and before acquiring the
buffer header lock. I slightly edited this comment and also added a
comment for the buf_state variable:

 * buf_state is used to understand if the buffer header lock is acquired
 * before calling this function. If it has non-zero value, it is assumed that
 * buffer header lock is acquired before calling this function. This is
 * helpful for evicting buffers in the relation as the buffer header lock
 * needs to be taken before calling this function in this case.
 *
 * *flushed is set to true if the buffer was dirty and has been flushed.
 * However, this does not necessarily mean that we flushed the buffer, it
 * could have been flushed by someone else.

> I also think it'd be a good idea to add null-checks for "flushed" before
> dereferencing it:
> > *flushed = false;
> > *flushed = true;

Assert check is added.

> If the (!buf_state) clause is entered then we assume that the header is
> not locked.
> Maybe edit the comment: "Lock the header if it is not already locked."
> -> "Lock the header"?
> >  if (!buf_state)
> >  {
> >    /* Make sure we can pin the buffer. */
> >    ResourceOwnerEnlarge(CurrentResourceOwner);
> >    ReservePrivateRefCountEntry();
> >
> >    /* Lock the header if it is not already locked. */
> >    buf_state = LockBufHdr(desc);
> >  }

I think this is better, done.

> EvictUnpinnedBuffersFromSharedRelation():
> Maybe it will be more accurate to name the function as
> EvictRelUnpinnedBuffers()?

I liked that, done.

> I think the comment will seem more correct in the following manner:
> "Try to evict all the shared buffers containing provided relation's
> pages.
>
> This function is intended for testing/development use only!
>
> Before calling this function, it is important to acquire
> AccessExclusiveLock on
> the specified relation to avoid replacing the current block of this
> relation with
> another one during execution.
>
> If not null, buffers_evicted and buffers_flushed are set to the total
> number of
> buffers evicted and flushed respectively."

I added all comments except the not null part, I also preserved the
explanation of why we need this function.

> I also think it'd be a good idea to add null-checks for
> "buffers_evicted" and
> "buffers_flushed" before dereferencing them:
> > *buffers_evicted = *buffers_flushed = 0;

Assert check is added.

> I think we don't need to check this clause again if AccessExclusiveLock
> was acquired
> before function call. Don't we?
> > if (BufTagMatchesRelFileLocator(&bufHdr->tag, &rel->rd_locator))
> > {
> >       if (EvictUnpinnedBuffer(buf, buf_state, &flushed))
> >               (*buffers_evicted)++;
> >       if (flushed)
> >               (*buffers_flushed)++;
> > }

I think we need it. Copy-pasting a comment from the DropRelationBuffers():

* We can make this a tad faster by prechecking the buffer tag before
* we attempt to lock the buffer; this saves a lot of lock
* acquisitions in typical cases.  It should be safe because the
* caller must have AccessExclusiveLock on the relation, or some other
* reason to be certain that no one is loading new pages of the rel
* into the buffer pool.  (Otherwise we might well miss such pages
* entirely.)  Therefore, while the tag might be changing while we
* look at it, it can't be changing *to* a value we care about, only
* *away* from such a value.  So false negatives are impossible, and
* false positives are safe because we'll recheck after getting the
* buffer lock.

> MarkUnpinnedBufferDirty():
> I think the comment will seem more correct in the following manner:
> "Try to mark provided shared buffer as dirty.
>
> This function is intended for testing/development use only!
>
> Same as EvictUnpinnedBuffer() but with MarkBufferDirty() call inside.
>
> Returns true if the buffer was already dirty or it has successfully been
> marked as
> dirty."

Done.

> And also I think the function should return true at the case when the
> buffer was
> already dirty. What do you think?

I asked the same question in the first email and I still could not
decide but I will be applying this as it was not decided before.

> pg_buffercache_evict_relation():
> "pg_buffercache_evict_relation function is intended to" - 'be' is missed
> here.

Done.

> pg_buffercache_mark_dirty():
> Maybe edit the comment to: "Try to mark a shared buffer as dirty."?

Done.

> Maybe edit the elog text to: "bad shared buffer ID" - just to clarify
> the case when
> provided buffer number is negative (local buffer number).

Although I think your suggestion makes sense, I did not change this
since it is already implemented like that in the
pg_buffercache_evict().

> pg_buffercache_mark_dirty_all():
> Maybe also edit the comment to: "Try to mark all the shared buffers as
> dirty."?

Done.

> bufmgr.h:
> I think it might be a good idea to follow the Postgres formatting style
> and move the
> function's arguments to the next line if they exceed 80 characters.

I thought that pgindent already does this, done.

v4 is attached.

--
Regards,
Nazir Bilal Yavuz
Microsoft

Attachment
Hi!

I've reviewed the latest version of the patches and found a few things 
worth
discussing. This is probably my final feedback on the patches at this 
point.
Maybe Joseph has something to add.

After this discussion, I think it would be helpful if one of the more 
experienced
hackers could take a look at the overall picture (perhaps we should set 
the
status "Ready for Committer"? To be honest, I'm not sure what step that 
status
should be set at).


pg_buffercache--1.5--1.6.sql:
It seems that there is no need for the OR REPLACE clause after the
pg_buffercache_evict() function has been dropped.

Maybe we should remove the RETURNS clause from function declarations 
that have
OUT parameters?
Doc: "When there are OUT or INOUT parameters, the RETURNS clause can be 
omitted"


pg_buffercache_evict_relation():
The function is marked as STRICT so I think we do not need for redundant 
check:
> if (PG_ARGISNULL(0))
>  ereport(ERROR,
>    (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
>     errmsg("relation cannot be null")));
Doc: "returns null whenever any of its arguments are null", "the 
function is not
executed when there are null arguments;".


EvictUnpinnedBuffer():
Maybe edit this comment a little (just not to repeat the sentences):
> buf_state is used to check if the buffer header lock has been acquired 
> before
> calling this function. If buf_state has a non-zero value, it means that 
> the buffer
> header has already been locked. This information is useful for evicting 
> specific
> relation's buffers, as the buffers headers need to be locked before 
> this function
> can be called with such a intention.


EvictUnpinnedBuffer() & EvictRelUnpinnedBuffers():
Why did you decide to use the assert to check for NULLs?
I understand that currently, the use of these functions is limited to a 
specific set
of calls through the pg_buffercache interface. However, this may not 
always be the
case. Wouldn't it be better to allow users to choose whether or not they 
want to
receive additional information? For example, you could simply ignore any 
arguments
that are passed as NULL.


Additionally, I believe it would be beneficial to include some 
regression tests to
check at least the following cases: return type, being a superuser, bad 
buffer id,
local buffers case.


Also, there's a little thing about declaring functions as PARALLEL SAFE. 
To be honest,
I don't really have any solid arguments against it. I just have some 
doubts. For
example, how will it work if the plan is split up and we try to work on 
an object in
one part, while the other part of the plan evicts the pages of that 
object or marks
them as dirty... I can't really say for sure about that. And in that 
context, I'd
suggest referring to that awesome statement in the documentation: "If in 
doubt,
functions should be labeled as UNSAFE, which is the default."


regards,
Aidar Imamov



Hi,

On Mon, 31 Mar 2025 at 16:37, Aidar Imamov <a.imamov@postgrespro.ru> wrote:
>
> Hi!
>
> I've reviewed the latest version of the patches and found a few things
> worth
> discussing. This is probably my final feedback on the patches at this
> point.
> Maybe Joseph has something to add.

Thank you so much for the reviews, these were very helpful!

> After this discussion, I think it would be helpful if one of the more
> experienced
> hackers could take a look at the overall picture (perhaps we should set
> the
> status "Ready for Committer"? To be honest, I'm not sure what step that
> status
> should be set at).

I agree. I will mark this as 'ready for committer' after writing the tests.

> pg_buffercache--1.5--1.6.sql:
> It seems that there is no need for the OR REPLACE clause after the
> pg_buffercache_evict() function has been dropped.
>
> Maybe we should remove the RETURNS clause from function declarations
> that have
> OUT parameters?
> Doc: "When there are OUT or INOUT parameters, the RETURNS clause can be
> omitted"

You are right, both are done.

> pg_buffercache_evict_relation():
> The function is marked as STRICT so I think we do not need for redundant
> check:
> > if (PG_ARGISNULL(0))
> >  ereport(ERROR,
> >    (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> >     errmsg("relation cannot be null")));
> Doc: "returns null whenever any of its arguments are null", "the
> function is not
> executed when there are null arguments;".

Done.

> EvictUnpinnedBuffer():
> Maybe edit this comment a little (just not to repeat the sentences):
> > buf_state is used to check if the buffer header lock has been acquired
> > before
> > calling this function. If buf_state has a non-zero value, it means that
> > the buffer
> > header has already been locked. This information is useful for evicting
> > specific
> > relation's buffers, as the buffers headers need to be locked before
> > this function
> > can be called with such a intention.

This is better, done.

> EvictUnpinnedBuffer() & EvictRelUnpinnedBuffers():
> Why did you decide to use the assert to check for NULLs?
> I understand that currently, the use of these functions is limited to a
> specific set
> of calls through the pg_buffercache interface. However, this may not
> always be the
> case. Wouldn't it be better to allow users to choose whether or not they
> want to
> receive additional information? For example, you could simply ignore any
> arguments
> that are passed as NULL.

I do not have a strong opinion on this. I agree that these functions
can be used somewhere else later but it is easier to ignore these on
the return instead of handling NULLs in the functions. If we want to
consider the NULL possibility in the EvictUnpinnedBuffer() &
EvictRelUnpinnedBuffers(), then we need to write additional if
clauses. I think this increases the complexity unnecessarily.

> Additionally, I believe it would be beneficial to include some
> regression tests to
> check at least the following cases: return type, being a superuser, bad
> buffer id,
> local buffers case.

You are right, I will try to write the tests but I am sharing the new
version without tests to speed things up.

> Also, there's a little thing about declaring functions as PARALLEL SAFE.
> To be honest,
> I don't really have any solid arguments against it. I just have some
> doubts. For
> example, how will it work if the plan is split up and we try to work on
> an object in
> one part, while the other part of the plan evicts the pages of that
> object or marks
> them as dirty... I can't really say for sure about that. And in that
> context, I'd
> suggest referring to that awesome statement in the documentation: "If in
> doubt,
> functions should be labeled as UNSAFE, which is the default."

You may be right. I thought they are parallel safe as we acquire the
buffer header lock for each buffer but now, I have the same doubts as
you. I want to hear other people's opinions on that before labeling
them as UNSAFE.

v5 is attached.

--
Regards,
Nazir Bilal Yavuz
Microsoft

Attachment
Hi,

On 2025-03-31 19:49:25 +0300, Nazir Bilal Yavuz wrote:
> > After this discussion, I think it would be helpful if one of the more
> > experienced
> > hackers could take a look at the overall picture (perhaps we should set
> > the
> > status "Ready for Committer"? To be honest, I'm not sure what step that
> > status
> > should be set at).
> 
> I agree. I will mark this as 'ready for committer' after writing the tests.

Are you still working on tests?


> > Also, there's a little thing about declaring functions as PARALLEL SAFE.
> > To be honest,
> > I don't really have any solid arguments against it. I just have some
> > doubts. For
> > example, how will it work if the plan is split up and we try to work on
> > an object in
> > one part, while the other part of the plan evicts the pages of that
> > object or marks
> > them as dirty... I can't really say for sure about that. And in that
> > context, I'd
> > suggest referring to that awesome statement in the documentation: "If in
> > doubt,
> > functions should be labeled as UNSAFE, which is the default."
> 
> You may be right. I thought they are parallel safe as we acquire the
> buffer header lock for each buffer but now, I have the same doubts as
> you. I want to hear other people's opinions on that before labeling
> them as UNSAFE.

I don't see a problem with them being parallel safe.


> From a59be4b50d7691ba952d0abd32198e972d4a6ea0 Mon Sep 17 00:00:00 2001
> From: Nazir Bilal Yavuz <byavuz81@gmail.com>
> Date: Mon, 24 Mar 2025 13:52:04 +0300
> Subject: [PATCH v5 1/3] Add pg_buffercache_evict_[relation | all]() functions
>  for testing
> 
> pg_buffercache_evict_relation(): Evicts all shared buffers in a
> relation at once.
> pg_buffercache_evict_all(): Evicts all shared buffers at once.
> 
> Both functions provide mechanism to evict multiple shared buffers at
> once. They are designed to address the inefficiency of repeatedly calling
> pg_buffercache_evict() for each individual buffer, which can be
> time-consuming when dealing with large shared buffer pools.
> (e.g., ~790ms vs. ~270ms for 16GB of shared buffers).
> 
> These functions are intended for developer testing and debugging
> purposes and are available to superusers only.


> +    if (!superuser())
> +        ereport(ERROR,
> +                (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
> +                 errmsg("must be superuser to use pg_buffercache_evict_relation function")));

I think it'd be nicer if we didn't ask translators to translate 3 different
versions of this message, for different functions. Why not pass in the functio
name as a parameter?


> +    if (RelationUsesLocalBuffers(rel))
> +        ereport(ERROR,
> +                (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> +                 errmsg("relation uses local buffers,"
> +                        "pg_buffercache_evict_relation function is intended to"
> +                        "be used for shared buffers only")));

We try not to break messages across multiple lines, as that makes searching
for them harder.


> +    EvictRelUnpinnedBuffers(rel, &buffers_evicted, &buffers_flushed);
> +
> +    /* Close relation, release lock. */

This comment imo isn't useful, it just restates the code verbatim.


> +    relation_close(rel, AccessExclusiveLock);

Hm. Why are we dropping the lock here early? It's probably ok, but it's not
clear to me why we should do so.


> +    /* Build and return the tuple. */

Similar to above, the comment is just a restatement of a short piece of code.



> + <para>
> +  The <function>pg_buffercache_evict_relation()</function> function allows all
> +  shared buffers in the relation to be evicted from the buffer pool given a
> +  relation identifier.  Use of this function is restricted to superusers only.
> + </para>

I'd say "all unpinned shared buffers".

I wonder if these functions ought to also return the number of buffers that
could not be evicted?

Should this, or the longer comment for the function, mention that buffers for
all relation forks are evicted?


> + <para>
> +  The <function>pg_buffercache_evict_all()</function> function allows all
> +  shared buffers to be evicted in the buffer pool.  Use of this function is
> +  restricted to superusers only.
> + </para>

Basically the same comments as above, except for the fork portion.


> diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
> index f9681d09e1e..5d82e3fa297 100644
> --- a/src/backend/storage/buffer/bufmgr.c
> +++ b/src/backend/storage/buffer/bufmgr.c
> @@ -6512,26 +6512,47 @@ ResOwnerPrintBufferPin(Datum res)
>   * even by the same block.  This inherent raciness without other interlocking
>   * makes the function unsuitable for non-testing usage.
>   *
> + * buf_state is used to check if the buffer header lock has been acquired
> + * before calling this function.  If buf_state has a non-zero value, it means
> + * that the buffer header has already been locked.  This information is useful
> + * for evicting a specific relation's buffers, as the buffers' headers need to
> + * be locked before this function can be called with such an intention.

I don't like this aspect of the API changes one bit.  IMO something callable
from outside storage/buffer should have no business knowing about buf_state.

And detecting whether already hold a lock by checking whether the buf_state is
0 feels really wrong to me.


> +/*
> + * Try to evict all the shared buffers containing provided relation's pages.
> + *
> + * This function is intended for testing/development use only!
> + *
> + * We need this helper function because of the following reason.
> + * ReservePrivateRefCountEntry() needs to be called before acquiring the
> + * buffer header lock but ReservePrivateRefCountEntry() is static and it would
> + * be better to have it as static. Hence, it can't be called from outside of
> + * this file. This helper function is created to bypass that problem.

I think there are other reasons - we should minimize the amount of code
outside of storage/buffer that needs to know about BuferDescs etc, it's a
layering violation to access that outside.


> + * Before calling this function, it is important to acquire
> + * AccessExclusiveLock on the specified relation to avoid replacing the
> + * current block of this relation with another one during execution.

What do you mean with "current block of this relation"?


> + * buffers_evicted and buffers_flushed are set the total number of buffers
> + * evicted and flushed respectively.
> + */
> +void
> +EvictRelUnpinnedBuffers(Relation rel, int32 *buffers_evicted, int32 *buffers_flushed)
> +{
> +    bool        flushed;
> +
> +    Assert(buffers_evicted && buffers_flushed);
> +    *buffers_evicted = *buffers_flushed = 0;

I'd personally not bother with the Assert, the next line would just crash
anyway...




> diff --git a/contrib/pg_buffercache/pg_buffercache--1.5--1.6.sql
b/contrib/pg_buffercache/pg_buffercache--1.5--1.6.sql
> index 2e255f3fc10..d54bb1fd6f8 100644
> --- a/contrib/pg_buffercache/pg_buffercache--1.5--1.6.sql
> +++ b/contrib/pg_buffercache/pg_buffercache--1.5--1.6.sql
> @@ -1,5 +1,13 @@
>  \echo Use "ALTER EXTENSION pg_buffercache UPDATE TO '1.6'" to load this file. \quit
>  
> +DROP FUNCTION pg_buffercache_evict(integer);
> +CREATE FUNCTION pg_buffercache_evict(
> +    IN int,
> +    OUT evicted boolean,
> +    OUT flushed boolean)
> +AS 'MODULE_PATHNAME', 'pg_buffercache_evict'
> +LANGUAGE C PARALLEL SAFE VOLATILE STRICT;

I assume the function has to be dropped because the return type changes?


Greetings,

Andres Freund



Hi,

Thanks for the review!

On Wed, 2 Apr 2025 at 20:06, Andres Freund <andres@anarazel.de> wrote:
>
> Hi,
>
> On 2025-03-31 19:49:25 +0300, Nazir Bilal Yavuz wrote:
> > > After this discussion, I think it would be helpful if one of the more
> > > experienced
> > > hackers could take a look at the overall picture (perhaps we should set
> > > the
> > > status "Ready for Committer"? To be honest, I'm not sure what step that
> > > status
> > > should be set at).
> >
> > I agree. I will mark this as 'ready for committer' after writing the tests.
>
> Are you still working on tests?

Sorry, I forgot. They are attached as 0004.

> > > Also, there's a little thing about declaring functions as PARALLEL SAFE.
> > > To be honest,
> > > I don't really have any solid arguments against it. I just have some
> > > doubts. For
> > > example, how will it work if the plan is split up and we try to work on
> > > an object in
> > > one part, while the other part of the plan evicts the pages of that
> > > object or marks
> > > them as dirty... I can't really say for sure about that. And in that
> > > context, I'd
> > > suggest referring to that awesome statement in the documentation: "If in
> > > doubt,
> > > functions should be labeled as UNSAFE, which is the default."
> >
> > You may be right. I thought they are parallel safe as we acquire the
> > buffer header lock for each buffer but now, I have the same doubts as
> > you. I want to hear other people's opinions on that before labeling
> > them as UNSAFE.
>
> I don't see a problem with them being parallel safe.
>
>
> > From a59be4b50d7691ba952d0abd32198e972d4a6ea0 Mon Sep 17 00:00:00 2001
> > From: Nazir Bilal Yavuz <byavuz81@gmail.com>
> > Date: Mon, 24 Mar 2025 13:52:04 +0300
> > Subject: [PATCH v5 1/3] Add pg_buffercache_evict_[relation | all]() functions
> >  for testing
> >
> > pg_buffercache_evict_relation(): Evicts all shared buffers in a
> > relation at once.
> > pg_buffercache_evict_all(): Evicts all shared buffers at once.
> >
> > Both functions provide mechanism to evict multiple shared buffers at
> > once. They are designed to address the inefficiency of repeatedly calling
> > pg_buffercache_evict() for each individual buffer, which can be
> > time-consuming when dealing with large shared buffer pools.
> > (e.g., ~790ms vs. ~270ms for 16GB of shared buffers).
> >
> > These functions are intended for developer testing and debugging
> > purposes and are available to superusers only.
>
>
> > +     if (!superuser())
> > +             ereport(ERROR,
> > +                             (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
> > +                              errmsg("must be superuser to use pg_buffercache_evict_relation function")));
>
> I think it'd be nicer if we didn't ask translators to translate 3 different
> versions of this message, for different functions. Why not pass in the functio
> name as a parameter?

Ah, I didn't think of this aspect, done.

> > +     if (RelationUsesLocalBuffers(rel))
> > +             ereport(ERROR,
> > +                             (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> > +                              errmsg("relation uses local buffers,"
> > +                                             "pg_buffercache_evict_relation function is intended to"
> > +                                             "be used for shared buffers only")));
>
> We try not to break messages across multiple lines, as that makes searching
> for them harder.

Done.

> > +     EvictRelUnpinnedBuffers(rel, &buffers_evicted, &buffers_flushed);
> > +
> > +     /* Close relation, release lock. */
>
> This comment imo isn't useful, it just restates the code verbatim.

Removed.

>
> > +     relation_close(rel, AccessExclusiveLock);
>
> Hm. Why are we dropping the lock here early? It's probably ok, but it's not
> clear to me why we should do so.

We are dropping the lock after we processed the relation. I didn't
understand what could be the problem here. Why do you think it is
early?

> > +     /* Build and return the tuple. */
>
> Similar to above, the comment is just a restatement of a short piece of code.

Done. It exists in other places in the pg_buffercache_pages.c file, I
only removed the ones that I added.

> > + <para>
> > +  The <function>pg_buffercache_evict_relation()</function> function allows all
> > +  shared buffers in the relation to be evicted from the buffer pool given a
> > +  relation identifier.  Use of this function is restricted to superusers only.
> > + </para>
>
> I'd say "all unpinned shared buffers".

Done.

> I wonder if these functions ought to also return the number of buffers that
> could not be evicted?

I didn't add this as I thought this information could be gathered if wanted.

> Should this, or the longer comment for the function, mention that buffers for
> all relation forks are evicted?

I added this to a longer comment for the function.

> > + <para>
> > +  The <function>pg_buffercache_evict_all()</function> function allows all
> > +  shared buffers to be evicted in the buffer pool.  Use of this function is
> > +  restricted to superusers only.
> > + </para>
>
> Basically the same comments as above, except for the fork portion.

Done.

> > diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
> > index f9681d09e1e..5d82e3fa297 100644
> > --- a/src/backend/storage/buffer/bufmgr.c
> > +++ b/src/backend/storage/buffer/bufmgr.c
> > @@ -6512,26 +6512,47 @@ ResOwnerPrintBufferPin(Datum res)
> >   * even by the same block.  This inherent raciness without other interlocking
> >   * makes the function unsuitable for non-testing usage.
> >   *
> > + * buf_state is used to check if the buffer header lock has been acquired
> > + * before calling this function.  If buf_state has a non-zero value, it means
> > + * that the buffer header has already been locked.  This information is useful
> > + * for evicting a specific relation's buffers, as the buffers' headers need to
> > + * be locked before this function can be called with such an intention.
>
> I don't like this aspect of the API changes one bit.  IMO something callable
> from outside storage/buffer should have no business knowing about buf_state.
>
> And detecting whether already hold a lock by checking whether the buf_state is
> 0 feels really wrong to me.

I changed this. I basically copied EvictUnpinnedBuffer() to the inside
of EvictRelUnpinnedBuffers(), we don't need any hacky methods now.

> > +/*
> > + * Try to evict all the shared buffers containing provided relation's pages.
> > + *
> > + * This function is intended for testing/development use only!
> > + *
> > + * We need this helper function because of the following reason.
> > + * ReservePrivateRefCountEntry() needs to be called before acquiring the
> > + * buffer header lock but ReservePrivateRefCountEntry() is static and it would
> > + * be better to have it as static. Hence, it can't be called from outside of
> > + * this file. This helper function is created to bypass that problem.
>
> I think there are other reasons - we should minimize the amount of code
> outside of storage/buffer that needs to know about BuferDescs etc, it's a
> layering violation to access that outside.

I added that as a comment.

> > + * Before calling this function, it is important to acquire
> > + * AccessExclusiveLock on the specified relation to avoid replacing the
> > + * current block of this relation with another one during execution.
>
> What do you mean with "current block of this relation"?

It is used timewise, like a snapshot of the blocks when the function
is called. I updated this with:

'The caller must hold at least AccessShareLock on the relation to
prevent the relation from being dropped.' [1]

> > + * buffers_evicted and buffers_flushed are set the total number of buffers
> > + * evicted and flushed respectively.
> > + */
> > +void
> > +EvictRelUnpinnedBuffers(Relation rel, int32 *buffers_evicted, int32 *buffers_flushed)
> > +{
> > +     bool            flushed;
> > +
> > +     Assert(buffers_evicted && buffers_flushed);
> > +     *buffers_evicted = *buffers_flushed = 0;
>
> I'd personally not bother with the Assert, the next line would just crash
> anyway...

AIO tests already use EvictUnpinnedBuffer() without flushed
information. I made '*buffers_evicted, *buffers_flushed in the
EvictRelUnpinnedBuffers() and *flushed in the EvictUnpinnedBuffer()'
optional as Aidar suggested at upthread.

> > diff --git a/contrib/pg_buffercache/pg_buffercache--1.5--1.6.sql
b/contrib/pg_buffercache/pg_buffercache--1.5--1.6.sql
> > index 2e255f3fc10..d54bb1fd6f8 100644
> > --- a/contrib/pg_buffercache/pg_buffercache--1.5--1.6.sql
> > +++ b/contrib/pg_buffercache/pg_buffercache--1.5--1.6.sql
> > @@ -1,5 +1,13 @@
> >  \echo Use "ALTER EXTENSION pg_buffercache UPDATE TO '1.6'" to load this file. \quit
> >
> > +DROP FUNCTION pg_buffercache_evict(integer);
> > +CREATE FUNCTION pg_buffercache_evict(
> > +    IN int,
> > +    OUT evicted boolean,
> > +    OUT flushed boolean)
> > +AS 'MODULE_PATHNAME', 'pg_buffercache_evict'
> > +LANGUAGE C PARALLEL SAFE VOLATILE STRICT;
>
> I assume the function has to be dropped because the return type changes?

Correct.

v6 is attached. Additional changes prior to v5:

* Tests are added.
* Andres said off-list that AccessExclusiveLock is too much, all the
lock needs to do is to prevent the relation from being dropped. So,
pg_buffercache_evict_relation() opens a relation with AccessShareLock
now. Function comment in EvictRelUnpinnedBuffers() is updated
regarding that. [1]
* EvictRelUnpinnedBuffers() does not call EvictUnpinnedBuffer() now. I
basically copied EvictUnpinnedBuffer() to the inside of
EvictRelUnpinnedBuffers() with the needed updates.

--
Regards,
Nazir Bilal Yavuz
Microsoft

Attachment